Before comparing the features of TensorFlow, PyTorch and Keras, let’s cover some softer, non-competitive differences between them.
These points are not meant to rank one framework above another; they simply introduce the three subjects of this article.
TensorFlow:
- Created by Google
- Version 1.0 released in February 2017
PyTorch:
- Created by Facebook
- Version 1.0 preview released in October 2018 (stable release in December 2018)
- Based on Torch, an earlier deep learning framework built on Lua
Keras:
- High-level API that hides much of the complexity of deep learning frameworks
- Runs on top of other deep learning libraries: TensorFlow, Theano and CNTK
- Not a standalone framework: it relies on one of these backends for the underlying computation
Competitive differences of TensorFlow vs PyTorch vs Keras:
Now let’s look at the more competitive differences among the three. Specifically, we compare the frameworks with a focus on Natural Language Processing.
1. Types of RNNs available in each
When looking for a Deep Learning solution to an NLP problem, Recurrent Neural Networks (RNNs) are the most popular go-to architecture for developers. Therefore, it makes sense to compare the frameworks from this perspective.
All of the frameworks under consideration have modules that allow us to create simple RNNs as well as their more evolved variants: Gated Recurrent Units (GRU) and Long Short Term Memory (LSTM) networks.
PyTorch provides 2 levels of classes for building such recurrent networks:
- Multi-layer classes: nn.RNN, nn.GRU and nn.LSTM
Objects of these classes are capable of representing deep bidirectional recurrent neural networks.
- Cell-level classes: nn.RNNCell, nn.GRUCell and nn.LSTMCell
Objects of these classes represent only a single cell (again, a simple RNN, LSTM or GRU cell) that handles one timestep of the input data.
So, the multi-layer classes are a convenient wrapper around the cell-level classes for when we don’t need much customization within our neural network.
Also, making an RNN bidirectional is as simple as setting the bidirectional argument to True in the multi-layer classes!
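As a rough illustration of the two levels, here is a minimal sketch. The tensor sizes, the choice of two layers and the batch_first setting are assumptions made for the example, not something prescribed by this article.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only (assumed for this sketch).
batch_size, seq_len, input_size, hidden_size = 4, 7, 10, 16
x = torch.randn(batch_size, seq_len, input_size)

# Multi-layer class: a 2-layer bidirectional LSTM that consumes whole sequences.
lstm = nn.LSTM(
    input_size=input_size,
    hidden_size=hidden_size,
    num_layers=2,
    bidirectional=True,
    batch_first=True,  # inputs are laid out as (batch, time, features)
)
output, (h_n, c_n) = lstm(x)
# The forward and backward directions are concatenated at every timestep.
print(output.shape)  # (batch, time, 2 * hidden_size)
print(h_n.shape)     # (num_layers * 2 directions, batch, hidden_size)

# Cell-level class: a single LSTM cell, advanced by hand one timestep at a time.
cell = nn.LSTMCell(input_size=input_size, hidden_size=hidden_size)
h = torch.zeros(batch_size, hidden_size)
c = torch.zeros(batch_size, hidden_size)
for t in range(seq_len):
    h, c = cell(x[:, t, :], (h, c))
print(h.shape)  # (batch, hidden_size)
```

The explicit loop around nn.LSTMCell is where custom per-timestep logic would go; nn.LSTM hides that loop in exchange for convenience.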
Check out our article — Getting Started with NLP using the PyTorch framework — to dive into more details on these classes.
TensorFlow provides us with a tf.nn.rnn_cell module to help us with our standard RNN needs.
Some of the most important classes in the tf.nn.rnn_cell module are as follows:
- Cell-level classes, which define a single cell of the RNN (simple RNN, GRU and LSTM variants)
- MultiRNNCell class which is used to stack the various cells to create deep RNNs
- DropoutWrapper class which is used to implement dropout regularization
Check out our article — Getting Started with NLP using the TensorFlow and Keras framework — to dive into more details of this module.
Keras provides the following recurrent layers:
- SimpleRNN — Fully-connected RNN where the output is to be fed back to input
- GRU — Gated Recurrent Unit layer
- LSTM — Long Short Term Memory layer
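The three layers listed above can be tried out with a short sketch like the following. It assumes the tf.keras packaging of Keras, and the tensor sizes and the return_sequences setting are illustrative choices rather than anything this article prescribes.

```python
import tensorflow as tf

# Illustrative sizes only (assumed for this sketch).
batch_size, seq_len, features, units = 4, 7, 10, 16
x = tf.random.normal((batch_size, seq_len, features))

# Each layer is configured with a handful of arguments; units is the
# size of the hidden state.
simple_rnn = tf.keras.layers.SimpleRNN(units)
gru = tf.keras.layers.GRU(units)
# return_sequences=True keeps the output of every timestep, not just the last.
lstm = tf.keras.layers.LSTM(units, return_sequences=True)

rnn_out = simple_rnn(x)
gru_out = gru(x)
lstm_out = lstm(x)

print(rnn_out.shape)   # (batch, units)
print(gru_out.shape)   # (batch, units)
print(lstm_out.shape)  # (batch, time, units)
```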
Check out our article — Getting Started with NLP using the TensorFlow and Keras framework — to dive into more details on these classes.
All 3 of TensorFlow, PyTorch and Keras have built-in capabilities to allow us to create popular RNN architectures. The difference lies in their interface.
Keras has a simple interface with a small list of well-defined parameters, which makes the above classes easy to use. Because Keras is a high-level API on top of TensorFlow, we can say that it makes TensorFlow easy. PyTorch provides a similar level of flexibility to TensorFlow, but with a much cleaner interface.
While we are on the subject, let’s dive deeper into a comparative study based on the ease of use for each framework.
2. Ease of use: TensorFlow vs PyTorch vs Keras
TensorFlow is often criticized for its hard-to-understand API. PyTorch is friendlier and simpler to use. Overall, the PyTorch framework is more tightly integrated with the Python language and feels more native most of the time. When you write in TensorFlow, you sometimes feel that your model is behind a brick wall with several tiny holes to communicate through. And that’s where Keras comes in.
Let’s discuss a few more factors comparing the three, based on their ease of use:
Static computational graphs vs dynamic computational graphs:
This factor is especially important in NLP. TensorFlow (in its 1.x releases) uses static graphs for computation, while PyTorch uses dynamic computation graphs.
This means that in TensorFlow, you define the computation graph statically, before a model is run. All communication with the outer world is performed via a tf.Session object and tf.placeholder tensors, which are substituted with external data at runtime.
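The define-then-run workflow can be sketched as follows. This example assumes a TensorFlow 2.x installation and reaches the 1.x-style API through tf.compat.v1; the tensor names and values are made up for illustration.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # 1.x-style API as exposed by TensorFlow 2.x (assumed install)

# Step 1: define the graph. Nothing is computed at this point.
graph = tf.Graph()
with graph.as_default():
    # The placeholder is a slot that is filled with real data at run time.
    x = tf1.placeholder(tf.float32, shape=[None, 3], name="x")
    w = tf.constant([[1.0], [2.0], [3.0]], name="w")
    y = tf.matmul(x, w, name="y")

# Step 2: run the graph. Data goes in through feed_dict and results
# come back from session.run.
data = np.array([[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]], dtype=np.float32)
with tf1.Session(graph=graph) as sess:
    result = sess.run(y, feed_dict={x: data})

print(result)
```

Note that y itself is only a symbolic node; the numbers exist only in result, after the session has run.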
In PyTorch, things are far more imperative and dynamic: you can define, change and execute nodes as you go, with no special session interfaces or placeholders.
In RNNs, with static graphs, the input sequence length stays constant. This means that if you develop a sentiment analysis model for English sentences, you must fix the sentence length to some maximum value and pad all shorter sequences with zeros. Not too convenient, right?
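To make the padding step concrete, here is a small framework-free sketch. The helper name pad_to_fixed_length and the token ids are hypothetical, and truncating sentences longer than the maximum is an extra assumption of this example.

```python
from typing import List


def pad_to_fixed_length(
    sequences: List[List[int]], max_len: int, pad_value: int = 0
) -> List[List[int]]:
    """Right-pad shorter sequences with pad_value; truncate longer ones."""
    padded = []
    for seq in sequences:
        clipped = seq[:max_len]  # assumption: overly long sentences are cut off
        padded.append(clipped + [pad_value] * (max_len - len(clipped)))
    return padded


# Hypothetical token ids for three sentences of different lengths.
sentences = [[5, 8, 2], [7], [4, 9, 1, 3, 6, 2]]
batch = pad_to_fixed_length(sentences, max_len=5)
print(batch)
# [[5, 8, 2, 0, 0], [7, 0, 0, 0, 0], [4, 9, 1, 3, 6]]
```

Every row now has the same length, which is what a fixed-shape input requires, at the cost of wasted computation on the padding zeros.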
Debugging:
Since the computation graph in PyTorch is defined at runtime, you can use your favorite Python debugging tools such as pdb, ipdb, the PyCharm debugger or trusty old print statements.
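The sketch below illustrates both points for PyTorch: the graph is rebuilt on every call, so sequences of different lengths need no padding, and ordinary Python statements work inside the model. The DynamicRNN class and its sizes are hypothetical, made up for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class DynamicRNN(nn.Module):
    """Hypothetical toy model whose loop length follows the input length."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.RNNCell(input_size, hidden_size)

    def forward(self, sequence: torch.Tensor) -> torch.Tensor:
        # sequence has shape (time, input_size); time may differ between calls.
        h = torch.zeros(1, self.hidden_size)
        for step, x_t in enumerate(sequence):
            h = self.cell(x_t.unsqueeze(0), h)
            # Plain Python works here: print, or breakpoint() to enter pdb.
            print(f"step {step}: hidden norm = {h.norm().item():.4f}")
        return h


model = DynamicRNN(input_size=3, hidden_size=5)
short_seq = torch.randn(2, 3)  # 2 timesteps
long_seq = torch.randn(6, 3)   # 6 timesteps, no padding needed

h_short = model(short_seq)
h_long = model(long_seq)

# Gradients flow through whichever graph was built during the forward pass.
h_long.sum().backward()
print(model.cell.weight_ih.grad.shape)
```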
TensorFlow’s static graphs cannot be stepped through with those standard Python tools. Instead, you have the option of a special tool called tfdbg, which lets you evaluate TensorFlow expressions at runtime and browse all tensors and operations in session scope. Of course, you won’t be able to debug any Python code with it, so you will still need to use pdb separately.
Community size:
TensorFlow is more mature than PyTorch. It has a much larger community than PyTorch and Keras combined, and its user base is growing faster than that of either PyTorch or Keras.
So this means —
- A larger Stack Overflow community to help with your problems
- A larger set of online study materials: blogs, videos, courses, etc.
- Faster adoption of the latest Deep Learning techniques
Future of NLP:
While Recurrent Neural Networks have been the “go-to” architecture for NLP tasks for a while now, it’s probably not going to stay this way forever. We already have the newer Transformer model, based on the Attention mechanism, gaining popularity among researchers.
It is already being hailed as the new NLP standard, replacing Recurrent Neural Networks. Some commentators believe that the Transformer will become the dominant NLP deep learning architecture of 2019.
Check out our comprehensive 3-part tutorial to get started with Transformers.
TensorFlow seems to be ahead in this race:
- First of all, the Transformer architecture was introduced by researchers at Google itself.
- Second, at the time of writing, only TensorFlow has a stable release of the Transformer architecture.
This is not to say that PyTorch is far behind: many pre-trained Transformer models are available in Hugging Face’s repository at https://github.com/huggingface/pytorch-transformers.
So, that’s all about the comparison. But before parting ways, let me tell you about something that might make this whole conversation obsolete in 1 year!
Google recently announced TensorFlow 2.0 and it is a game-changer!
- Going forward, Keras will be the high-level API for TensorFlow, and it has been extended so that you can use all the advanced features of TensorFlow directly from tf.keras. So you get all of TensorFlow with the simplicity of Keras, at every scale and on all hardware.
- In TensorFlow 2.0, eager execution is now the default. You can take advantage of graphs even in eager context, which makes your debugging and prototyping easy, while the TensorFlow runtime takes care of performance and scaling under the hood.
- TensorBoard integration with Keras, which is now a… one-liner!
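A minimal sketch of these TensorFlow 2.x points follows, assuming a TensorFlow 2.x installation. The function name scaled_sum and the log directory logs/demo are hypothetical, chosen only for the example.

```python
import tensorflow as tf

# Eager execution: operations run immediately and return concrete values.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.matmul(a, a)
print(tf.executing_eagerly())
print(b.numpy())


# tf.function traces the Python function into a graph on its first call,
# so graph-level optimizations remain available from eager code.
@tf.function
def scaled_sum(x, scale):
    return tf.reduce_sum(x) * scale


total = scaled_sum(a, tf.constant(2.0))
print(float(total))

# TensorBoard logging for a Keras model is a single callback object.
# "logs/demo" is a hypothetical directory name.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/demo")
# It would be handed to training as: model.fit(..., callbacks=[tensorboard_cb])
```

No session or placeholder appears anywhere: values can be inspected directly with .numpy(), and the graph is an opt-in optimization rather than the only way to run code.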
So, that mitigates almost all the complaints that people have about TensorFlow, I guess. This means that TensorFlow will consolidate its position as the go-to framework for all deep learning tasks, even more so now!