Deep Learning, an Overview
Deep learning has been around for quite some time and has become a buzzword in the IT industry. It is an area of machine learning that focuses on neural networks, algorithms inspired by the structure and function of the brain. Applications such as speech recognition, language processing, and image recognition can all be built with deep learning.
With so many startups and large corporations showing interest in the field, it’s no surprise that tech giants such as Google, Microsoft and Facebook are at the forefront of this emerging industry. In this post, we’ll discuss:
- The technical details that drive deep learning
- Why deep learning is so popular
- Deep learning applications
- Deep learning training vs inference
While deep learning can be defined in many ways, a very simple definition is that it’s a branch of machine learning in which the models (typically neural networks) are structured as “deep” stacks of multiple layers. Deep learning is used to learn the features and patterns that best represent the data. It works in a hierarchical way: the early layers, closest to the input, learn low-level generic features such as edges, and the deeper layers combine them into higher-level features that are more specific to the data. Deep learning can be applied to a range of tasks including image classification, text classification, speech recognition, and time series prediction. We’ll focus on several aspects of deep learning in this blog post.
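To make the idea of a “deep” stack of layers concrete, here is a minimal sketch in plain Python. It is an illustration only: the layer sizes, the random weights, and the helper names (make_layer, forward) are assumptions made for this example, and a real network would learn its weights from data instead of drawing them at random.

```python
import random

def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases (hypothetical; normally learned)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the input through each layer in order, keeping every output."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(relu(dense(activations[-1], weights, biases)))
    return activations

rng = random.Random(0)
layer_sizes = [8, 6, 4, 3]  # hypothetical: 8 inputs, two hidden layers, 3 outputs
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

activations = forward([0.5] * 8, layers)
for depth, act in enumerate(activations):
    print(f"depth {depth}: {len(act)} values")
```

Each layer only ever sees the output of the layer before it, which is what lets later layers build on the simpler features computed earlier.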
Deep Learning For Image Classification
Deep learning is very popular in computer vision due to its remarkable performance on image classification problems. Convolutional neural networks (CNNs), used primarily for image classification, have been around for decades but gained popularity only recently. The reason is that the computational power required to train these networks was not available in the 1990s, whereas today’s hardware makes training them far more practical.
Deep learning for image classification is so popular that it has its own annual competition, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Researchers from all over the world, including tech industry titans like Google and Microsoft, take part. ImageNet is a large database of more than 14 million labeled images; the ILSVRC classification task uses a subset with 1,000 categories (classes) and roughly 1.2 million training images. Researchers develop their own networks (CNNs) and compete for the highest accuracy. The winner of the 2012 challenge was AlexNet, designed by Alex Krizhevsky. GoogLeNet by Google and ResNet by Microsoft won the 2014 and 2015 challenges respectively. While AlexNet was a relatively simple network with only 8 layers, GoogLeNet (22 layers) and ResNet (up to 152 layers) are considerably deeper and more complex, which supports the idea that deeper networks tend to perform better when they can be trained effectively.
How is Deep Learning Different From Machine Learning?
What is the difference between regular machine learning techniques and deep learning? For typical machine learning, we can use any classifier, e.g. a support vector machine (SVM) or even a neural network, but how do we feed the data to the classifier? We can use image pixel intensities, where each pixel is a feature, train our classifier, and then use it for prediction. Even then, it is debatable whether pixel intensities are good enough to be used as features, and how well they would perform under different lighting conditions. For decades, researchers have been handcrafting features to train these classifiers. You can also use algorithms like SIFT, SURF and HOG, which are proven feature descriptors, and compare the resulting performance. This is called feature engineering, and it is what differentiates general machine learning from deep learning.
Deep learning, on the other hand, automatically learns the features that best describe your data. The convolution layers in a CNN are responsible for learning those features, and a classifier such as a linear SVM or a softmax layer at the end classifies them. Both are integral parts of a CNN.
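The concern about lighting can be illustrated with a small sketch. It compares raw pixel features with one very simple hand-crafted feature, the difference between horizontally adjacent pixels. This is not SIFT, SURF or HOG, only a toy stand-in in the same gradient-based spirit; the image values and function names are made up for the example, and a uniform brightness offset is a simplified model of a lighting change.

```python
def pixel_features(image):
    """Flatten a 2D grayscale image (list of rows) into one feature per pixel."""
    return [pixel for row in image for pixel in row]

def horizontal_gradient_features(image):
    """Hand-crafted feature: difference between horizontally adjacent pixels."""
    return [row[c + 1] - row[c] for row in image for c in range(len(row) - 1)]

def brighten(image, offset):
    """Simulate a uniform lighting change by adding a constant to every pixel."""
    return [[pixel + offset for pixel in row] for row in image]

# Hypothetical 3x4 image with a vertical edge between columns 1 and 2.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
brighter = brighten(image, 40)

# Raw pixel features change with the lighting...
print(pixel_features(image) == pixel_features(brighter))
# ...while the gradient features stay the same under a uniform offset.
print(horizontal_gradient_features(image) == horizontal_gradient_features(brighter))
```

The first comparison prints False and the second prints True: the engineered feature ignores the brightness shift, which is the kind of robustness that hand-crafted descriptors aim for.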
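The two parts described above, a convolution that extracts features and a softmax that classifies them, can be sketched in plain Python. Everything here is an assumption for illustration: the filter is hand-picked to respond to vertical edges, and the class names and class weights are hypothetical. In a trained CNN both the filter values and the classifier weights would be learned from data.

```python
import math

def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def softmax(scores):
    """Convert raw scores to probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]  # hand-picked here; learned in a real CNN

feature_map = conv2d(image, vertical_edge)
features = [v for row in feature_map for v in row]

# Hypothetical linear classifier weights, one weight vector per class.
class_names = ["edge", "flat"]
class_weights = [[0.5] * len(features), [-0.5] * len(features)]
scores = [sum(w * f for w, f in zip(weights, features))
          for weights in class_weights]
probabilities = softmax(scores)

for name, p in zip(class_names, probabilities):
    print(f"{name}: {p:.4f}")
```

The feature map is largest exactly where the dark and bright regions meet, and the softmax turns the resulting class scores into probabilities that sum to one.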
Challenges Encountered in Deep Learning
Running deep learning applications raises two key challenges:
- It is computationally expensive
- It requires sophisticated hardware to run
While you can run some basic models on your personal computer using only the CPU, training more complex networks calls for the additional processing power of graphics processing units (GPUs); on a CPU alone, such networks can take days to train. GPUs can carry out far more floating-point operations in parallel than CPUs, which is why they are commonly used to power deep neural networks.
At Exxact, we provide deep learning solutions that are powered by leading hardware, software, and systems engineering so that our customers do not have to deal with any setup challenges. We cater to a range of industries and work with our customers to provide the best possible solution that fits within their budget.
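A rough count of the arithmetic in a single convolution layer gives a sense of why this is expensive. The sketch below counts multiply-accumulate operations for one layer on one image. The layer dimensions are hypothetical values chosen for illustration, and the function names are made up for this example.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

def conv_layer_macs(in_h, in_w, in_channels, out_channels,
                    kernel_size, stride=1, padding=0):
    """Multiply-accumulate operations for one convolution layer on one image."""
    out_h = conv_output_size(in_h, kernel_size, stride, padding)
    out_w = conv_output_size(in_w, kernel_size, stride, padding)
    macs_per_output = kernel_size * kernel_size * in_channels
    return out_h * out_w * out_channels * macs_per_output

# Hypothetical layer: 227x227 RGB input, 96 filters of size 11x11, stride 4.
macs = conv_layer_macs(227, 227, 3, 96, kernel_size=11, stride=4)
print(f"{macs:,} multiply-accumulates for one image")
```

For these assumed dimensions the count is a little over 100 million operations for one layer and one image, and training repeats this across many layers, many images, and many passes over the data. Each output value can be computed independently of the others, which is the kind of work that maps well onto the parallel arithmetic units of a GPU.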
We’ve discussed some technical details of deep learning and its applications, specifically in computer vision. Deep learning has gained popularity and is likely to stay for at least the next decade, so this might be a really good time to get your hands dirty with it. A lot of research is going on, as is evident from the fact that Google and Microsoft have open sourced their deep learning libraries, TensorFlow and CNTK respectively. Google’s self-driving cars are powered by deep learning, and it may not be long before autonomous vehicles and drones become standard. While this article is just a start, you can continue to explore topics like image classification and localization, segmentation, and multiple object detection, and techniques like R-CNN, Faster R-CNN and YOLO, to dive deeper into the deep learning world.