Deep learning has been a very hot topic lately. As part of my OMSCS Big Data for Healthcare class and my PhD preparation, it seems I also need to learn about “Deep Learning”. I did watch the Deep Learning videos from Udacity, and I honestly believe those videos are more than enough to give one an overview of the field. But to do more meaningful work in this area, I need a deeper understanding of “Deep Learning”. Hence, I started reading the popular “Deep Learning book”. Below is my internalization of chapter 1. Note that aside from my opinions, most of the content below is just me retelling the contents of the book.
Looking at the categories of bodies of knowledge, Deep Learning falls under Representation Learning, which falls under Machine Learning, which falls under Artificial Intelligence. From the data, representation learning learns simple representations of the big problem and combines these representations to make more accurate predictions.
3 Notable Phases of Deep Learning History:
- Cybernetics (1940s-1960s)
- Linear models were created, along with the discovery of their limitations, such as being unable to learn the XOR function.
- Connectionism (1980s-1990s)
- One main idea of connectionism is that a large number of computational units can achieve intelligent behavior when networked together (inspired by our brain and the network of neurons it contains)
- Distributed representation – the idea that each input to a system should be represented by many features, and each feature should be involved in the representation of many possible inputs (reading the example in the book will make things clearer)
- Deep learning (2006-present)
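The XOR limitation mentioned under the cybernetics era is easy to see in code. Below is my own sketch (not from the book): a brute-force search shows no linear threshold model classifies all four XOR points, while a tiny two-layer network with hand-set weights does.

```python
import numpy as np

# XOR truth table: output is 1 exactly when the two inputs differ.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

def linear_predict(w, b, X):
    """A linear model with a threshold: predict 1 where w.x + b > 0."""
    return (X @ w + b > 0).astype(int)

# Brute-force a grid of linear models: none classifies all four points.
best = 0.0
grid = np.linspace(-2, 2, 21)
for w1 in grid:
    for w2 in grid:
        for b in grid:
            acc = (linear_predict(np.array([w1, w2]), b, X) == y).mean()
            best = max(best, acc)
print(best)  # 0.75 -- a linear threshold gets at most 3 of the 4 points right

# A two-layer network with hand-set weights solves XOR exactly:
# hidden unit 0 computes OR, hidden unit 1 computes AND,
# and the output fires when OR is on but AND is off.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])
h = (X @ W1 + b1 > 0).astype(int)
out = ((h @ np.array([1.0, -2.0]) - 0.5) > 0).astype(int)
print(out)  # [0 1 1 0]
```

The second half is exactly the trick that linear models lack: the hidden layer re-represents the inputs so that the problem becomes linearly separable.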
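The distributed-representation idea can also be sketched as a simple counting argument. The shapes-and-colors setup below mirrors the style of the book's example, but the code itself is mine:

```python
# Distributed vs. one-feature-per-input representation.
# Toy setup: objects that vary independently in shape and color.
shapes = ["car", "truck", "bird"]
colors = ["red", "green", "blue"]

# Non-distributed: one dedicated feature per (shape, color) combination.
local_features = [(s, c) for s in shapes for c in colors]
print(len(local_features))  # 9 features, none of which share structure

# Distributed: separate shape and color features; each input activates one
# of each, and each feature participates in many different inputs.
distributed_features = shapes + colors
print(len(distributed_features))  # 6 features recombine to cover all 9 inputs
```

The gap widens fast: with 10 shapes and 10 colors it is 100 dedicated features versus 20 shared ones, which is why distributed representations generalize across inputs they were never trained on together.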
Two neural perspectives on deep learning:
- the brain is a living example that intelligent behavior is possible, and a straightforward way to build intelligence is to reverse engineer the brain (which is easier said than done)
- assuming that machine learning models encapsulate a part of how our brain works, they become useful in shedding light on the brain and the underlying principles of human intelligence.
In recent years, there have been a lot of improvements in the field due to:
- Faster computers
- More data
- the models have not changed much since the 1980s; what changed is the amount of data we use to train them.
- Rough rule of thumb as of 2016:
- 5000 labeled examples per category = acceptable performance.
- 10 million labeled examples = match or exceed human performance.
- New techniques to enable deeper networks
- We have more computational resources to run much larger models today. Model size has roughly doubled every 2.4 years.
- If this trend continues, we will probably reach the same number of neurons as a human brain by the 2050s. However, a biological neuron may be more complicated than an artificial one, so an apples-to-apples comparison might be misleading.
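The doubling estimate above can be turned into a back-of-the-envelope extrapolation. The starting size (~10 million units for a large 2016-era network) and the brain's ~86 billion neurons are my own rough assumptions, used only for illustration:

```python
import math

# Back-of-the-envelope: if model size doubles every ~2.4 years, when does it
# reach the human brain's neuron count? Starting size and neuron count below
# are rough assumptions, not figures from the book.
start_year = 2016
start_units = 1e7        # ~10 million units, a large network circa 2016
target_units = 8.6e10    # ~86 billion neurons in a human brain
doubling_period = 2.4    # years per doubling

doublings = math.log2(target_units / start_units)
year = start_year + doublings * doubling_period
print(round(year))  # around 2047 under these assumptions
```

Shifting the starting size by an order of magnitude in either direction moves the answer by only about eight years, which is why the 2050s estimate is fairly robust to the exact assumptions.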
As data grows and AI expertise increases, I believe it is important to think about how this will affect certain areas of my life and how I should act now to prepare for the future.