Brief outline below (more of a personal guide, really): read from the link.

- Convolution Operation Description
- Cross Correlation
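The distinction between the two can be checked numerically: convolution flips the kernel before sliding it, cross-correlation does not. Most deep learning libraries actually implement cross-correlation and call it "convolution"; for learned kernels the difference is immaterial. A minimal sketch:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = np.array([1.0, 0.0, -1.0])   # asymmetric kernel makes the flip visible

conv = np.convolve(x, k, mode="valid")     # true convolution (kernel flipped)
xcorr = np.correlate(x, k, mode="valid")   # cross-correlation (kernel as-is)

# Convolution with k equals cross-correlation with the reversed kernel:
assert np.allclose(conv, np.correlate(x, k[::-1], mode="valid"))
```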

- Why Convolution?
- Sparse Interaction
- Parameter Sharing
- Equivariant Representation
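These three properties can be seen in a tiny example: one 3-weight kernel is reused at every position (parameter sharing), each output touches only 3 inputs (sparse interaction), and shifting the input shifts the output by the same amount (equivariance). A sketch using circular shifts so the equivariance identity holds exactly at the borders (`circ_conv` is a made-up helper name):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
k = np.array([0.25, 0.5, 0.25])   # one small kernel, shared across positions

def circ_conv(x, k):
    # circular cross-correlation via np.roll: out[n] = sum_i k[i] * x[(n+i) % N]
    return sum(w * np.roll(x, -i) for i, w in enumerate(k))

# Equivariance to translation: conv(shift(x)) == shift(conv(x))
shifted_then_conv = circ_conv(np.roll(x, 3), k)
conv_then_shifted = np.roll(circ_conv(x, k), 3)
assert np.allclose(shifted_then_conv, conv_then_shifted)
```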

- Conv Nets Operation
- Convolution
- Detector (Nonlinear Function)
- Pooling – adds a strong prior that the function the layer learns must be invariant to small translations.
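The three stages can be sketched in 1-D (a toy forward pass, assuming valid cross-correlation as deep learning libraries use, ReLU as the detector, and non-overlapping max pooling):

```python
import numpy as np

def conv_layer(x, k, pool=2):
    z = np.correlate(x, k, mode="valid")        # 1. convolution stage
    a = np.maximum(z, 0.0)                      # 2. detector stage (ReLU)
    n = len(a) // pool * pool                   # trim so windows tile evenly
    return a[:n].reshape(-1, pool).max(axis=1)  # 3. max pooling

x = np.array([0.0, 1.0, 0.0, -1.0, 2.0, 0.0, 1.0, -2.0, 0.0])
k = np.array([1.0, -1.0])                       # simple edge detector
out = conv_layer(x, k)
```

Because of the max over each window, shifting the input by one position changes the pooled output much less than it changes the detector output — the translation-invariance prior in action.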

- Convolution implies an infinitely strong prior: each unit's weights are shared with its neighbors, and weights to inputs far outside its receptive field are zero. This prior makes sense only if the feature being detected is equivariant to translation.
- Variants of Convolution
- 1 kernel detects 1 kind of feature; in practice many kinds of kernels are used.
- downsampling (stride)
- border – zero padding
- valid convolution
- same convolution
- full convolution
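NumPy's 1-D `np.convolve` happens to expose these three border conventions directly, so the output sizes are easy to check (for input length n and kernel width m: valid gives n − m + 1, same gives n, full gives n + m − 1):

```python
import numpy as np

x = np.ones(6)   # n = 6
k = np.ones(3)   # m = 3
sizes = {mode: np.convolve(x, k, mode=mode).shape[0]
         for mode in ("valid", "same", "full")}
# valid: no padding, kernel stays fully inside the input      -> 4
# same:  zero-pad just enough to preserve the input length    -> 6
# full:  zero-pad so every partial overlap contributes        -> 8

# Downsampling by stride s amounts to keeping every s-th output:
strided = np.convolve(x, k, mode="valid")[::2]   # stride 2
```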

- locally connected layers / unshared convolution
- tiled convolution
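Shared, tiled, and unshared (locally connected) layers differ only in which kernel is applied at each output position. A 1-D sketch (`local_layer` is a made-up helper; kernel width 3, valid positions only):

```python
import numpy as np

def local_layer(x, kernels):
    """kernels[i % t] is applied at output position i, where t = len(kernels).
    t == 1     -> ordinary (shared) convolution
    1 < t < n  -> tiled convolution, cycling through t kernels
    t == n_out -> unshared / locally connected layer
    """
    n_out = len(x) - 2   # valid positions for a width-3 kernel
    t = len(kernels)
    return np.array([x[i:i + 3] @ kernels[i % t] for i in range(n_out)])

x = np.arange(8.0)
k1 = np.array([1.0, 0.0, -1.0])
k2 = np.array([0.5, 0.5, 0.5])
out_tiled = local_layer(x, [k1, k2])   # alternates k1, k2 across positions
```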

- Structured Output
- classification
- Tensor

- Data Types – conv nets can process inputs of varying spatial extent, provided the variable size reflects varying amounts of observation of the same kind of thing (not the optional inclusion of different kinds of observation).
- Efficient convolution algorithms – If the kernel is “separable”, a much more efficient approach can be used.
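A separable 2-D kernel is an outer product of two 1-D kernels, so a d × d convolution (O(d²) work per output) can be replaced by two 1-D passes (O(d)). A sketch verifying the equivalence with valid cross-correlation (`conv2d_valid` is a naive reference implementation written for this example):

```python
import numpy as np

def conv2d_valid(img, ker):
    # naive 2-D valid cross-correlation, for reference
    kh, kw = ker.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * ker)
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((10, 10))
col = np.array([1.0, 2.0, 1.0])
row = np.array([1.0, 0.0, -1.0])
ker = np.outer(col, row)   # separable: a 3x3 Sobel-like kernel

# Two 1-D passes: filter each column with `col`, then each row with `row`.
tmp = np.apply_along_axis(lambda v: np.correlate(v, col, "valid"), 0, img)
sep = np.apply_along_axis(lambda v: np.correlate(v, row, "valid"), 1, tmp)

assert np.allclose(conv2d_valid(img, ker), sep)
```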
- We can use the following to train our convolutional network
- Random features (untrained, randomly initialized kernels)
- Greedy layer-wise pre-training
- Unsupervised learning

- Neuroscience basis for conv nets
- Gabor Functions
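A Gabor function is a Gaussian envelope times a sinusoid and is commonly used to model simple-cell receptive fields in V1. A sketch of sampling one as a small filter (parameter names are common conventions, not necessarily the exact parameterization from the source: alpha = amplitude, beta_x/beta_y = envelope widths, f = frequency, phi = phase, tau = orientation):

```python
import numpy as np

def gabor(x, y, alpha=1.0, beta_x=1.0, beta_y=1.0, f=2.0, phi=0.0, tau=0.0):
    # rotate coordinates into the filter's frame
    xp =  x * np.cos(tau) + y * np.sin(tau)
    yp = -x * np.sin(tau) + y * np.cos(tau)
    # Gaussian envelope times a cosine along the rotated x-axis
    return alpha * np.exp(-beta_x * xp**2 - beta_y * yp**2) * np.cos(f * xp + phi)

# Sample a 9 x 9 Gabor filter on [-2, 2] x [-2, 2]
g = np.linspace(-2, 2, 9)
X, Y = np.meshgrid(g, g)
kernel = gabor(X, Y)
```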

- History – In a way, conv nets paved the way to the general acceptance of neural networks.