Decomposition: a common ingredient of deep learning progress
Sep 11, 2017
What do the most successful deep learning papers have in common? Let's look at them.
- Deep learning itself: F(x) -> f1(f2(…fn(x)…))
- Convolutional neural networks: full image-size filters -> patch-size filters.
- VGGNet: 5x5 -> 3x3 + 3x3 (two stacked 3x3 convolutions cover the same receptive field as one 5x5, with fewer parameters; see the code sketch after this list).
- Separable convolution: 3x3 -> 1x3 + 3x1
- ResNet: F(x) -> x + f(x), |f| < |F| (the residual f is easier to learn than the full mapping F).
- Xception, ResNeXt: 3x3x32 -> 8 groups of 3x3x4 (grouped convolutions).
- BatchNorm: F(x) -> Scale * (F(x) - mean) / std + bias (normalize over the batch, then re-scale and re-shift).
- Weight normalization: Weights -> Scale * unit-norm direction, i.e. w = g * v / |v|.
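Several of these decompositions are easy to see side by side in code. Here is a minimal PyTorch sketch (mine, not from the original papers); the channel counts and layer sizes are illustrative assumptions only.

```python
import torch
import torch.nn as nn

# VGG-style: one 5x5 conv replaced by two stacked 3x3 convs
# (same receptive field, fewer parameters).
vgg_block = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
)

# Spatially separable: a 3x3 conv factored into a 1x3 followed by a 3x1.
separable = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(32, 32, kernel_size=(3, 1), padding=(1, 0)),
)

# ResNeXt/Xception-style grouped conv: one 3x3 over 32 channels
# split into 8 groups of 4 channels each.
grouped = nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=8)

# ResNet: learn a small residual f and add the identity back.
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.f(x)  # F(x) = x + f(x)

x = torch.randn(1, 32, 28, 28)
print(ResidualBlock(32)(x).shape)  # torch.Size([1, 32, 28, 28])
```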
So, if you want to make learning easier, decompose the mapping as much as you can. If you cannot decompose it further, try to reparametrize.
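Reparametrization can be a very small change. Below is a rough sketch of the weight-normalization idea, w = scale * v / |v|, for a single linear layer; the variable names and shapes are my own assumptions, not the paper's code. PyTorch also ships torch.nn.utils.weight_norm, which applies this reparametrization to an existing module.

```python
import torch

def weight_norm_linear(x, v, g, bias):
    """Linear layer whose weight is reparametrized as scale * unit-norm direction."""
    w = g * v / v.norm(dim=1, keepdim=True)  # per-output-row unit direction, scaled by g
    return x @ w.t() + bias

v = torch.randn(4, 8, requires_grad=True)   # direction parameters
g = torch.ones(4, 1, requires_grad=True)    # learned scale per output unit
bias = torch.zeros(4, requires_grad=True)

y = weight_norm_linear(torch.randn(2, 8), v, g, bias)
print(y.shape)  # torch.Size([2, 4])
```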