Decomposition: common ingredient of deep learning progress

Dmytro Mishkin
1 min read · Sep 11, 2017


What do the most successful deep learning papers have in common? Let's look at them.

Matrix factorization image from
  1. Deep learning itself: F(x) -> f1(f2(…(fn(x))…))
  2. Convolutional neural networks: full image-size filters -> patch-size filters.
  3. VGGNet: 5x5 -> 3x3 + 3x3
  4. Separable convolution: 3x3 -> 1x3 + 3x1
  5. ResNet: F(x) -> x + f(x), |f| < |F|.
  6. Xception, ResNeXt: 3x3x32 -> 8 groups of 3x3x4
  7. BatchNorm: F(x) -> Scale * (F(x)/|F(x)|) + bias
  8. Weight normalization: Weights -> Scale * unit-norm direction
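To make the convolution items concrete, here is a rough sketch that counts weights before and after each decomposition. The helper `conv_params` and the channel width `C = 32` are assumptions chosen for illustration (32 matches the 3x3x32 example in the list); biases are ignored, and input and output channel counts are assumed equal.

```python
def conv_params(kh, kw, c_in, c_out, groups=1):
    """Weight count of a conv layer without bias (hypothetical helper)."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each output channel only sees c_in / groups input channels.
    return kh * kw * (c_in // groups) * c_out

C = 32  # assumed channel width, same for input and output

# VGGNet: one 5x5 layer vs. two stacked 3x3 layers (same receptive field).
full_5x5 = conv_params(5, 5, C, C)
two_3x3 = 2 * conv_params(3, 3, C, C)

# Separable convolution: one 3x3 layer vs. a 1x3 layer followed by a 3x1 layer.
full_3x3 = conv_params(3, 3, C, C)
sep_3x3 = conv_params(1, 3, C, C) + conv_params(3, 1, C, C)

# Grouped convolution: 3x3 over 32 channels vs. 8 groups of 3x3 over 4 channels.
grouped_3x3 = conv_params(3, 3, C, C, groups=8)

print(full_5x5, two_3x3)      # 25600 18432
print(full_3x3, sep_3x3)      # 9216 6144
print(full_3x3, grouped_3x3)  # 9216 1152
```

In each case the decomposed version has fewer weights than the original layer. The count says nothing about accuracy; it only shows how much smaller each piece of the problem becomes.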

So, if you want to make learning easier, decompose it as much as you can. If you cannot, try to reparametrize.
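As an illustration of the reparametrization case, here is a minimal sketch of weight normalization on a single weight vector, written in plain Python. The function name `weight_norm` and the example values are assumptions for illustration, not code from any particular library.

```python
import math

def weight_norm(v, g):
    """Reparametrize weights as w = g * v / ||v|| (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]

v = [3.0, 4.0]  # direction parameter: only its orientation matters
g = 2.0         # scale parameter: the norm of the resulting weights

w = weight_norm(v, g)
w_length = math.sqrt(sum(x * x for x in w))

# Rescaling v leaves w unchanged, so length and direction are learned separately.
w_rescaled = weight_norm([10.0 * x for x in v], g)
```

Here `w` has length `g` regardless of how long `v` is, which is the "Scale * unit-norm direction" split from the list.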




Written by Dmytro Mishkin

Computer Vision researcher and consultant. Co-founder of Ukrainian Research group “Szkocka”.
