Decomposition: a common ingredient of deep learning progress
Sep 11, 2017
What do the most successful deep learning papers have in common? Let's look at them.
- Deep learning itself: F(x) -> f1(f2(…fn(x)…))
- Convolutional neural networks: full image-size filters -> patch-size filters.
- VGGNet: 5x5 -> 3x3 + 3x3 (two stacked 3x3 convolutions cover the same receptive field as one 5x5, with fewer parameters; see the code sketch after this list).
- Separable convolution: 3x3 -> 1x3 + 3x1
- ResNet: F(x) -> x + f(x), |f| < |F| (the residual f is easier to learn than the full mapping F).
- Xception, ResNeXt: 3x3x32 -> 8 groups of 3x3x4 (grouped convolutions).
- BatchNorm: F(x) -> Scale * (F(x) - mean) / std + bias (normalize over the batch, then re-scale and re-shift).
- Weight normalization: Weights -> Scale * unit-norm direction, i.e. w = g * v / |v|.
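Several of these decompositions are easy to see side by side in code. Here is a minimal PyTorch sketch (mine, not from the original papers); the channel counts and layer sizes are illustrative assumptions only.

```python
import torch
import torch.nn as nn

# VGG-style: one 5x5 conv replaced by two stacked 3x3 convs
# (same receptive field, fewer parameters).
vgg_block = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
)

# Spatially separable: a 3x3 conv factored into a 1x3 followed by a 3x1.
separable = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(32, 32, kernel_size=(3, 1), padding=(1, 0)),
)

# ResNeXt/Xception-style grouped conv: one 3x3 over 32 channels
# split into 8 groups of 4 channels each.
grouped = nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=8)

# ResNet: learn a small residual f and add the identity back.
class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.f(x)  # F(x) = x + f(x)

x = torch.randn(1, 32, 28, 28)
print(ResidualBlock(32)(x).shape)  # torch.Size([1, 32, 28, 28])
```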
So, if you want to make learning easier, decompose the mapping as much as you can. If you cannot decompose it further, try to reparametrize.
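Reparametrization can be a very small change. Below is a rough sketch of the weight-normalization idea, w = scale * v / |v|, for a single linear layer; the variable names and shapes are my own assumptions, not the paper's code. PyTorch also ships torch.nn.utils.weight_norm, which applies this reparametrization to an existing module.

```python
import torch

def weight_norm_linear(x, v, g, bias):
    """Linear layer whose weight is reparametrized as scale * unit-norm direction."""
    w = g * v / v.norm(dim=1, keepdim=True)  # per-output-row unit direction, scaled by g
    return x @ w.t() + bias

v = torch.randn(4, 8, requires_grad=True)   # direction parameters
g = torch.ones(4, 1, requires_grad=True)    # learned scale per output unit
bias = torch.zeros(4, requires_grad=True)

y = weight_norm_linear(torch.randn(2, 8), v, g, bias)
print(y.shape)  # torch.Size([2, 4])
```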