Thanks Radek. 7th place solution to HWI 2019 competition

Original solution.

  • Center loss. It allowed training longer w/o severe overfitting. Basic ResNet50, trained for 32+64 epochs on 384x384 grayscale — 0.813 lb
Center loss implementation (screenshot). Sorry for the screenshot, I will publish a repo soon.
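Until the repo is out, here is a minimal NumPy sketch of the center loss math (the moving-average update rule and the alpha value are illustrative assumptions, not my exact code):

```python
import numpy as np

def center_loss(features, labels, centers):
    # 0.5 * mean squared distance between each feature and its class center
    diffs = features - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

def update_centers(features, labels, centers, alpha=0.5):
    # move each class center toward the mean of its features in the batch
    new_centers = centers.copy()
    for c in np.unique(labels):
        batch_mean = features[labels == c].mean(axis=0)
        new_centers[c] = (1.0 - alpha) * centers[c] + alpha * batch_mean
    return new_centers
```

In training this term is added to the softmax cross-entropy with a small weight.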
  • Add temperature scaling before softmax — 0.834 lb. It is a simple coefficient to multiply the logits by, found on the validation set. For me 2.2 worked well.
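The scaling is literally one multiplication before softmax; a quick NumPy illustration (2.2 is the value I found on validation):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def scaled_probs(logits, temperature=2.2):
    # a constant multiplier on the logits, tuned on the validation set
    return softmax(logits * temperature)
```

A multiplier > 1 sharpens the predicted distribution.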
  • Train on public bboxes — 0.872 lb
  • Switch to ResNet152 — 0.877 lb. But ResNet152 is unstable and slow :(. So I continued experiments with ResNet50
  • Add a 1-NN distance classifier on the pre-last L2Norm(Linear(2048)) features. I transformed distance to similarity by (2-distance)/2 and merged it with the classifier predictions. New whale is represented by another threshold, as well as by the nearest image from the “new_whale” class.
    ResNet50 — 0.883, ResNet152 — 0.899 lb, ensemble ResNet50 + ResNet152 — 0.904 lb. Trained for 100+100 epochs.
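The distance-to-similarity mapping relies on the Euclidean distance between L2-normalized vectors lying in [0, 2]; a sketch (function names are mine):

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def nn_similarity(query_feats, gallery_feats):
    # distance between unit vectors is in [0, 2]; (2 - d) / 2 maps it to [0, 1]
    q = l2_normalize(query_feats)
    g = l2_normalize(gallery_feats)
    d = np.linalg.norm(q[:, None, :] - g[None, :, :], axis=2)
    return (2.0 - d) / 2.0
```

These similarities can then be merged with the classifier probabilities, with separate thresholds handling “new_whale”.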
  • ResNet50 with 4 heads, EnsembleNet-like; each head has its own pooling and 2048 features (4*2048 after concat) — 0.897 lb. Center loss is applied to each head.
Figure from “Ensemble Feature for Person Re-Identification”
  • SEResNeXt50 — 0.901 lb
  • Decrease each head's dimensionality from 2048 to 512. This hurts the softmax classifier, but 1-NN on the features receives a huge boost. So the final concat is still 2048 — 0.920 lb
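The multi-head layout above can be sketched as four independent linear projections on top of shared backbone features, concatenated into the final descriptor (random weights here, purely to illustrate the shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
backbone_feats = rng.standard_normal((8, 2048))  # batch of shared backbone features

# four independent heads, each projecting 2048 -> 512
heads = [rng.standard_normal((2048, 512)) * 0.01 for _ in range(4)]
head_feats = [backbone_feats @ W for W in heads]

# final descriptor: concat of all heads, 4 * 512 = 2048 dims
descriptor = np.concatenate(head_feats, axis=1)
```

In the real model each head also has its own pooling and loss term, and 1-NN runs on the concatenated descriptor.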
  • I was wondering why center loss helps so much if it is basically another (random) classifier + a feature norm constraint. Maybe it is enough to just constrain the feature norm? It turned out that yes, and it had already been invented under the name Ring Loss — 0.934 lb
Ring Loss implementation (screenshot)
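The Ring Loss term itself is one line: penalize deviation of the feature norm from a radius R. A NumPy sketch of the math (in training R is a learnable scalar; the weight lam here is an assumed placeholder):

```python
import numpy as np

def ring_loss(features, R, lam=0.01):
    # penalize feature norms that deviate from the target radius R;
    # R is learnable in training, lam weights the term against cross-entropy
    norms = np.linalg.norm(features, axis=1)
    return lam * np.mean((norms - R) ** 2) / 2.0
```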
  • Change backbone to VGG16-BN — 0.942 lb
  • Change pooling to GeM pooling with a constant p = 3.74 — 0.944 lb.
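GeM with a fixed p interpolates between average pooling (p = 1) and max pooling (p -> infinity); a sketch for a single (C, H, W) feature map:

```python
import numpy as np

def gem_pool(feature_map, p=3.74, eps=1e-6):
    # generalized-mean pooling over the spatial dims of a (C, H, W) map:
    # clamp to eps for stability, raise to p, average, take the p-th root
    x = np.clip(feature_map, eps, None)
    return np.mean(x ** p, axis=(1, 2)) ** (1.0 / p)
```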
  • Inception v4 — 0.925 lb
  • DenseNet121 — 0.945 lb
  • VGG16-BN — 0.944 lb
What did not work:
  • TTA (test-time augmentation). It is really sad :(
  • ArcFace, CosFace, *Face, LGM losses
  • Focal loss
  • BCE loss
  • Making any use of “new_whale” images, except for the 1-NN prediction
  • oversampling
  • pseudo-labeling
  • RandomErase (aka CutOut) augmentation, RGB or RGB2Gray augmentation. Default augmentation w/o flipping worked best
  • Contour detection
  • mixup




Dmytro Mishkin, Computer Vision researcher and consultant. Co-founder of Ukrainian Research group “Szkocka”.
