[Notes] iMet Collection 2019 - FGVC6 (Part 1)

Photo Credit Overview Preamble I started doing this competition (iMet Collection 2019 - FGVC6) seriously after hitting a wall doing the Freesound competition. It was really late (only about one week until the competition ends), but by re-using a lot of code from the Freesound competition and using Kaggle Kernels to train models, I managed to get a decent submission with F2 score of 0.622 on the private leaderboard (the top 1 solution got 0.672, but used a hell lot more resources to train). ...

July 16, 2019 · Ceshine Lee

Dealing with Synthetic Data

Photo Credit Overview Kaggle recently hosted a competition (Instant Gratification) to test their new “synchronous Kernel-only competition” format. It features a synthetic dataset, and the best way to achieve high score on this dataset is to reverse-engineer the dataset creation algorithm. I did not really spend time into this competition, but after the competition was over I went back checked the discussion forum for solutions and insights shared, and found it actually quite interesting. There are quite a few of lessons to be learned about how to create or deal with synthetic data. ...

June 25, 2019 · Ceshine Lee

Smaller Docker Image using Multi-Stage Build

Photo Credit Why Use Mutli-Stage Build? Starting from Docker 17.05, users can utilize this new “multi-stage build” feature [1] to simplify their workflow and make the final Docker images smaller. It basically streamlines the “Builder pattern”, which means using a “builder” image to build the binary files, and copying those binary files to another runtime/production image. Despite being an interpreted programming language, many of Python libraries, especially the ones doing scientific computing and machine learning, are built upon pieces written in compiled languages (mostly C/C++). Therefore, the “Builder pattern” can still be applied. ...

June 21, 2019 · Ceshine Lee

Mixed Precision Training on Tesla T4 and P100

Photo Credit tl;dr: the power of Tensor Cores is real. Also, make sure the CPU does not become the bottleneck. Motivation I’ve written about Apex in this previous post: Use NVIDIA Apex for Easy Mixed Precision Training in PyTorch. At that time I only have my GTX 1070 to experiment on. And as we’ve learned in that post, pre-Volta nVidia cards does not benefit from half-precision arithmetic in terms of speed. It only saves some GPU memory. Therefore, I wasn’t able to personally evaluate how much speed boost we can get from mixed precision with Tensor Cores. ...

June 13, 2019 · Ceshine Lee

[Notes] SHAP Values

Photo Credit Unlike other feature importance measures, SHAP values are fairly complicated and theoretically grounded. I kept forgetting the small details of how SHAP values works. These notes aim for making sure I understand the concept well enough and be something that I can refer back to once in a while. Hopefully it will also be helpful to you. Classic Shapley Value Estimation Shapley regression values: $$\phi_{i} = \sum_{S \subset F \backslash \{i\}} \frac{|S|!(|F|-|S|-1)!}{|F|!}[f_{S \cup \{i\}}(x_{S \cup \{i\}}) - f_S(x_S)]$$ Shapley regression values are feature importances for linear models in the presence of multicollinearity. [1] ...

May 23, 2019 · Ceshine Lee