Last weekend, I participated in the "Distracted MNIST" challenge on Kaggle, which I found the problem really interesting. It's a variation of the original MNIST digit recognition including a new class: non-digits. But what's the catch? The training set contains only digits, whereas other symbols and characters are present in the test set.

Although I initially tried two "happy ideas" whose results were a little bit deceiving and didn't meet my expectations, I ended up generating synthetic data for the new label, which got me almost a 95% accuracy (and the first place!)

You can find a notebook containing my solution to this problem including the implementation by following this link.

In the following image, some example predictions are displayed. Please note that label 10 means "non-digit", not "digit 10".

Sample digit predictions

I definitely enjoyed participating and, as always, Kaggle's scoring system made it even more exciting. Congratulations to the organizers and the rest of the participants!