I haven’t written much over the last few weeks because I’ve been really busy. As someone without substantial work experience in the field of machine learning, I need to come up with ways to prove to potential employers and/or customers that I am capable of solving real-world problems. One of the best ways I see to achieve this goal is to perform well in online machine learning competitions with prizes. The site where most such competitions are held is Kaggle.
On Kaggle, some of the best professionals from around the world compete for substantial monetary prizes provided by big companies and organizations interested in finding solutions to the machine learning challenges that they face. The competition is quite fierce, and in order to place near the top of the leaderboard it is pretty much necessary to use state-of-the-art techniques.
I first found out about Kaggle through the Vancouver Learn Data Science meetup. In general, I find meetups to be a great way to meet industry professionals and expand my horizons. This meetup has a regular schedule of presentations specifically devoted to reviewing recently finished Kaggle competitions and the techniques used by the winners. After attending a few of these meetings, I gave my own presentation a couple of weeks ago. While preparing for it, I discovered a lot of new ideas related to modern machine learning techniques, so it was a very valuable experience.
Another awesome feature of these meetups is that the attendees can find other people to team up with for competitions (Kaggle allows up to 5 people to form a team for each challenge). A few weeks ago, I teamed up with another guy, who is also new to machine learning competitions, for the Data Science Bowl challenge, which started in January and has a deadline of April 11. The goal of the challenge is to automatically detect cell nuclei in medical images.
We are definitely at a disadvantage in this contest, as we only started working on the problem two-thirds of the way through, and we don’t even have a GPU (graphics processing unit, the type of chip used for fast training of deep neural networks) to train our convolutional network on, so it takes us a couple of days (!) to fully train each model on a regular CPU. Still, one week before the deadline our public leaderboard score places us in the top 10% of the contestants, which, if it holds, would earn us a so-called ‘bronze medal’ on Kaggle and, given the circumstances, would be a great result. In the next competition I will be going for the win though!
I am planning to spend most of my time working on this challenge until next Wednesday. Once I am done with it, hopefully I will have more time to blog and explore other projects. In any case, I have already learned a lot from this experience. I would definitely recommend Kaggle to anyone looking to switch into machine learning as a way to learn modern practical techniques and prove their qualifications by competing for money against world-class professionals.