Engineering and Technology Colloquia Series

On Winning a Kaggle Prize with the R Programming Language

Presented by: Chris Raimondi (n/a)
Category: Engineering Colloquia   Duration: 1 hour   Broadcast date: November 01, 2012
Kaggle is a data mining competition website and platform that has hosted dozens of competitions with real-world data. These competitions, usually for cash prizes, have pitted data scientists from around the world against one another to see who can best predict such things as who will win a chess game, which job postings will get responses on careerbuilder.com, and when a shopper will return to a grocery store. Chris, with no formal education in computer science or machine learning, used R to win the first competition he entered, which was to predict which patients would show an improvement in their HIV infection based on genetic data and lab tests. R is used by a large portion of Kaggle competitors, and Chris will go over some of the machine learning methods he has used in Kaggle competitions. Hear how Chris has leveraged the competitive nature and diversity of these competitions to teach himself R, learn about his experiences, and pick up some tips and tricks so that you too can learn R and machine learning. Chris is currently ranked in the top 2% of competitors for the $3,000,000 Heritage Health Prize and is one of only eight individuals to have held the top spot for that competition in the last 18 months. By the end of this talk, you will have a good idea of how data mining competitions work, as well as a practical roadmap for learning R yourself and entering your first competition.

