Samiksha Jaiswal (Editor)

Kaggle

Industry: Data science
Founder: Anthony Goldbloom
Headquarters: San Francisco
Website: www.kaggle.com
Founded: April 2010
Parent organization: Google
Key people: Anthony Goldbloom (CEO), Ben Hamner (CTO), Jeff Moser (Chief Architect)
Products: Competitions, Jobs Board, Kaggle Scripts, Kaggle Datasets
Services: Predictive modeling competitions, hosted public datasets
CEO: Anthony Goldbloom (Apr 2010–)
Motto: Your Home for Data Science


Kaggle was founded in 2010 as a platform for predictive modelling and analytics competitions, on which companies and researchers post their data and statisticians and data miners from around the world compete to produce the best models. This crowdsourcing approach relies on the fact that countless strategies can be applied to any predictive modelling task, and it is impossible to know at the outset which technique or analyst will be most effective. Kaggle also hosts recruiting competitions, in which data scientists compete for a chance to interview at leading data science companies such as Facebook, Winton Capital, and Walmart.

In April 2015, Kaggle released the first version of its Scripts product, which allows users to write, run, and publicly share code on Kaggle. In January 2016, Kaggle released its Datasets product, making a selection of public datasets available on the platform. Each dataset has Scripts enabled, as well as a dedicated forum, so that data scientists can discuss and collaborate on the work they are doing with that dataset. On 8 July 2016, Kaggle renamed its Scripts product to Kernels. On 8 March 2017, Google announced that it was acquiring Kaggle, which would join the Google Cloud team and continue to be a distinct brand.


Kaggle community

As of May 2016, Kaggle had over 536,000 registered users, or Kagglers. The community spans 194 countries. It is the largest and most diverse data community in the world. Kagglers come from a wide variety of backgrounds, including fields such as computer science, computer vision, biology, medicine, and even glaciology. Kaggle competitions regularly attract over a thousand teams and individuals. The Kaggle community is active and committed, with 4,000 forum posts per month and over 3,500 competition submissions per day. It also includes many of the world’s best-known researchers, including members of IBM Watson’s Jeopardy-winning team and the team working on Google’s DeepMind. Many of these researchers publish papers in peer-reviewed journals based on their performance in Kaggle competitions.

How Kaggle competitions work

  1. The competition host prepares the data and a description of the problem. Kaggle offers a consulting service which can help the host do this, as well as frame the competition, anonymize the data, and integrate the winning model into their operations.
  2. Participants experiment with different techniques and compete against each other to produce the best models. Work is shared publicly through Kaggle Scripts to achieve a better benchmark and to inspire new ideas. Submissions are made through Scripts or through private manual upload. For most competitions, submissions are scored immediately (based on their predictive accuracy relative to a hidden solution file) and summarized on a live leaderboard.
  3. After the deadline passes, the competition host pays the prize money in exchange for "a worldwide, perpetual, irrevocable and royalty free license [...] to use the winning Entry", i.e. the algorithm, software and related intellectual property developed, which is "non-exclusive unless otherwise specified". For recruiting competitions, the host screens participants based on their place on the leaderboard, final score, and submitted Scripts where applicable, and then contacts competitors who look like strong candidates for its open roles to arrange interviews.
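
The scoring step described in point 2 can be illustrated with a minimal sketch. It is not Kaggle's implementation: the names `accuracy`, `Entry` and `Leaderboard` are hypothetical, and plain classification accuracy is assumed as the metric, whereas real competitions each define their own evaluation measure. The sketch scores every submission against a hidden solution file and keeps each team's best score on a ranked leaderboard.

```python
from dataclasses import dataclass
from typing import Dict, List


def accuracy(predictions: Dict[str, int], solution: Dict[str, int]) -> float:
    """Fraction of solution rows whose predicted label matches the hidden label.

    Rows missing from the submission count as wrong (an assumption of this sketch).
    """
    if not solution:
        raise ValueError("solution file is empty")
    correct = sum(
        1 for row_id, label in solution.items() if predictions.get(row_id) == label
    )
    return correct / len(solution)


@dataclass
class Entry:
    team: str
    best_score: float
    submissions: int


class Leaderboard:
    """Hypothetical live leaderboard: scores submissions as they arrive."""

    def __init__(self, solution: Dict[str, int]) -> None:
        self._solution = dict(solution)  # hidden from participants
        self._entries: Dict[str, Entry] = {}

    def submit(self, team: str, predictions: Dict[str, int]) -> float:
        """Score one submission immediately and record the team's best result."""
        score = accuracy(predictions, self._solution)
        entry = self._entries.get(team)
        if entry is None:
            self._entries[team] = Entry(team, score, 1)
        else:
            entry.best_score = max(entry.best_score, score)
            entry.submissions += 1
        return score

    def standings(self) -> List[Entry]:
        """Teams ordered by best score (highest first), ties broken by name."""
        return sorted(self._entries.values(), key=lambda e: (-e.best_score, e.team))


if __name__ == "__main__":
    board = Leaderboard({"r1": 1, "r2": 0, "r3": 1, "r4": 0})
    board.submit("team_a", {"r1": 1, "r2": 0, "r3": 0, "r4": 0})
    board.submit("team_b", {"r1": 1, "r2": 0, "r3": 1, "r4": 0})
    for rank, entry in enumerate(board.standings(), start=1):
        print(rank, entry.team, entry.best_score, entry.submissions)
```

Keeping only the best score per team is one possible design choice; a host could equally rank on the latest submission or on a separate private portion of the solution file.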

Alongside its public competitions, Kaggle offers private competitions limited to its top participants, as well as Kaggle In Class, a free tool that lets data science teachers run academic machine learning competitions.

Impact of Kaggle competitions

Kaggle has run over 200 data science competitions since the company was founded. It is best known as the platform hosting the $3 million Heritage Health Prize. Other competitions have looked at improving gesture recognition for Microsoft Kinect, or at improving the search for the Higgs boson at CERN.

Competitions have resulted in many successful projects, including advances in the state of the art in HIV research, chess ratings and traffic forecasting. Several academic papers have been published on the basis of findings made in Kaggle competitions. A key factor is the live leaderboard, which encourages participants to keep innovating beyond existing best practice. The winning methods are frequently written up on the Kaggle blog, No Free Hunch.

Financials

In November 2011, Kaggle announced a Series A funding round of $11 million from a number of high-profile Silicon Valley investors. Index Ventures and Khosla Ventures led the round, while Max Levchin, the co-founder of PayPal, also took part and became Chairman of the Board. Another well-known investor is Hal Varian, Chief Economist at Google, who described Kaggle as "a way to organize the brainpower of the world’s most talented data scientists and make it accessible to organizations of every size". Founded in Melbourne, Australia, Kaggle moved to San Francisco in 2011 and experienced a phase of rapid expansion following its fundraising.
