Puneet Varma (Editor)

GraphLab

Development status: Acquired by Apple Inc.
Written in: C++
Stable release: v2.2 (July 1, 2013)
Type: Machine learning platform

Turi, originally released as GraphLab, is a graph-based, high-performance, distributed computation framework written in C++. The GraphLab project was started by Prof. Carlos Guestrin of Carnegie Mellon University in 2009. It is an open source project released under the Apache License. Although GraphLab was originally developed for machine learning tasks, it has also been applied to a broad range of other data-mining tasks, where it has been reported to outperform other abstractions by orders of magnitude.

Motivation

As the amount of collected data and the available computing power grow (multicore, GPUs, clusters, clouds), modern datasets no longer fit on a single computing node, so efficient distributed and parallel algorithms for handling large-scale data are required. The GraphLab framework is a parallel programming abstraction targeted at sparse, iterative graph algorithms. It provides a high-level programming interface that allows rapid deployment of distributed machine learning algorithms. The main considerations behind the design of GraphLab are:

  • Sparse data with local dependencies
  • Iterative algorithms
  • Potentially asynchronous execution
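
The three considerations above can be illustrated with a small, self-contained Python sketch. It is not GraphLab code and does not use the GraphLab API; the function name `async_pagerank`, its parameters, and the unnormalized PageRank formula are assumptions chosen for illustration. Each vertex update reads only its in-neighbors (sparse, local dependencies), updates repeat until values settle (iterative), and a work queue reschedules only the vertices whose inputs changed instead of sweeping the whole graph in lockstep (asynchronous-style execution, here simulated on one thread).

```python
from collections import deque


def async_pagerank(out_edges, damping=0.85, tol=1e-6, max_updates=1_000_000):
    """Hypothetical vertex-centric PageRank driven by a dynamic work queue.

    out_edges maps each vertex to a list of the vertices it links to.
    Ranks are unnormalized: rank(v) = (1 - d) + d * sum(rank(u) / outdeg(u)).
    """
    vertices = set(out_edges)
    for targets in out_edges.values():
        vertices.update(targets)

    # Reverse adjacency, so an update can gather from in-neighbors only.
    in_edges = {v: [] for v in vertices}
    out_deg = {v: len(out_edges.get(v, ())) for v in vertices}
    for u, targets in out_edges.items():
        for v in targets:
            in_edges[v].append(u)

    rank = {v: 1.0 for v in vertices}
    queue = deque(sorted(vertices))   # every vertex is scheduled once at start
    queued = set(vertices)
    updates = 0

    while queue and updates < max_updates:
        v = queue.popleft()
        queued.discard(v)

        # Local update: depends only on the current values of v's in-neighbors.
        total = sum(rank[u] / out_deg[u] for u in in_edges[v])
        new_rank = (1.0 - damping) + damping * total
        change = abs(new_rank - rank[v])
        rank[v] = new_rank
        updates += 1

        # Dynamic scheduling: wake up out-neighbors only if v changed enough.
        if change > tol:
            for w in out_edges.get(v, ()):
                if w not in queued:
                    queue.append(w)
                    queued.add(w)
    return rank


# Small example graph (made up for illustration).
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = async_pagerank(graph)
```

Because updates always read the most recent neighbor values, the sketch behaves like a Gauss-Seidel iteration rather than a synchronous sweep; a real distributed engine would additionally need to control which neighboring updates may run concurrently.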

The main features of GraphLab are:

  • A unified multicore and distributed API: write once, run efficiently on both shared-memory and distributed-memory systems
  • Tuned for performance: an optimized C++ execution engine makes extensive use of multithreading and asynchronous I/O
  • Scalable: GraphLab intelligently places data and computation using sophisticated new algorithms
  • HDFS Integration
  • Powerful Machine Learning Toolkits

GraphLab Toolkits

Several libraries of algorithms are implemented on top of GraphLab:

  • Topic Modeling: contains applications such as LDA, which can be used to cluster documents and extract topical representations.
  • Graph Analytics: contains applications such as PageRank and triangle counting, which can be applied to general graphs to estimate community structure.
  • Clustering: contains standard data clustering tools such as k-means.
  • Collaborative Filtering: contains a collection of applications used to make predictions about users' interests and to factorize large matrices.
  • Graphical Models: contains tools for making joint predictions about collections of related random variables.
  • Computer Vision: contains a collection of tools for reasoning about images.
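
As a concrete illustration of one graph analytics task named in the list, the following Python sketch counts triangles in an undirected graph. It is a standalone example, not the toolkit's implementation; the function name `count_triangles` and the edge-list input format are assumptions. It uses the common per-edge approach: for every edge, intersect the neighbor sets of its two endpoints, then divide the total by three because each triangle is found once from each of its edges.

```python
def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs.

    Hypothetical helper for illustration: self-loops are ignored and
    duplicate or reversed edges are counted once.
    """
    neighbors = {}
    unique_edges = set()
    for u, v in edges:
        if u == v:
            continue  # a self-loop cannot be part of a triangle
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
        unique_edges.add(tuple(sorted((u, v))))

    # Each common neighbor of an edge's endpoints closes one triangle.
    total = sum(len(neighbors[u] & neighbors[v]) for u, v in unique_edges)

    # Every triangle was counted once per edge, i.e. three times.
    return total // 3


# Made-up example: one triangle (1, 2, 3) plus a pendant edge to vertex 4.
example_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
triangles = count_triangles(example_edges)
```

The per-edge formulation touches only the two endpoints of an edge and their neighbor sets, which is why the computation fits a graph-parallel abstraction with local dependencies.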

Award-Winning Software

A solution based on the GraphLab collaborative filtering library took 5th place in track 1 of the ACM Yahoo! KDD Cup challenge, out of more than 1,000 participants. The LeBuShiShu team used a mixture of 12 different algorithms and consumed 10,000 CPU hours on the BlackLight supercomputer. Most of the algorithms and techniques used are now part of the GraphLab Collaborative Filtering Toolkit.

Turi

Turi (formerly Dato, and before that GraphLab Inc.) was founded by Prof. Carlos Guestrin of the University of Washington in May 2013 to continue development and support of the GraphLab open source project. Dato Inc. raised $6.75 million in an A round from Madrona and New Enterprise Associates, and $18.5 million in a B round from Vulcan Capital and Opus Capital, with Madrona and New Enterprise Associates also participating. On August 5, 2016, Turi was acquired by Apple Inc. for $200 million.
