Girish Mahajan (Editor)

Apache Kafka

Development status: Active
Written in: Scala, Java
Developer(s): Apache Software Foundation
Initial release: January 2011
Stable release: 0.10.1 (October 20, 2016)
Repository: git-wip-us.apache.org/repos/asf/kafka.git

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a "massively scalable pub/sub message queue architected as a distributed transaction log," which makes it well suited to enterprise infrastructures that process streaming data. Kafka also connects to external systems for data import and export through Kafka Connect, and it provides Kafka Streams, a Java stream processing library.

The design is heavily influenced by transaction logs.
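
To make the transaction-log idea concrete, the following is a minimal in-memory sketch, not Kafka's implementation or API. It assumes a single, unpartitioned, non-persistent log; the class name `OffsetLogSketch` and the methods `append` and `poll` are hypothetical names chosen for illustration. Records are only ever appended, each record is identified by its offset, and every subscriber keeps its own read position, so several subscribers can read the same records independently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch of a pub/sub log; not Kafka's implementation or API. */
final class OffsetLogSketch {
    // Append-only storage: a record's index in this list is its offset.
    private final List<String> records = new ArrayList<>();
    // Next offset to read, tracked separately for each subscriber.
    private final Map<String, Integer> positions = new HashMap<>();

    /** Appends a record to the end of the log and returns the offset it was assigned. */
    public int append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    /** Returns the records this subscriber has not read yet and advances its position. */
    public List<String> poll(String subscriber) {
        int from = positions.getOrDefault(subscriber, 0);
        List<String> unread = new ArrayList<>(records.subList(from, records.size()));
        positions.put(subscriber, records.size());
        return unread;
    }

    public static void main(String[] args) {
        OffsetLogSketch log = new OffsetLogSketch();
        log.append("page-view");
        log.append("click");
        System.out.println(log.poll("analytics")); // both records appended so far
        log.append("purchase");
        System.out.println(log.poll("analytics")); // only the record appended since the last poll
        System.out.println(log.poll("billing"));   // a new subscriber starts from offset 0
    }
}
```

Because reading never removes anything from the log, adding a subscriber does not affect existing ones; this is the property that lets one log serve as a publish/subscribe channel in this sketch.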

History

Apache Kafka was originally developed by LinkedIn and was open-sourced in early 2011. It graduated from the Apache Incubator on 23 October 2012. In November 2014, several engineers who had worked on Kafka at LinkedIn founded a new company, Confluent, with a focus on Kafka.

Enterprises that use Kafka

The following is a list of notable enterprises that have used or are using Kafka:

  • Walmart
  • Cisco Systems
  • Daumkakao
  • Netflix
  • PayPal
  • Spotify
  • Uber
  • Shopify
  • Betfair
  • Sift Science
  • HubSpot
  • CloudFlare
  • eBay

Kafka performance

Because Kafka is widely integrated into enterprise-level infrastructures, monitoring its performance at scale has become increasingly important. Monitoring end-to-end performance requires tracking metrics from brokers, consumers, and producers, in addition to monitoring ZooKeeper, which Kafka uses for coordination among consumers. Several monitoring platforms track Kafka performance, both open-source, such as LinkedIn's Burrow, and paid, such as Datadog. Kafka data can also be collected with tools commonly bundled with Java, including JConsole.
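
JConsole reads metrics through Java Management Extensions (JMX), and the same mechanism can be used from code. The sketch below is an illustration under stated assumptions, not a Kafka monitoring tool: it reads two of the JVM's own built-in MBeans from the local platform MBean server so that it runs without a broker. The class name `JmxReadSketch` and the helper `readAttribute` are hypothetical. Reading metrics from a running Kafka process would additionally need a remote JMX connection and that process's own MBean names, neither of which is shown here.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical sketch: reads JVM metrics over JMX, the mechanism JConsole relies on. */
final class JmxReadSketch {

    /** Reads one attribute of one MBean from this JVM's platform MBean server. */
    public static Object readAttribute(String objectName, String attribute) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            // Covers malformed names, unknown MBeans, and unknown attributes.
            throw new IllegalStateException("Cannot read " + attribute + " of " + objectName, e);
        }
    }

    public static void main(String[] args) {
        // Standard JVM MBeans, present in every JVM; Uptime is reported in milliseconds.
        System.out.println("Uptime (ms): " + readAttribute("java.lang:type=Runtime", "Uptime"));
        System.out.println("Live threads: " + readAttribute("java.lang:type=Threading", "ThreadCount"));
    }
}
```

A monitoring platform would typically poll such attributes on a schedule and store the values as time series, rather than printing them once.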
