Deploy Friday: Your source for everything Open Source

#14: Apache Kafka

March 17, 2021 Larry Garfield, Ricardo Ferreira, Anna McDonald, Otavio Santana Season 1 Episode 14
Show Notes

Defining Kafka

Apache Kafka is an open-source stream-processing platform that aims to provide a unified, high-throughput, low-latency way of handling real-time data feeds. Anna McDonald, one of the Apache Kafka experts we speak with in this episode, describes it like this: “The easiest way to describe it is a durable log. And that makes it different from most all other messaging systems, where a message goes in, you have to broadcast it out. And as soon as it's consumed, it's gone. And Apache Kafka, it lives until your retention period, which is fantastic.” Ricardo Ferreira, another expert on Apache Kafka, adds to this definition: “Kafka is not actually a messaging technology, it’s more of a data streaming technology.”
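To make the "durable log" idea concrete, here is a minimal in-memory sketch. It is a toy model, not Kafka's API: the `DurableLog` class, its method names, and the caller-supplied timestamps are all assumptions made for illustration. It shows that reading a record does not remove it, and that records disappear only once they are older than the retention period.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    offset: int
    timestamp: float  # seconds; supplied by the caller in this sketch
    value: str


@dataclass
class DurableLog:
    """Toy append-only log. A hypothetical stand-in, not Kafka's API."""
    retention_seconds: float
    records: list = field(default_factory=list)
    next_offset: int = 0

    def append(self, value: str, timestamp: float) -> int:
        offset = self.next_offset
        self.records.append(Record(offset, timestamp, value))
        self.next_offset += 1
        return offset

    def read_from(self, offset: int) -> list:
        # Reading never deletes anything: any reader can replay
        # from any offset that is still retained.
        return [r.value for r in self.records if r.offset >= offset]

    def expire(self, now: float) -> None:
        # Records are dropped only when older than the retention period.
        self.records = [
            r for r in self.records
            if now - r.timestamp <= self.retention_seconds
        ]


log = DurableLog(retention_seconds=60)
log.append("order-created", timestamp=0)
log.append("order-paid", timestamp=30)

first_read = log.read_from(0)
second_read = log.read_from(0)  # same data again; consuming did not remove it

log.expire(now=70)              # "order-created" is now 70 seconds old
after_expiry = log.read_from(0)
```

In a classic queue, the second read would come back empty. Here both reads return the same two records, and only the retention check removes the older one.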

How Kafka is different from other messaging brokers

  • Pull-based — Kafka consumers fetch data when they need it and at their own pace, without affecting the broker's performance. In push-based messaging technologies, the broker pushes messages to consumers, creating potential bottlenecks and scalability issues.
  • Schema-free — Kafka is schema-less by nature, so you can use whatever data format your producers and consumers agree on.
  • High volume — Anna says, “Nothing can handle volume like Kafka. That’s my favorite thing.” 
  • Backwards compatibility — Anna offers, “Kafka does a better job of having backwards compatibility than anything I've ever seen in my entire life, like in terms of the clients and not breaking stuff. It's fantastic.”
  • The Kafka ecosystem — Kafka comes with a set of APIs: Producer, Consumer, Admin, Kafka Streams, and Kafka Connect.
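The pull-based and schema-free points in the list above can be sketched together in a few lines. This is an in-memory illustration under stated assumptions, not a real Kafka client: `partition`, `produce`, and `poll` are hypothetical names, and JSON is just one format a producer and consumer might agree on. The log stores opaque bytes, and the consumer decides how many records to take on each poll.

```python
import json

# Hypothetical in-memory partition: it stores opaque bytes only.
partition = []


def produce(event: dict) -> None:
    # The producer picks the format (JSON here); the log never inspects it.
    partition.append(json.dumps(event).encode("utf-8"))


def poll(offset: int, max_records: int) -> tuple:
    """Return up to max_records raw payloads from offset, plus the next offset."""
    batch = partition[offset:offset + max_records]
    return batch, offset + len(batch)


for i in range(5):
    produce({"sensor": "s1", "reading": i})

# The consumer sets the pace: it asks for two records at a time.
offset = 0
batch_sizes = []
readings = []
while True:
    batch, offset = poll(offset, max_records=2)
    if not batch:
        break
    batch_sizes.append(len(batch))
    # Decoding is the consumer's job, using the format agreed with the producer.
    readings.extend(json.loads(raw)["reading"] for raw in batch)
```

A slower consumer could call `poll` with a smaller `max_records`, or less often, without the producer side changing at all.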

Treat Kafka as your “single source of truth”

Both of our experts today talk about treating Kafka as your “single source of truth,” your system of record. Ricardo explains further, “You can have multiple different consumers interested in the same data set, but each one of them are using the data set differently. So when you start using Kafka this way, you start building architectures that are not only super resilient and scalable, but also it is a very good replacement for very expensive end databases.”
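As a rough sketch of what Ricardo describes, the example below has two consumers reading the same events and building different views from them. Everything here is hypothetical and in-memory: the `orders` list stands in for a topic, and the `Consumer` class with its own `offset` is an assumption for illustration, not Kafka's consumer API.

```python
from collections import Counter

# Hypothetical shared topic of (customer, amount) order events.
orders = [("alice", 30), ("bob", 20), ("alice", 50)]


class Consumer:
    def __init__(self, name: str):
        self.name = name
        self.offset = 0  # each consumer tracks its own position in the log

    def poll(self, log: list) -> list:
        # Return everything not yet seen by this consumer, then advance.
        new = log[self.offset:]
        self.offset = len(log)
        return new


billing = Consumer("billing")
analytics = Consumer("analytics")

# Same data set, two different uses.
revenue = sum(amount for _, amount in billing.poll(orders))
orders_per_customer = Counter(customer for customer, _ in analytics.poll(orders))

# A new event arrives; only billing polls again.
orders.append(("bob", 10))
revenue += sum(amount for _, amount in billing.poll(orders))
```

Because each consumer keeps its own offset, billing moving ahead does not affect analytics, which can pick up the new event whenever it polls next.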

Try Apache Kafka on Platform.sh today to get your “single source of truth.”

Platform.sh
Learn more about us.
Get started with a free trial.
Have a question? Get in touch!

Platform.sh on social media
Twitter @platformsh
Twitter (France): @platformsh_fr
LinkedIn: Platform.sh
LinkedIn (France): Platform.sh
Facebook: Platform.sh

Watch, listen, and subscribe to the Platform.sh Deploy Friday podcast:
YouTube
Apple Podcasts
Buzzsprout

Platform.sh is a robust, reliable hosting platform that gives development teams the tools to build and scale applications efficiently. Whether you run one or one thousand websites, you can focus on creating features and functionality with your favorite tech stack and leave managing infrastructure and processes to us.