Programster's Blog

Tutorials focusing on Linux, programming, and open-source

Getting Started With Apache Kafka

This tutorial will get us started with Apache Kafka on Debian 8.

What Is Kafka?

Kafka is a new publish-subscribe messaging system built on a distributed, partitioned, and replicated commit log. It can scale horizontally without downtime, and achieves durability by persisting messages to disk and replicating them within the cluster to prevent data loss. Each broker can reportedly handle terabytes of messages without a performance impact, but I have not tested this.

Requirements

Steps

Go to the downloads page and download the latest release, which at the time of writing this tutorial is 0.9.0.1. Then extract the archive (it is a pre-built binary release, not source code) and navigate into the extracted folder:

wget http://ftp.fau.de/apache/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz
tar --extract --gzip --file kafka*
cd kafka_*
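
The scripts under bin/ launch Java processes, so a Java runtime needs to be on the PATH before going any further. Below is a minimal check. The package name in the hint is an assumption (Debian 8 ships OpenJDK 7 as openjdk-7-jre-headless), and check_java is just a helper name made up for this sketch:

```shell
# Report whether a Java runtime is available on the PATH.
check_java() {
    if command -v java >/dev/null 2>&1; then
        # java prints its version banner on stderr
        java -version 2>&1 | head -n 1
    else
        # Assumed package name for Debian 8; adjust for your system.
        echo "java not found; try: sudo apt-get install openjdk-7-jre-headless"
        return 1
    fi
}

check_java || echo "Install a Java runtime before continuing."
```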

Kafka requires ZooKeeper (another Apache product) in order to run. Setting up a ZooKeeper cluster is beyond the scope of this tutorial, so for now we will just have this instance run a ZooKeeper cluster of one. Luckily, the Kafka download provides an easy way to do this:

./bin/zookeeper-server-start.sh config/zookeeper.properties
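
To confirm ZooKeeper is actually answering, we can send it the "ruok" four-letter command from another terminal. A healthy server replies "imok". This is only a sketch: it assumes netcat (nc) is installed and that ZooKeeper is listening on port 2181, the clientPort in the shipped config/zookeeper.properties. zk_ok is a helper name invented for this example:

```shell
# Ask ZooKeeper "are you ok?"; succeeds only if the reply is "imok".
# Usage: zk_ok [host] [port]   (defaults: localhost 2181)
zk_ok() {
    reply="$(echo ruok | nc -w 2 "${1:-localhost}" "${2:-2181}" 2>/dev/null)"
    [ "$reply" = "imok" ]
}

if zk_ok; then
    echo "ZooKeeper is up"
else
    echo "ZooKeeper is not answering on port 2181"
fi
```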

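Now we can start the Kafka server. The ZooKeeper script keeps running in the foreground, so open a new terminal in the same folder first: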

./bin/kafka-server-start.sh config/server.properties
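
Once the broker has finished starting, a quick way to confirm it works is to create a topic, publish one message, and read it back using the console tools that ship in bin/. The sketch below assumes the default ports from the shipped config files (2181 for ZooKeeper, 9092 for the broker). The topic name and the kafka_smoke_test function are made up for this example:

```shell
# Assumed defaults from the shipped config files; adjust if you changed them.
ZK="localhost:2181"       # clientPort in config/zookeeper.properties
BROKER="localhost:9092"   # broker port in config/server.properties
TOPIC="smoke-test"        # hypothetical topic name, used only for this check

kafka_smoke_test() {
    # Create a single-partition, unreplicated topic.
    ./bin/kafka-topics.sh --create --zookeeper "$ZK" \
        --replication-factor 1 --partitions 1 --topic "$TOPIC" || return 1
    # Publish one message read from standard input.
    echo "hello kafka" | ./bin/kafka-console-producer.sh \
        --broker-list "$BROKER" --topic "$TOPIC" || return 1
    # Read the topic from the start and exit after the first message.
    ./bin/kafka-console-consumer.sh --zookeeper "$ZK" \
        --topic "$TOPIC" --from-beginning --max-messages 1
}
```

Run kafka_smoke_test from the extracted Kafka folder in a third terminal. If everything is wired up, the consumer should print the message back.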

We now have a Kafka server running. Next time we'll learn about interfacing with it using PHP and setting up an actual cluster.

Last updated: 16th August 2018
First published: 16th August 2018

This blog is created by Stuart Page

I'm a freelance web developer and technology consultant based in Surrey, UK, with over 10 years experience in web development, DevOps, Linux Administration, and IT solutions.
