Kafka on Kubernetes — a good fit?

Johann Gyger
7 min read · Apr 4, 2019

Introduction

Kubernetes is designed to run stateless workloads. These workloads typically come in the form of a microservices architecture, are lightweight, scale well horizontally, adhere to the 12-factor app principles, and can deal with circuit breakers and chaos monkeys.

Kafka, on the other hand, is essentially a distributed database. This means you have to deal with state, and it is much more heavyweight than a microservice. Kubernetes supports stateful workloads, but you have to treat them with caution, as Kelsey Hightower points out in two recent tweets:

So, should you be running Kafka on Kubernetes? My counter-question is: does Kafka run any better without it? That's why I'm going to point out how Kafka and Kubernetes can complement each other, and which pitfalls you might encounter along the way.

Runtime

Let’s have a look at the basic stuff first — the runtime itself.

Process

Kafka brokers are CPU-friendly. TLS might introduce some overhead. Kafka clients need more CPU if they use encryption, but this does not impact the brokers.

Memory

Kafka brokers are memory-eaters. The JVM heap can usually be limited to 4–5 GB, but you also need enough system memory on top of that because Kafka makes heavy use of the page cache. In Kubernetes, set the container resource requests and limits accordingly.
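
As a sketch, the heap cap and the page-cache headroom could be expressed like this. The values are illustrative, not a sizing recommendation; `KAFKA_HEAP_OPTS` is the variable the standard Kafka start scripts honor:

```yaml
# Fragment of a pod template for a Kafka broker container.
containers:
  - name: kafka
    env:
      - name: KAFKA_HEAP_OPTS
        value: "-Xms4g -Xmx4g"   # cap the JVM heap at 4 GB
    resources:
      requests:
        cpu: "1"
        memory: 8Gi              # leaves ~4 GB of headroom for the page cache
      limits:
        memory: 8Gi              # deliberately no CPU limit, to avoid throttling
```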

Storage

Storage in containers is ephemeral — data is lost after a restart. Using an emptyDir volume for the Kafka data has the same effect: your broker data is lost after termination. Your messages might still be available as replicas on other brokers, but after a restart the failed broker first has to replicate all of its data, which can be a time-consuming process.

That’s why you should use persistent storage — to be precise, non-local persistent block storage with XFS or ext4. Don’t use NFS. I warned you. Neither NFS v3 nor v4 will work: in short, a Kafka broker will terminate itself if it cannot delete a data directory, due to the NFS “silly rename” problem. If you still don’t believe me, read this blog post very carefully. The storage has to be non-local so that Kubernetes is free to choose another node after restarts or relocations.
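
In a StatefulSet, this translates into a volume claim template; each broker gets its own persistent volume that survives pod restarts. The storage class name below is an assumption — use whatever your cluster provides for non-local block storage:

```yaml
# Fragment of a StatefulSet spec: one persistent volume per broker.
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard   # assumed name; must NOT be NFS-backed
      resources:
        requests:
          storage: 100Gi
```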

Network

As with most distributed systems, Kafka performance heavily depends on low network latency and high bandwidth. Don’t be tempted to put all brokers on the same node, as this would reduce availability: if that Kubernetes node goes down, the whole Kafka cluster goes down with it. Don’t stretch a Kafka cluster across data centers either. The same applies to Kubernetes clusters. Different availability zones are a good trade-off.

Configuration

Plain Manifests

The Kubernetes website contains a very good tutorial on how to set up ZooKeeper using manifests. As ZooKeeper ships with Kafka, this is a good starting point to learn which Kubernetes concepts are being applied. Once understood, you can use the same concepts for a Kafka cluster, too.

  • Pod: A pod is the smallest deployable unit in Kubernetes. It contains your workload and represents a process in your cluster. A pod holds one or more containers. Each ZooKeeper server in the ensemble and each Kafka broker in the cluster will run in a separate pod.
  • StatefulSet: A StatefulSet is a Kubernetes object that manages multiple stateful workloads which need coordination. StatefulSets provide guarantees about the ordering and uniqueness of their pods.
  • Headless Services: Services decouple pods from clients through a logical name, with Kubernetes taking care of load balancing. With stateful workloads such as ZooKeeper and Kafka, however, clients have to communicate with a specific instance. This is where headless services come into play: each pod behind the service gets its own stable DNS name, so a client can address a specific pod directly.
  • Persistent Volumes: These are needed to provision the non-local persistent block storage mentioned above.
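
A minimal headless service for the brokers could look like this (the names and port are assumptions for illustration):

```yaml
# Headless service: clusterIP is None, so DNS resolves to the pod IPs.
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
spec:
  clusterIP: None
  selector:
    app: kafka
  ports:
    - name: broker
      port: 9092
```

With this in place, each pod in the StatefulSet gets a stable DNS name of the form `kafka-0.kafka-headless.<namespace>.svc.cluster.local`, which is exactly what Kafka’s advertised listeners need.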

Yolean provides a comprehensive set of manifests to get you started with Kafka on Kubernetes.

Helm Charts

Helm is a package manager for Kubernetes, comparable to OS package managers like yum, apt, Homebrew or Chocolatey. It allows you to install predefined software packages described in Helm charts. A well-curated Helm chart simplifies the complex task of properly configuring all the parameters needed to run Kafka on Kubernetes. Several charts for Kafka are available: an official one in incubator state, one from Confluent, and another from Bitnami, to name a few.
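
Charts are customized through a values file. As a sketch, an override for a Kafka chart might look like the following — the key names differ between charts, so check your chart’s own documentation before using them:

```yaml
# values.yaml sketch; key names are chart-specific assumptions.
replicas: 3
persistence:
  enabled: true
  size: 100Gi
resources:
  requests:
    memory: 8Gi
```

You would then install it with something like `helm install --name my-kafka -f values.yaml incubator/kafka` (Helm 2 syntax, current at the time of writing).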

Operators

Due to some limitations of Helm, another tool is becoming quite popular: Kubernetes operators. An operator not only packages software for Kubernetes, it also encodes the operational knowledge needed to deploy and manage that software.

The list of awesome operators mentions two operators for Kafka, one of them being Strimzi. Strimzi makes it really easy to spin up a Kafka cluster in minutes. Almost no configuration is needed, and it adds some nifty features like cluster-internal point-to-point TLS encryption. Confluent has also announced an operator, which is coming soon.
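
With Strimzi, a whole Kafka cluster is declared as a single custom resource. The sketch below follows the v1alpha1 schema current at the time of writing; the apiVersion and field layout change between Strimzi releases, so treat it as illustrative:

```yaml
# Strimzi Kafka custom resource sketch (schema version may differ).
apiVersion: kafka.strimzi.io/v1alpha1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      plain: {}
      tls: {}                  # cluster-internal TLS encryption
    storage:
      type: persistent-claim
      size: 100Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
```

Applying this with `kubectl apply -f` is all it takes — the operator creates the StatefulSets, services and volumes for you.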

Performance

Running performance tests to benchmark your Kafka installation is very important. It gives you insights into possible bottlenecks before you run into trouble. Luckily, Kafka already ships with two performance test tools: kafka-producer-perf-test.sh and kafka-consumer-perf-test.sh. Use them extensively. For reference, you can use the results from this blog by Jay Kreps or this review of Amazon MSK by Stéphane Maarek.
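
As a starting point, a round-trip test could look like this (the topic name and bootstrap address are assumptions; run the commands from a pod inside the cluster):

```shell
# Produce 1M messages of 100 bytes at an unthrottled rate (-1):
kafka-producer-perf-test.sh \
  --topic perf-test \
  --num-records 1000000 \
  --record-size 100 \
  --throughput -1 \
  --producer-props bootstrap.servers=kafka:9092

# Consume the same messages back and measure throughput:
kafka-consumer-perf-test.sh \
  --topic perf-test \
  --messages 1000000 \
  --broker-list kafka:9092
```

Vary the record size and throughput cap to match your expected workload, and run the tests against your persistent storage, not emptyDir — that is where the surprises tend to be.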

Operations

Monitoring

Visibility is very important — otherwise you won’t know what’s going on. Nowadays there is decent tooling for metrics-based monitoring in a cloud-native manner. Prometheus and Grafana are two popular tools. Prometheus can collect metrics from all the Java processes (Kafka, ZooKeeper, Kafka Connect) with the JMX exporter in a straightforward way. Adding cAdvisor metrics gives you additional insights into Kubernetes resource usage.
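
The JMX exporter is driven by a small config file mapping MBeans to Prometheus metrics. A minimal sketch might look like this — real-world configs, such as the one shipped with Strimzi, are far more extensive:

```yaml
# JMX exporter config sketch: export the broker topic metrics.
lowercaseOutputName: true
rules:
  - pattern: kafka.server<type=BrokerTopicMetrics, name=(.+)><>Count
    name: kafka_server_brokertopicmetrics_$1_total
```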

Strimzi has a very nice example Grafana dashboard for Kafka. It visualizes key metrics like under-replicated and offline partitions in a very intuitive way, and complements those metrics with resource usage, performance, and stability indicators. So you get basic Kafka cluster monitoring for free!

You will still have to complement this with client monitoring (consumer and producer metrics), as well as lag monitoring with Burrow and end-to-end monitoring with Kafka Monitor.

Logging

Logging is another crucial part. Make sure that all containers in your Kafka installation log to stdout and stderr and make sure your Kubernetes cluster aggregates all the logs in a central logging infrastructure such as Elasticsearch.

Health checks

Kubernetes uses liveness and readiness probes to find out if your pods are healthy. If the liveness probe fails, Kubernetes will kill the container and automatically restart it, provided the restart policy is set accordingly. If the readiness probe fails, Kubernetes will stop routing service traffic to the pod. This means no manual intervention is needed in such cases — a big plus.
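
For a Kafka broker, a simple TCP check on the client port is a common choice. The port and timings below are assumptions to adjust for your environment:

```yaml
# Probe sketch for a Kafka broker container.
livenessProbe:
  tcpSocket:
    port: 9092
  initialDelaySeconds: 60   # give the broker time to start and recover logs
  periodSeconds: 10
readinessProbe:
  tcpSocket:
    port: 9092
  initialDelaySeconds: 30
  periodSeconds: 10
```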

Rolling updates

StatefulSets support automated updates: the RollingUpdate strategy updates each Kafka pod one at a time. With this you can achieve zero downtime — another big plus that comes with Kubernetes.
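
This is a one-line setting in the StatefulSet spec:

```yaml
# StatefulSet fragment: pods are replaced one at a time,
# in reverse ordinal order, waiting for each to become ready.
updateStrategy:
  type: RollingUpdate
```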

Scaling

Scaling a Kafka cluster is not an easy task. However, Kubernetes makes it very easy to scale pods to a certain number of replicas, which means the desired number of Kafka brokers can be defined declaratively. The hard part is reassigning partitions after scaling up or before scaling down.
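
The easy half and the hard half look roughly like this (the StatefulSet name is an assumption, and `reassignment.json` must be generated beforehand, e.g. with the tool’s `--generate` option):

```shell
# Easy: declare the new broker count.
kubectl scale statefulset kafka --replicas=5

# Hard: the new brokers start empty; spread existing partitions
# onto them with Kafka's reassignment tool.
kafka-reassign-partitions.sh \
  --zookeeper zookeeper:2181 \
  --reassignment-json-file reassignment.json \
  --execute
```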

Administration

Administrative tasks on your Kafka cluster, such as creating topics and reassigning partitions, can be done with the existing shell scripts by opening a shell into one of your pods. This is not a nice solution, though. Strimzi supports managing topics with another operator. There’s still room for improvement.
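
With Strimzi’s topic operator, a topic becomes a declarative resource instead of a shell command. A sketch, again following the schema current at the time of writing:

```yaml
# KafkaTopic custom resource sketch (schema version may differ).
apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster   # ties the topic to the Kafka cluster
spec:
  partitions: 12
  replicas: 3
  config:
    retention.ms: 604800000          # 7 days
```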

Backup & Restore

The availability of Kafka now also depends on the availability of Kubernetes. If your Kubernetes cluster goes down, then in the worst case your Kafka cluster goes down as well. Murphy’s law tells you that this will happen to you too, and that you will lose data. To mitigate this risk, make sure you have a backup concept in place. MirrorMaker is one possibility; another is to leverage S3 for backups with a connector, as described in this blog post by Zalando.

Conclusion

For small to medium-sized Kafka clusters I would definitely go with Kubernetes, as it provides more flexibility and simplifies operations. If you have very high non-functional requirements in terms of latency and/or throughput, a different deployment option might be more beneficial.
