
David Konatschnig
IT-Architect
Until recently, running a Kafka cluster was considered a major challenge, as it requires a great deal of know-how and experience.
This has changed with the advent of Kubernetes operators. An operator supports you by automating complex work steps. An operator is particularly well suited for PoCs (proofs of concept), as it makes it very easy to get an operational cluster up and running and thus gain experience quickly. The public cloud is ideal as the basis for an elastic setup.
Since I wrote my blog post Apache Kafka and what the hype is about three years ago, a lot has happened in the world of stream processing. Apache Kafka is definitely no longer just hype, as the impressive adoption figures published by Confluent show.
Last but not least, one of the main reasons is probably the wide range of possible use cases; Mario Maric describes some of these in his blog post.
This popularity has also created a fast-growing community that works intensively on the development of Apache Kafka itself as well as on tools and products in the Kafka ecosystem.
However, setting up and running a Kafka cluster and its ecosystem (Kafka Connect, Schema Registry, KSQL, ...) poses challenges for IT departments: it requires a lot of technical know-how and a suitably staffed team.
There are different approaches to running Kafka clusters, ranging from fully managed services to self-managed deployments.
In this blog post, I would like to show you what you need to deploy a self-managed Kafka cluster in the public cloud in just a few steps, with the help of an operator.
Containers are indispensable in today's cloud-native world, and this also holds true for Kafka. Although Kafka can be deployed on bare metal and VMs as well, the de facto standard today is a container-based deployment with an orchestrator such as Kubernetes. Besides the ease of scaling that Kubernetes brings, there are many other advantages. One of them is Kubernetes operators.
A Kubernetes operator is nothing more than an extension to Kubernetes that establishes a desired state on a cluster. In simple terms, you define a state, i.e. what should be deployed on the cluster (e.g. a Kafka deployment with 3 brokers). The operator interprets this state, performs the operations needed to achieve it, and permanently checks that the state is maintained. Especially in a complex deployment like Kafka, where many configurations that are usually maintained "manually" have to interact, an operator is an enormous help.
Various commercial and open-source projects have taken advantage of this operator pattern by developing Kafka cluster operators. Two approaches in particular have proven successful: the operator built by Confluent, the driving force behind the open-source Apache Kafka project, and the open-source project Strimzi.
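To make the operator idea concrete, here is a minimal sketch of a Strimzi custom resource for a cluster with three brokers. The cluster name and storage sizes are illustrative assumptions, and the exact fields depend on the Strimzi version:

```yaml
# Sketch of a Strimzi "Kafka" custom resource; the operator reads this
# desired state and creates and maintains the corresponding deployment.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster            # illustrative name
spec:
  kafka:
    replicas: 3               # three Kafka brokers
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: persistent-claim
      size: 10Gi              # illustrative size
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
  entityOperator:
    topicOperator: {}
    userOperator: {}
```

Once applied with kubectl apply, the Strimzi operator takes over and continuously reconciles the cluster towards this state.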
More and more companies are venturing into the cloud - and rightly so, because the advantages are obvious: time-to-market, optimisation of costs, more agility, etc.
The cloud scores particularly well in terms of agility: managed Kubernetes clusters can be deployed and scaled with just a few clicks or CLI commands. Gone are the days of waiting several days for a service or system to be provisioned after raising an order ticket.
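As a sketch of how little is needed, a managed GKE cluster on Google Cloud can be created with a single CLI command (cluster name and region are illustrative assumptions):

```shell
# Create a managed Kubernetes (GKE) cluster with three worker nodes
gcloud container clusters create kafka-poc \
  --region europe-west6 \
  --num-nodes 3

# Fetch credentials so that kubectl can talk to the new cluster
gcloud container clusters get-credentials kafka-poc --region europe-west6
```

The other hyperscalers offer equivalent commands (e.g. eksctl for Amazon EKS or az aks create for Azure).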
While you can easily click together the infrastructure for a PoC, for a production setup you want this process to be predictable and repeatable. This task can be accomplished with the help of infrastructure-as-code (IaC) tools. One of the best-known tools in this area is Terraform from HashiCorp. With Terraform, cloud infrastructure can be described declaratively (i.e. based on a desired target state) and provisioned accordingly, and this works across all major hyperscalers such as Google, Amazon, Microsoft and even Alibaba Cloud. This can be attractive if you pursue a hybrid or multi-cloud strategy and thus serve several cloud providers simultaneously.
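A minimal Terraform sketch for such a managed cluster on Google Cloud might look like this (project id, names and region are illustrative assumptions):

```hcl
# Illustrative Terraform configuration for a managed GKE cluster
provider "google" {
  project = "my-project-id"   # illustrative project id
  region  = "europe-west6"
}

resource "google_container_cluster" "kafka_poc" {
  name               = "kafka-poc"
  location           = "europe-west6"
  initial_node_count = 3      # three worker nodes

  node_config {
    machine_type = "e2-standard-4"
  }
}
```

Running terraform plan shows the changes before terraform apply provisions them, which is what makes the process predictable and repeatable.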
Because the infrastructure is described in declarative configuration files, the whole setup can also be integrated very well into a GitOps process: changes to the infrastructure are versioned and can only be applied after an approval.
The following steps should be considered if you want to run Kafka self-managed in the public cloud:
Choosing a cloud provider
Managed Kubernetes offerings are now available from all hyperscalers, i.e. Google Cloud, Amazon and Microsoft. Depending on how far along your company already is on its cloud journey, the provider may already be predetermined. Governance aspects must also be considered here. These include questions such as:
Where is the data?
How must the data be encrypted?
Who has access to the data?
Combine the capabilities of a Kafka operator with the advantages of the public cloud and you get the best of both worlds: an easy path to an operational cluster, combined with elastic, on-demand infrastructure.
The Kafka process model helps you to proceed in a structured manner.