Accessing Kafka on Google Kubernetes Engine from the outside world

Running and operating a Kafka cluster on Kubernetes can be tricky. There are already many tutorials, blog posts, Helm charts, Operators, … that give detailed information on installing a working Kafka cluster on Kubernetes with StatefulSets, headless Services and other Kubernetes features. But after going through all of those and finally getting a cluster up and running, another challenge often appears: how can we access the Kafka cluster from outside Kubernetes?

By using a headless Service and a StatefulSet, each Pod gets a stable network identity and a matching DNS entry of the form $(pod-name).$(service-name). In our Kafka example, client applications running in the Kubernetes cluster can reach the Kafka brokers at kafka-0.kafka, kafka-1.kafka and kafka-2.kafka. But how can we make those Kafka Pods individually accessible from the outside world? This short post briefly explains two strategies.

First of all, in both solutions, every Kafka Pod must be accessible and discoverable from inside Kubernetes as well as from outside. A Kafka broker can be configured with multiple listeners and multiple advertised listeners. In our case, we will use one listener for internal communication and a second one that gives us remote access to the Kafka brokers.

For example:


export KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
export KAFKA_LISTENERS=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
export KAFKA_ADVERTISED_LISTENERS=INTERNAL://kafka-${HOSTNAME##*-}.kafka:9092,EXTERNAL://<your external name for this pod>:9093
export KAFKA_INTER_BROKER_LISTENER_NAME=INTERNAL

_tip: the hostname of a Pod in a StatefulSet has the form $(statefulset-name)-$(ordinal); using Bash parameter expansion, the ordinal can be extracted with ${HOSTNAME##*-}_
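To make the tip concrete, here is a minimal sketch of the expansion in isolation. The function name `broker_ordinal` is made up for this illustration; inside a real Pod you would simply use `${HOSTNAME##*-}` directly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kafka or Kubernetes): print the ordinal
# index found at the end of a StatefulSet Pod hostname.
broker_ordinal() {
  local hostname="$1"
  # '##*-' strips the longest prefix that ends in '-', so only the part
  # after the last dash (the ordinal) remains.
  echo "${hostname##*-}"
}

broker_ordinal "kafka-0"      # prints 0
broker_ordinal "my-kafka-12"  # prints 12
```

Because the longest matching prefix is removed, StatefulSet names that contain dashes themselves still yield only the trailing ordinal.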

An external address for each Kafka Pod

When running on a cloud provider such as Google Kubernetes Engine, a first, simple approach to exposing each instance of our Kafka cluster to the outside world is to create a Service of type LoadBalancer for each Pod, routing traffic to that specific Pod of the StatefulSet. Luckily, when the StatefulSet controller creates a Pod, it also adds a label statefulset.kubernetes.io/pod-name with the unique name of the Pod. Using this label, we can create an exposed Service for each Pod of our Kafka StatefulSet.

An example of such a service definition:


apiVersion: v1
kind: Service
metadata:
  name: kafka-0
spec:
  type: LoadBalancer
  ports:
  - port: 9093
    name: kafka
  selector:
    service: kafka
    statefulset.kubernetes.io/pod-name: kafka-0
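Since one such Service is needed per broker, the definitions can be generated instead of written by hand. The following is only a sketch: the function name `render_broker_services` and the replica count of 3 are assumptions for illustration, and the manifest body simply repeats the example above with the Pod index filled in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one LoadBalancer Service manifest per broker.
# $1 is the number of replicas in the Kafka StatefulSet (assumed to be known).
render_broker_services() {
  local replicas="$1"
  local i=0
  while [ "$i" -lt "$replicas" ]; do
    # Each Service selects exactly one Pod through the pod-name label
    # that the StatefulSet controller adds.
    cat <<EOF
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-${i}
spec:
  type: LoadBalancer
  ports:
  - port: 9093
    name: kafka
  selector:
    service: kafka
    statefulset.kubernetes.io/pod-name: kafka-${i}
EOF
    i=$((i + 1))
  done
}

# Print the manifests for a three-broker cluster.
render_broker_services 3
```

The output is a multi-document YAML stream, so it could be reviewed first and then piped into `kubectl apply -f -`.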

When these Service definitions are applied for each Pod of the StatefulSet, the cloud provider provisions the necessary resources (such as firewall rules, network load balancers, …) and in the end you have an external address for each Kafka broker running on your Kubernetes cluster. The last things to do are creating a DNS entry for each external address and configuring the Pods with their external DNS names:


export KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
export KAFKA_LISTENERS=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
export KAFKA_ADVERTISED_LISTENERS=INTERNAL://kafka-${HOSTNAME##*-}.kafka:9092,EXTERNAL://kafka-${HOSTNAME##*-}.example.com:9093
export KAFKA_INTER_BROKER_LISTENER_NAME=INTERNAL

$ kafkacat -L -b kafka-0.example.com:9093
Metadata for all topics (from broker 0: kafka-0.example.com:9093/0):
 3 brokers:
  broker 2 at kafka-2.example.com:9093
  broker 1 at kafka-1.example.com:9093
  broker 0 at kafka-0.example.com:9093
 1 topics:
  topic "__confluent.support.metrics" with 1 partitions:
    partition 0, leader 0, replicas: 0, isrs: 0
$

This approach is used in the tutorial Kafka-as-a-Service on GKE with Terraform.

The only caveat here is that you need to create a DNS entry for each Pod, so on to the next solution.

One external address with a different port for each Kafka Pod

If we would like to have a single external address for all of our Kafka Pods, we need some kind of load balancer that routes traffic to the correct Kafka Pod. AppsCode Voyager Ingress Controller to the rescue!

Voyager is a HAProxy backed secure L7 and L4 ingress controller for Kubernetes developed by AppsCode.
This can be used with any Kubernetes cloud providers including aws, gce, gke, azure, acs.
This can also be used with bare metal Kubernetes clusters.

The two features of AppsCode Voyager that we need to expose our Kafka cluster:

  1. TCP LoadBalancing
  2. Forward Traffic to specific Pods of a StatefulSet

So, after installing our Kafka cluster and the AppsCode Voyager, let’s add an ingress definition to achieve our goal:


apiVersion: voyager.appscode.com/v1beta1
kind: Ingress
metadata:
  name: kafka-ingress
  labels:
    service: kafka
spec:
  labels:
    service: kafka
  rules:
  - tcp:
      port: '9000'
      backend:
        hostNames:
        - kafka-0
        serviceName: kafka
        servicePort: '9093'
  - tcp:
      port: '9001'
      backend:
        hostNames:
        - kafka-1
        serviceName: kafka
        servicePort: '9093'
  - tcp:
      port: '9002'
      backend:
        hostNames:
        - kafka-2
        serviceName: kafka
        servicePort: '9093'
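The rules differ only in the port and the Pod name, so for larger clusters they could be generated. This is a rough sketch, not something Voyager provides: `render_tcp_rules`, its arguments and the default base port of 9000 are assumptions for illustration, and the output is only the list of rules to place under `spec.rules`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one TCP rule per broker, mapping external port
# (base_port + ordinal) to port 9093 on the matching Pod.
# $1 = number of brokers, $2 = first external port (defaults to 9000).
render_tcp_rules() {
  local replicas="$1"
  local base_port="${2:-9000}"
  local i=0
  while [ "$i" -lt "$replicas" ]; do
    cat <<EOF
  - tcp:
      port: '$((base_port + i))'
      backend:
        hostNames:
        - kafka-${i}
        serviceName: kafka
        servicePort: '9093'
EOF
    i=$((i + 1))
  done
}

# Print the rules for a three-broker cluster (ports 9000 to 9002).
render_tcp_rules 3
```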

Change your broker configuration:


export KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
export KAFKA_LISTENERS=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
export KAFKA_ADVERTISED_LISTENERS=INTERNAL://kafka-${HOSTNAME##*-}.kafka:9092,EXTERNAL://kafka.example.com:900${HOSTNAME##*-}
export KAFKA_INTER_BROKER_LISTENER_NAME=INTERNAL
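Note that the string concatenation 900${HOSTNAME##*-} only yields a valid port for ordinals 0 to 9; broker 10 would end up with 90010. If the cluster might grow beyond ten brokers, the port can be computed arithmetically instead. The sketch below assumes the ingress ports keep following the pattern 9000 + ordinal; the function name `external_port` is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute the external port for a broker from its
# StatefulSet hostname. $2 is the first external port (defaults to 9000).
external_port() {
  local hostname="$1"
  local base_port="${2:-9000}"
  # Ordinal is whatever follows the last dash in the hostname.
  local ordinal="${hostname##*-}"
  echo $((base_port + ordinal))
}

external_port "kafka-2"   # prints 9002
external_port "kafka-10"  # prints 9010
```

In the broker configuration this could be used as EXTERNAL://kafka.example.com:$(external_port "$HOSTNAME").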

When applied correctly:


$ kafkacat -L -b kafka.example.com:9000
Metadata for all topics (from broker 0: kafka.example.com:9000/0):
 3 brokers:
  broker 2 at kafka.example.com:9002
  broker 1 at kafka.example.com:9001
  broker 0 at kafka.example.com:9000
 1 topics:
  topic "__confluent.support.metrics" with 1 partitions:
    partition 0, leader 0, replicas: 0, isrs: 0
$

 

 


Credits: blogpost by Johan Siebens, Kubernetes Master at ToThePoint
