Two weeks ago we introduced our Kafka Spotguide for Kubernetes – the easiest way to deploy and operate Apache Kafka on Kubernetes. Since then, it’s been integrated into our application and DevOps container management platform, Pipeline, among other spotguides such as Spark on Kubernetes, Zeppelin, NodeJS and Golang, just to name a few.
Because we’ve already met our goal of making it easy to set up a Kafka cluster on Kubernetes with just a few clicks and in less than ten minutes, including provisioning and operating its entire infrastructure in both Kubernetes and Kafka, we’ve shifted our focus to Kafka security.
The Pipeline platform makes enterprise-grade security easy to consume; you can read more on how we tackle security through multiple layers and components, here, or read about the CIS Kubernetes benchmark we passed, here.
On a default Kafka installation, any user or application can write messages to any topic, as well as read data from any topic. Because Kafka is usually accessed by multiple applications or teams, and/or the information flying through it is confidential, security is a must. While there are multiple ways of tackling this problem, cloud- and Kubernetes-based environments bring an added level of complexity. This is exactly what the Banzai Cloud Pipeline platform makes simple and automates. Keep reading to learn about our method for securing Kafka on Kubernetes.
Kafka security (or general security) can be broken down into three main areas. Documentation pertaining to Kafka security is available on the Apache Kafka site, but these are the high-level topics one should go over when considering how best to secure Kafka: encryption of data in transit, authentication of clients, and authorization of what those clients may do.
The Kafka documentation uses the term SSL when it actually means TLS. For consistency’s sake, we will use the term SSL as well, but what we mean is TLS.
This post is not intended to be an exhaustive Kafka security guideline, since there’s already a whole lot of documentation out there. In the following sections, we’ll discuss only those security options made available with the Kafka Spotguide.
Messages routed towards, within, or out of a Kafka cluster are unencrypted by default. By enabling SSL support we can avoid man-in-the-middle attacks and securely transmit data over the network. The Banzai Cloud Pipeline Kafka spotguide allows users to choose between four strategies; the spotguide then does the rest.
In the event someone chooses None, the widely popular (but equally insecure) gRPC and REST proxy for Kafka, Mailgun’s kafka-pixy, is installed. Unfortunately, that proxy does not support encryption, so it’s only available in this case.
The Banzai Cloud Pipeline platform generates the required certificates, but the user can still bring their own. As is usual for Pipeline, the certificates are stored in Vault and managed by our Vault operator for Kubernetes.
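To make the client side of SSL more concrete, here is a minimal sketch of the properties a Kafka client typically needs once encryption is enabled. The property keys are standard Kafka client settings; the helper names, paths, and passwords are hypothetical, and this is not the code the spotguide itself runs.

```python
from typing import Optional


def ssl_client_properties(truststore: str, truststore_password: str,
                          keystore: Optional[str] = None,
                          keystore_password: Optional[str] = None) -> dict:
    """Build Kafka client settings for an SSL (TLS) listener.

    The truststore lets the client verify the brokers. The keystore is only
    needed when the brokers also require client certificates.
    """
    props = {
        "security.protocol": "SSL",
        "ssl.truststore.location": truststore,
        "ssl.truststore.password": truststore_password,
    }
    if keystore is not None:
        props["ssl.keystore.location"] = keystore
        props["ssl.keystore.password"] = keystore_password or ""
    return props


def render_properties(props: dict) -> str:
    """Serialize settings as key=value lines, sorted by key.

    Simplification: values are written as-is, without the escaping a full
    Java properties writer would apply.
    """
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items()))


if __name__ == "__main__":
    # Hypothetical path and password, for illustration only.
    print(render_properties(
        ssl_client_properties("/etc/kafka/certs/truststore.jks", "changeit")))
```

In a setup like ours, the stores and passwords would come from a secret store rather than being hard-coded.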
Kafka supports multiple authentication options; our focus is currently on SASL/SCRAM support, or, to be more specific, SCRAM over SSL. SASL stands for Simple Authentication and Security Layer, but it’s not simple at all. No problem: we’ve automated everything. This approach comes to us from big data’s legacy, the idea being that authentication should be separated from the Kafka protocol, and that usernames and password hashes should be stored in ZooKeeper.
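The point about storing password hashes rather than passwords can be illustrated with the key derivation that SCRAM defines in RFC 5802. The sketch below derives the values a server keeps for one user with SCRAM-SHA-256. The function names and the comma-separated output format are our assumptions for illustration, and password normalization (SASLprep) is skipped for brevity.

```python
import base64
import hashlib
import hmac
import os


def scram_credential(password: str, salt: bytes, iterations: int = 4096) -> dict:
    """Derive the SCRAM-SHA-256 values a server stores instead of the password."""
    # SaltedPassword := Hi(password, salt, i), i.e. PBKDF2 with HMAC-SHA-256.
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key"); only its hash is stored.
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key").
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw: bytes) -> str:
        return base64.b64encode(raw).decode("ascii")

    return {
        "salt": b64(salt),
        "stored_key": b64(stored_key),
        "server_key": b64(server_key),
        "iterations": iterations,
    }


def format_credential(cred: dict) -> str:
    """Render the credential as one line (an assumed, illustrative layout)."""
    return ("salt={salt},stored_key={stored_key},"
            "server_key={server_key},iterations={iterations}").format(**cred)


if __name__ == "__main__":
    # A fresh random salt per user; the password here is a placeholder.
    print(format_credential(scram_credential("example-password", os.urandom(16))))
```

Because only the salt, iteration count, and derived keys are kept, reading the stored record does not directly reveal the password.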
When choosing this option, the Spotguide performs all the required changes, from configuring the brokers to accept secure connections to generating a JAAS file.
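For readers who have not seen one, a JAAS file for SCRAM is short. The following sketch renders a broker-side `KafkaServer` section and the matching client settings. `ScramLoginModule`, `SASL_SSL`, and the property keys are standard Kafka names; the helper functions and credentials are hypothetical, and the output is not necessarily what the Spotguide generates.

```python
SCRAM_MODULE = "org.apache.kafka.common.security.scram.ScramLoginModule"


def _check(value: str) -> str:
    """Reject characters that would need escaping inside a quoted JAAS value."""
    if '"' in value or "\\" in value:
        raise ValueError("quotes and backslashes are not supported in this sketch")
    return value


def broker_jaas_file(username: str, password: str) -> str:
    """Render a KafkaServer JAAS section that uses the SCRAM login module."""
    return (
        "KafkaServer {\n"
        f"    {SCRAM_MODULE} required\n"
        f'    username="{_check(username)}"\n'
        f'    password="{_check(password)}";\n'
        "};\n"
    )


def scram_client_properties(username: str, password: str,
                            mechanism: str = "SCRAM-SHA-256") -> dict:
    """Build client settings for SASL/SCRAM over an encrypted listener."""
    if mechanism not in ("SCRAM-SHA-256", "SCRAM-SHA-512"):
        raise ValueError(f"unsupported mechanism: {mechanism}")
    jaas = (f'{SCRAM_MODULE} required username="{_check(username)}" '
            f'password="{_check(password)}";')
    return {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": mechanism,
        "sasl.jaas.config": jaas,
    }


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(broker_jaas_file("admin", "admin-secret"))
```

Clients still need the SSL truststore settings in addition to these, since the SASL exchange runs over the encrypted connection.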
Once Kafka clients are authenticated, Kafka needs to be able to decide what they can or can’t do. Authorization is our friend in this case, controlled by Access Control Lists (ACL). The Kafka Spotguide adds a set of ACLs when configuring the brokers. There is an admin user (which works only inside the cluster) with all the rights (super.users=User:admin) necessary to create topics, ACLs, and to read/write on all topics. Another user (username) is created to access topics (the spotguide-kafka topic) from outside of the cluster.
Note that we are using authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer; however, this can always be changed in the broker config.
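The decision an ACL-based authorizer makes can be sketched in a few lines. This is a deliberately simplified model, not Kafka's actual authorizer: it ignores hosts, wildcards, and resource patterns, and the class names are ours. The operations granted to the external user (read and write) are an assumption for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Acl:
    principal: str      # e.g. "User:username"
    operation: str      # e.g. "Read", "Write", "Create", or "All"
    resource: str       # e.g. "Topic:spotguide-kafka"
    allow: bool = True  # False models an explicit deny rule


class SimpleAuthorizer:
    """Toy ACL check: super users pass, deny rules win, default is deny."""

    def __init__(self, super_users: Iterable[str], acls: Iterable[Acl]):
        self.super_users = set(super_users)
        self.acls = list(acls)

    def authorize(self, principal: str, operation: str, resource: str) -> bool:
        # Super users bypass ACL evaluation entirely.
        if principal in self.super_users:
            return True
        matching = [
            acl for acl in self.acls
            if acl.principal == principal
            and acl.resource == resource
            and acl.operation in (operation, "All")
        ]
        # An explicit deny takes precedence over any allow.
        if any(not acl.allow for acl in matching):
            return False
        # With no matching allow rule, the request is rejected.
        return any(acl.allow for acl in matching)


if __name__ == "__main__":
    authorizer = SimpleAuthorizer(
        super_users=["User:admin"],
        acls=[
            Acl("User:username", "Read", "Topic:spotguide-kafka"),
            Acl("User:username", "Write", "Topic:spotguide-kafka"),
        ],
    )
    print(authorizer.authorize("User:username", "Read", "Topic:spotguide-kafka"))
    print(authorizer.authorize("User:username", "Read", "Topic:other"))
```

The deny-by-default behavior is the important part: a user with no matching ACL gets nothing, while the admin user listed under super.users is never checked against the list.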
Our work doesn’t stop here. Some of our Kafka Spotguide users have been asking for additional features, while at the same time, there are limitations we’d like to address. These are the high level changes coming soon:
Banzai Cloud’s Pipeline provides a platform for enterprises to develop, deploy, and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures — multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, CI/CD, and so on — are default features of the Pipeline platform.