In-depth introduction to Kubernetes admission webhooks

Monday, September 24th, 2018
Banzai Cloud’s Pipeline platform is an operating system which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike.
One of the main features of the Pipeline platform is that it allows enterprises to run workloads cost effectively by mixing spot instances with regular ones, without sacrificing overall reliability. This requires quite a lot of behind-the-scenes magic built on top of core Kubernetes building blocks. In a previous post we discussed how we use taints and tolerations, and pod and node affinities; in this post we'd like to delve into Kubernetes webhooks. Webhooks are widely used across Pipeline. In keeping with our spot instance example above, we use them to validate and/or mutate deployments when placing pods on spot or preemptible instances.
Kubernetes provides a lot of ways to extend its built-in functionality. Perhaps the most frequently utilized extension points are custom resource types and custom controllers. However, there are some other very interesting features in Kubernetes, like admission webhooks and initializers. These are also extension points in the API, so they can be used to modify the basic behaviour of some Kubernetes features. This definition is a little vague, so let's get our hands dirty and take a closer look at dynamic admission control, and at admission webhooks in particular.
Admission controllers
To start, let's take a look at the definition of admission controllers as it appears in the official Kubernetes documentation. We haven't arrived at admission webhooks yet, but we'll be there in a moment.
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized. [...] Admission controllers may be “validating”, “mutating”, or both. Mutating controllers may modify the objects they admit; validating controllers may not. [...] If any of the controllers in either phase reject the request, the entire request is rejected immediately and an error is returned to the end-user.
This means that there are special controllers that can intercept Kubernetes API requests, and modify or reject them based on custom logic. A list of previously implemented controllers comes with Kubernetes, or you can write your own. While they sound powerful, these controllers need to be compiled into kube-apiserver, and can only be enabled when the apiserver starts up.
That's where the dynamic part comes in. Admission webhooks and initializers address these limitations and provide a method of dynamic configuration. Of the two, initializers are the new guys on the block. They're only an alpha feature, and, as of writing, they are seldom used. We may write a blog post about initializers later, but for now let's turn our attention to admission webhooks.
What is an admission webhook?
There are two special admission controllers in the list included in the Kubernetes apiserver: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. These are special admission controllers that send admission requests to external HTTP callbacks and receive admission responses. If these two admission controllers are enabled, a Kubernetes administrator can create and configure an admission webhook in the cluster.
In broad strokes, the steps for doing that are as follows:
- Check if the admission webhook controllers are enabled in the cluster, and configure them if needed.
- Write the HTTP callback that will handle admission requests. The callback can be a simple HTTP server that's deployed to the cluster, or even a serverless function just like in Kelsey's validating webhook demo.
- Configure the admission webhook through the ValidatingWebhookConfiguration and MutatingWebhookConfiguration resources.
The difference between the two types of admission webhook is pretty self-explanatory: validating webhooks can reject a request, but they cannot modify the object they receive in the admission request, while mutating webhooks can modify objects by creating a patch that is sent back in the admission response. If a webhook rejects a request, an error is returned to the end-user.
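To make the difference concrete, here is a minimal Go sketch of the two response shapes. The `admissionResponse` and `status` types are simplified stand-ins written for this post, not the real types from `k8s.io/api`, and the helper names `reject` and `allowWithPatch` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// admissionResponse is a simplified stand-in for the response part of an
// AdmissionReview. It only carries the fields discussed in the text.
type admissionResponse struct {
	UID       string  `json:"uid"`
	Allowed   bool    `json:"allowed"`
	Status    *status `json:"status,omitempty"`
	Patch     []byte  `json:"patch,omitempty"` // encoding/json emits []byte as base64
	PatchType *string `json:"patchType,omitempty"`
}

// status carries the human-readable reason shown to the end-user.
type status struct {
	Message string `json:"message"`
}

// reject builds the kind of response a validating webhook returns
// when it denies a request: no patch, only a verdict and a reason.
func reject(uid, reason string) admissionResponse {
	return admissionResponse{UID: uid, Allowed: false, Status: &status{Message: reason}}
}

// allowWithPatch builds the kind of response a mutating webhook returns:
// the request is allowed, and a JSON patch describes the modification.
func allowWithPatch(uid string, patch []byte) admissionResponse {
	patchType := "JSONPatch"
	return admissionResponse{UID: uid, Allowed: true, Patch: patch, PatchType: &patchType}
}

func main() {
	responses := []admissionResponse{
		reject("uid-1", "required labels are not set"),
		allowWithPatch("uid-2", []byte(`[{"op":"add","path":"/metadata/labels","value":{"example":"not_available"}}]`)),
	}
	for _, resp := range responses {
		out, err := json.Marshal(resp)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```

Note that the patch travels as a base64-encoded byte string inside the JSON response, which is why the sketch keeps it as `[]byte`.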
If you're looking for a real world example of admission webhooks, check out how the Istio service mesh uses mutating webhooks to automatically inject Envoy sidecar containers into pods.
Creating and configuring an admission webhook
Now that we've covered theory, let's jump into the action and try it out in a real cluster. We'll create a webhook server and deploy it to a cluster first, then create the webhook configuration and see if it works.
Prerequisites
If you'd like to follow along, you'll need a Kubernetes cluster first. You can use Pipeline to create K8s clusters on one of the six supported cloud providers, but you can use any Kubernetes cluster. In the example below, I've created a Kubernetes cluster with Pipeline on Amazon EKS.
Make sure that the MutatingAdmissionWebhook and ValidatingAdmissionWebhook controllers are enabled in the apiserver, and check if the admission registration API is enabled in your cluster by running:
kubectl api-versions
Then check that admissionregistration.k8s.io/v1beta1 is among the results.
Writing the webhook
We can now write our admission webhook server. In our example, it will serve as both a validating and a mutating webhook by listening on two different HTTP paths: validate and mutate. Next, we'll figure out a simple task that can be easily implemented:
The Kubernetes documentation contains a common set of recommended labels that allows tools to work interoperably, describing objects in a common manner that all tools can understand. In addition to supporting tooling, the recommended labels describe applications in a way that can be queried.
In our validating webhook example, we'll make these labels required on deployments and services, so our webhook will reject every deployment and every service that doesn't have these labels set. Then we'll configure our mutating webhook, which will add any of the missing required labels with not_available set as the value.
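The core of both webhooks is a check for missing labels, which could be sketched in Go as follows. The label keys are the recommended `app.kubernetes.io/*` labels that appear in the mutated output later in this post; the function name `missingLabels` is hypothetical and not taken from the repository.

```go
package main

import "fmt"

// requiredLabels lists the recommended labels that the example webhook enforces.
var requiredLabels = []string{
	"app.kubernetes.io/name",
	"app.kubernetes.io/instance",
	"app.kubernetes.io/version",
	"app.kubernetes.io/component",
	"app.kubernetes.io/part-of",
	"app.kubernetes.io/managed-by",
}

// missingLabels returns the required label keys that are absent from labels.
// A nil map is treated like an object without any labels.
func missingLabels(labels map[string]string) []string {
	var missing []string
	for _, key := range requiredLabels {
		if _, ok := labels[key]; !ok {
			missing = append(missing, key)
		}
	}
	return missing
}

func main() {
	labels := map[string]string{"app.kubernetes.io/name": "sleep"}
	for _, key := range missingLabels(labels) {
		fmt.Println("missing:", key)
	}
}
```

The validating webhook only needs to know whether this list is empty, while the mutating webhook uses the list itself to decide which labels to add.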
The complete code for the webhook is available on GitHub. There is a great tutorial about mutating admission webhooks by morvencao, and we've used that repo as the basis for our blog post by forking and modifying it.
Our webhook will be a simple HTTP server with TLS that's deployed to our cluster as a deployment.
The main logic is in two files: main.go and webhook.go. The main.go file contains the parts necessary to create the HTTP server, while webhook.go contains the webhook logic that validates and/or mutates requests. For the sake of keeping this blog post clear, we won't copy large code snippets here, but feel free to follow the links in the text that point to sources on GitHub.
Most of the code is pretty simple; you should take a look. Start by checking out main.go; note how the HTTP server is started by using standard Go packages, and how the certificates for the TLS configuration are read from command line flags.
The next interesting part is the serve function. This is the entry point for handling both the incoming validating and mutating HTTP requests. The function does some basic content-type validation, unmarshals the AdmissionReview from the request, calls the corresponding mutate or validate function based on the URL path, and then marshals the AdmissionReview response.
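As a rough illustration of that flow, the sketch below implements a handler with the same steps using only the Go standard library. It is not the code from the repository: the `admissionReview` types are trimmed-down stand-ins, `validate` and `mutate` are placeholders that allow everything, and the `post` helper exists only to exercise the handler in-process.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Simplified stand-ins for the AdmissionReview request and response types.
type admissionRequest struct {
	UID string `json:"uid"`
}

type admissionResponse struct {
	UID     string `json:"uid"`
	Allowed bool   `json:"allowed"`
}

type admissionReview struct {
	Request  *admissionRequest  `json:"request,omitempty"`
	Response *admissionResponse `json:"response,omitempty"`
}

// Placeholder admission logic: both functions allow every request.
func validate(req *admissionRequest) *admissionResponse { return &admissionResponse{Allowed: true} }
func mutate(req *admissionRequest) *admissionResponse   { return &admissionResponse{Allowed: true} }

// serve checks the content type, decodes the review, dispatches on the
// URL path, and writes the review back with the response filled in.
func serve(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("Content-Type") != "application/json" {
		http.Error(w, "expected Content-Type application/json", http.StatusUnsupportedMediaType)
		return
	}
	body, err := io.ReadAll(r.Body)
	if err != nil || len(body) == 0 {
		http.Error(w, "empty or unreadable body", http.StatusBadRequest)
		return
	}
	var review admissionReview
	if err := json.Unmarshal(body, &review); err != nil || review.Request == nil {
		http.Error(w, "could not decode AdmissionReview", http.StatusBadRequest)
		return
	}
	var resp *admissionResponse
	switch r.URL.Path {
	case "/validate":
		resp = validate(review.Request)
	case "/mutate":
		resp = mutate(review.Request)
	default:
		http.NotFound(w, r)
		return
	}
	resp.UID = review.Request.UID // the response must echo the request UID
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(admissionReview{Response: resp}); err != nil {
		fmt.Println("could not write response:", err)
	}
}

// post runs a request through serve in-process and returns status and body.
func post(path, contentType, body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, path, strings.NewReader(body))
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	serve(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, out := post("/validate", "application/json", `{"request":{"uid":"abc"}}`)
	fmt.Println(code, out)
}
```

Using one handler for both paths keeps the decoding and encoding in a single place, so the validate and mutate functions only deal with admission logic.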
The main admission logic is in the validate and mutate functions. validate first checks if admission is required: we don't want to validate resources in the kube-system and kube-public namespaces, and don't want to validate a resource if there's an annotation explicitly telling us to ignore it (admission-webhook-example.banzaicloud.com/validate is set to false). If validation is required, the service or deployment resource is unmarshaled from the request, based on the resource kind, and its labels are compared against the required labels. If some labels are missing, Allowed is set to false in the response. If validation fails, the reason for failure will be written in the response and the end-user will receive it when trying to create a resource.
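The decision logic described above might look like the following Go sketch. The `objectMeta` struct is a simplified stand-in for the metadata of the unmarshaled resource, and the function names are illustrative; the repository's code may treat more annotation values than just `false` as opting out, which this sketch does not attempt to reproduce.

```go
package main

import "fmt"

// validateAnnotation lets a resource opt out of validation when set to "false".
const validateAnnotation = "admission-webhook-example.banzaicloud.com/validate"

// ignoredNamespaces are never validated.
var ignoredNamespaces = map[string]bool{"kube-system": true, "kube-public": true}

var requiredLabels = []string{
	"app.kubernetes.io/name",
	"app.kubernetes.io/instance",
	"app.kubernetes.io/version",
	"app.kubernetes.io/component",
	"app.kubernetes.io/part-of",
	"app.kubernetes.io/managed-by",
}

// objectMeta is a simplified stand-in for a resource's metadata.
type objectMeta struct {
	Namespace   string
	Labels      map[string]string
	Annotations map[string]string
}

// validationRequired reports whether the webhook should check this object.
func validationRequired(meta objectMeta) bool {
	if ignoredNamespaces[meta.Namespace] {
		return false
	}
	return meta.Annotations[validateAnnotation] != "false"
}

// validate returns the verdict and, on rejection, the reason for the end-user.
func validate(meta objectMeta) (allowed bool, reason string) {
	if !validationRequired(meta) {
		return true, ""
	}
	for _, key := range requiredLabels {
		if _, ok := meta.Labels[key]; !ok {
			return false, "required labels are not set"
		}
	}
	return true, ""
}

func main() {
	allowed, reason := validate(objectMeta{Namespace: "default"})
	fmt.Println(allowed, reason)
}
```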
The code for mutate is very similar, but instead of merely comparing the labels and putting Allowed in the response, a patch is created that adds the missing labels to the resource with not_available set as its value.
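One way to build such a patch is sketched below. This is an independent illustration rather than the repository's implementation: the `patchOperation` type and `labelPatch` function are hypothetical names, and the sketch assumes JSON Patch (RFC 6902) semantics, where the `/` in a label key has to be escaped as `~1` inside a path.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// patchOperation is a single JSON Patch operation.
type patchOperation struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// escapePointer escapes a key for use as a JSON Pointer segment:
// "~" becomes "~0" and "/" becomes "~1".
func escapePointer(key string) string {
	return strings.NewReplacer("~", "~0", "/", "~1").Replace(key)
}

// labelPatch returns the operations that add every missing required label
// with the value "not_available".
func labelPatch(existing map[string]string, required []string) []patchOperation {
	var missing []string
	for _, key := range required {
		if _, ok := existing[key]; !ok {
			missing = append(missing, key)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	// Without any labels the parent path may not exist, so add the whole map.
	if len(existing) == 0 {
		value := map[string]string{}
		for _, key := range missing {
			value[key] = "not_available"
		}
		return []patchOperation{{Op: "add", Path: "/metadata/labels", Value: value}}
	}
	// Otherwise add each missing key next to the labels that are already set.
	var ops []patchOperation
	for _, key := range missing {
		ops = append(ops, patchOperation{
			Op:    "add",
			Path:  "/metadata/labels/" + escapePointer(key),
			Value: "not_available",
		})
	}
	return ops
}

func main() {
	required := []string{"app.kubernetes.io/name", "app.kubernetes.io/version"}
	ops := labelPatch(map[string]string{"app.kubernetes.io/name": "sleep"}, required)
	out, err := json.Marshal(ops)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The marshaled operations are what would be placed in the patch field of the admission response.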
Building the project
It is not necessary to build the project to complete the following steps, because we already have a Docker image built and available that can be used. If you're comfortable with the codebase and you'd like to modify something, you can build the project, create the Docker image, and push it to Docker Hub. The build script does that for you. Make sure you have go, dep and docker installed, that you are logged into a Docker registry, and that DOCKER_USER is set, then run:
./build
Deploying the webhook server to the cluster
To deploy the server, we'll need to create a service and a deployment in our Kubernetes cluster. It's pretty straightforward, except for one thing, which is the server's TLS configuration. If you'd care to examine the deployment.yaml file, you'll find that the certificate and corresponding private key files are read from command line arguments, and that the path to these files comes from a volume mount that points to a Kubernetes secret:
args:
  - -tlsCertFile=/etc/webhook/certs/cert.pem
  - -tlsKeyFile=/etc/webhook/certs/key.pem
[...]
volumeMounts:
  - name: webhook-certs
    mountPath: /etc/webhook/certs
    readOnly: true
volumes:
  - name: webhook-certs
    secret:
      secretName: spot-mutator-webhook-certs
In a production cluster it's important to properly handle your TLS certificates and especially private keys, so you may want to use something like cert-manager, or store your keys in Vault, instead of as plain Kubernetes secrets.
We can use any kind of certificate here. The most important thing to remember is to set the corresponding CA certificate later in the webhook configuration, so the apiserver will know that it should be accepted. For now, we'll reuse the script originally written by the Istio team to generate a certificate signing request. Then we'll send the request to the Kubernetes API, fetch the certificate, and create the required secret from the result.
First, run this script and check if the secret holding the certificate and key has been created:
$ ./deployment/webhook-create-signed-cert.sh
creating certs in tmpdir /var/folders/3z/_d8d8kl951ggyvw360dkd_y80000gn/T/tmp.xPApwE5H
Generating RSA private key, 2048 bit long modulus
..............................................+++
...........+++
e is 65537 (0x10001)
certificatesigningrequest.certificates.k8s.io "admission-webhook-example-svc.default" created
NAME                                    AGE   REQUESTOR               CONDITION
admission-webhook-example-svc.default   1s    ekscluster-marton-423   Pending
certificatesigningrequest.certificates.k8s.io "admission-webhook-example-svc.default" approved
secret "admission-webhook-example-certs" created
$ kubectl get secret admission-webhook-example-certs
NAME                              TYPE     DATA   AGE
admission-webhook-example-certs   Opaque   2      2m
Once the secret is created, we can create the deployment and the service. These are standard Kubernetes deployment and service resources. Up until this point we've produced nothing but an HTTP server that's accepting requests through a service on port 443:
$ kubectl create -f deployment/deployment.yaml
deployment.apps "admission-webhook-example-deployment" created
$ kubectl create -f deployment/service.yaml
service "admission-webhook-example-svc" created
Configuring the webhook
Now that our webhook server is running, it can accept requests from the apiserver. However, we should create some configuration resources in Kubernetes first. Let's start with our validating webhook, then we'll configure the mutating webhook later. If you take a look at the webhook configuration, you'll notice that it contains a placeholder for CA_BUNDLE:
clientConfig:
  service:
    name: admission-webhook-example-webhook-svc
    path: "/validate"
  caBundle: ${CA_BUNDLE}
As mentioned earlier, the CA certificate should be provided to the admission webhook configuration, so the apiserver can trust the TLS certificate of the webhook server. Because we've signed our certificates with the Kubernetes API, we can use the CA cert from our kubeconfig to simplify things. There is a small script that substitutes the CA_BUNDLE placeholder in the configuration with this CA. Run this command before creating the validating webhook configuration:
cat ./deployment/validatingwebhook.yaml | ./deployment/webhook-patch-ca-bundle.sh > ./deployment/validatingwebhook-ca-bundle.yaml
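The script itself is a shell script and is not reproduced here, but the substitution it performs can be pictured with a small Go sketch. The function name `patchCABundle` is hypothetical, and the sketch assumes the CA is available as raw PEM bytes; a value that is already base64-encoded, such as the CA data stored in a kubeconfig, would be inserted without encoding it again.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// patchCABundle replaces the ${CA_BUNDLE} placeholder in a manifest with the
// base64-encoded CA certificate, which is the form the caBundle field expects.
func patchCABundle(manifest string, caPEM []byte) string {
	encoded := base64.StdEncoding.EncodeToString(caPEM)
	return strings.ReplaceAll(manifest, "${CA_BUNDLE}", encoded)
}

func main() {
	manifest := "caBundle: ${CA_BUNDLE}"
	// Placeholder bytes for illustration only; this is not a real certificate.
	ca := []byte("-----BEGIN CERTIFICATE-----\n(placeholder)\n-----END CERTIFICATE-----\n")
	fmt.Println(patchCABundle(manifest, ca))
}
```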
Then take a look at validatingwebhook-ca-bundle.yaml. If the script ran properly, the CA_BUNDLE should be populated like so:
$ cat deployment/validatingwebhook-ca-bundle.yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: validation-webhook-example-cfg
  labels:
    app: admission-webhook-example
webhooks:
  - name: required-labels.banzaicloud.com
    clientConfig:
      service:
        name: admission-webhook-example-webhook-svc
        namespace: default
        path: "/validate"
      caBundle: LS0...Qo=
    rules:
      - operations: [ "CREATE" ]
        apiGroups: ["apps", ""]
        apiVersions: ["v1"]
        resources: ["deployments","services"]
    namespaceSelector:
      matchLabels:
        admission-webhook-example: enabled
The webhook's clientConfig is pointing to our previously deployed service, with the path /validate. Remember, we've created two different paths in our HTTP server for validation and mutation.
The second section contains the rules: the operations and resources that the webhook will validate. We'd like to intercept API requests when a deployment or a service is CREATED, so apiGroups and apiVersions are filled out accordingly (apps/v1 for deployments, v1 for services). We can use wildcards (*) for these fields as well.
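As a mental model of how such a rule selects requests, consider the Go sketch below. It is a deliberately simplified illustration and not the apiserver's actual matching code: the `rule` type and `matches` method are hypothetical, and details such as subresources are ignored.

```go
package main

import "fmt"

// rule mirrors the four lists of a webhook rule from the configuration.
type rule struct {
	Operations, APIGroups, APIVersions, Resources []string
}

// contains reports whether the list holds the value or the "*" wildcard.
func contains(list []string, value string) bool {
	for _, item := range list {
		if item == "*" || item == value {
			return true
		}
	}
	return false
}

// matches reports whether a request falls under the rule. Each list is
// checked independently, so a rule covers every combination of its entries.
func (r rule) matches(operation, group, version, resource string) bool {
	return contains(r.Operations, operation) &&
		contains(r.APIGroups, group) &&
		contains(r.APIVersions, version) &&
		contains(r.Resources, resource)
}

// exampleRule corresponds to the rule in the configuration shown above.
var exampleRule = rule{
	Operations:  []string{"CREATE"},
	APIGroups:   []string{"apps", ""},
	APIVersions: []string{"v1"},
	Resources:   []string{"deployments", "services"},
}

func main() {
	fmt.Println(exampleRule.matches("CREATE", "apps", "v1", "deployments"))
	fmt.Println(exampleRule.matches("CREATE", "", "v1", "services"))
	fmt.Println(exampleRule.matches("UPDATE", "apps", "v1", "deployments"))
}
```

With this rule, updates to existing deployments are never sent to the webhook, because only the CREATE operation is listed.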
The last part of the webhook contains the namespaceSelector. We can define a selector for specific namespaces where our webhook will work. It's not a required property, but we'll try it out now. Our webhook will only work in namespaces where the admission-webhook-example: enabled label is set. You can check out the complete layout of this resource configuration in the Kubernetes reference docs.
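The effect of a matchLabels selector can be sketched in a few lines of Go. This only models the matchLabels part used in our configuration; real label selectors also support match expressions, which are left out here, and the function name `selectorMatches` is hypothetical.

```go
package main

import "fmt"

// selectorMatches reports whether every key/value pair in matchLabels is
// present on the namespace. An empty selector matches every namespace.
func selectorMatches(matchLabels, namespaceLabels map[string]string) bool {
	for key, want := range matchLabels {
		if got, ok := namespaceLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"admission-webhook-example": "enabled"}
	fmt.Println(selectorMatches(selector, map[string]string{"admission-webhook-example": "enabled"}))
	fmt.Println(selectorMatches(selector, nil))
}
```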
So let's label the default namespace first:
$ kubectl label namespace default admission-webhook-example=enabled
namespace "default" labeled
$ kubectl get namespace default -o yaml
apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: 2018-09-24T07:50:11Z
  labels:
    admission-webhook-example: enabled
  name: default
...
Finally, create the configuration for the validating webhook. This will dynamically add the webhook to the chain, so, as soon as the resource is created, requests will be intercepted and our webhook will be called:
$ kubectl create -f deployment/validatingwebhook-ca-bundle.yaml
validatingwebhookconfiguration.admissionregistration.k8s.io "validation-webhook-example-cfg" created
Try it out
Now the exciting part: let's create a deployment and see if our validation works. We'll take a dummy deployment that contains a container that only sleeps. The command should fail and produce an error like this:
$ kubectl create -f deployment/sleep.yaml
Error from server (required labels are not set): error when creating "deployment/sleep.yaml": admission webhook "required-labels.banzaicloud.com" denied the request: required labels are not set
Okay, let's see if we can make it work. There is another dummy deployment in the repo that contains these labels on the deployment's metadata:
$ kubectl create -f deployment/sleep-with-labels.yaml
deployment.apps "sleep" created
It's now working, but let's try one more thing. Delete the deployment and create the last one, where the required labels are not present, but the admission-webhook-example.banzaicloud.com/validate annotation is set to false. It should work as well.
$ kubectl delete deployment sleep
$ kubectl create -f deployment/sleep-no-validation.yaml
deployment.apps "sleep" created
Trying out the mutating webhook
To try out the mutating webhook, first delete the validating webhook's configuration so it won't interfere, then deploy the new configuration. The mutating webhook configuration is basically the same as the validating one, but the webhook service path is set to /mutate, so the apiserver will send requests to the other path of our HTTP server. It contains a CA_BUNDLE placeholder as well, so we need to populate that first.
$ kubectl delete validatingwebhookconfiguration validation-webhook-example-cfg
validatingwebhookconfiguration.admissionregistration.k8s.io "validation-webhook-example-cfg" deleted
$ cat ./deployment/mutatingwebhook.yaml | ./deployment/webhook-patch-ca-bundle.sh > ./deployment/mutatingwebhook-ca-bundle.yaml
$ kubectl create -f deployment/mutatingwebhook-ca-bundle.yaml
mutatingwebhookconfiguration.admissionregistration.k8s.io "mutating-webhook-example-cfg" created
Now we can deploy our sleep application again, and see if the labels were properly added:
$ kubectl create -f deployment/sleep.yaml
deployment.apps "sleep" created
$ kubectl get deploy sleep -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    admission-webhook-example.banzaicloud.com/status: mutated
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: 2018-09-24T11:35:50Z
  generation: 1
  labels:
    app.kubernetes.io/component: not_available
    app.kubernetes.io/instance: not_available
    app.kubernetes.io/managed-by: not_available
    app.kubernetes.io/name: not_available
    app.kubernetes.io/part-of: not_available
    app.kubernetes.io/version: not_available
...
For our last example, recreate the validating webhook so both of them are available. Now, try to create sleep again. It should succeed because, as it's put in the documentation:
The admission control process proceeds in two phases. In the first phase, mutating admission controllers are run. In the second phase, validating admission controllers are run.
So the mutating webhook adds the missing labels in the first phase, then the validating webhook won't reject the deployment in the second phase, because the labels are already present, with not_available set as their value.
$ kubectl create -f deployment/validatingwebhook-ca-bundle.yaml
validatingwebhookconfiguration.admissionregistration.k8s.io "validation-webhook-example-cfg" created
$ kubectl create -f deployment/sleep.yaml
deployment.apps "sleep" created