How we manage clusters by extending the Kubernetes API
At BESTSELLER we run multiple Kubernetes clusters in multiple clouds, which gives us some assurance that even if one provider or one region is degraded, we are still able to serve our customers. But multiple clusters, multiple clouds and multiple teams can be difficult to grasp as an engineer. That is why we decided to use the extensibility of the Kubernetes API to create a cluster-registry. In this post, we will cover how to combine Custom Resource Definitions (CRDs) with admission controllers to gain control of your custom Kubernetes resources.
What are CRDs and Admission Controllers?
To understand CRDs, we need to understand the basic concept of resources in Kubernetes.
- A resource is an API endpoint where you can store API objects of a certain kind.
- A custom resource allows you to define your own API objects, and thus create your own Kubernetes kind, just like Deployments or StatefulSets.
In short, the Custom Resource Definition is where you define your Custom Resource that extends Kubernetes' default capabilities.
While the CRDs extend the Kubernetes functionality, Admission controllers govern and enforce how the cluster is used. They can be thought of as a gatekeeper that intercepts (authenticated) API requests and may change the request object or deny the request altogether.
There are two types of admission controllers: validating and mutating. Mutating admission webhooks are invoked first and can modify objects sent to the API server, for example to enforce custom defaults. After all object modifications are complete, validating admission webhooks are invoked, which run logic to validate the incoming resource. If the validating webhook rejects the request, the Kubernetes API returns a failed HTTP response to the user.
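To make the contrast concrete, here is a minimal, hypothetical sketch (in Go, using the same admission types we will use later) of the response a validating webhook would return to reject a request; the helper name and reason string are made up for illustration:

```go
package main

import (
	"k8s.io/api/admission/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// deny builds the AdmissionReview a validating webhook returns to reject a request.
// Kubernetes requires the response UID to echo the UID of the incoming request.
func deny(req *v1beta1.AdmissionRequest, reason string) v1beta1.AdmissionReview {
	return v1beta1.AdmissionReview{
		Response: &v1beta1.AdmissionResponse{
			UID:     req.UID,
			Allowed: false, // a mutating webhook would instead answer Allowed: true plus a patch
			Result:  &metav1.Status{Message: reason},
		},
	}
}
```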
Creating our cluster specification
Let's start with the easiest part: creating our custom resource definition, in this case our cluster specification template.
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusters.extreme.bestseller.com
spec:
  group: extreme.bestseller.com
  scope: Namespaced
  names:
    kind: Cluster
    plural: clusters
    singular: cluster
  # list of versions supported by our CustomResourceDefinition
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required: ["spec"]
          properties:
            LastRun:
              type: string
            status:
              enum: ["", "Active", "Deploying", "Rerun", "Upgrading", "Delete", "Deleting", "Deleted"]
              type: string
            # our custom fields in the resources
            spec:
              type: object
              required: ["NodeCount", "Cloud"]
              properties:
                NodeCount:
                  type: integer
                ContactPerson:
                  type: string
                Cloud:
                  type: string
      # a list of additional fields to print when doing e.g. a GET operation
      additionalPrinterColumns:
        - jsonPath: .spec.ContactPerson
          description: Contact Person
          name: ContactPerson
          type: string
```
For more in-depth details on CRDs, see the official Kubernetes documentation: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
From the simplified example above, we have defined a new API group, extreme.bestseller.com, in which our CRD clusters.extreme.bestseller.com is stored. We have defined three fields in our cluster spec, NodeCount, Cloud and ContactPerson, where the first two are required.
Implementing the actual CRD is as easy as:
```sh
kubectl apply -f ourcrd.yaml
```
Time to create our first cluster object! More YAML coming up.
```yaml
apiVersion: "extreme.bestseller.com/v1alpha1"
kind: Cluster
metadata:
  name: destinationaarhus-techblog
spec:
  ContactPerson: Peter Brøndum
  Cloud: GCP
  NodeCount: 3
```
Apply it to our cluster:
```sh
kubectl apply -f firstcluster.yaml
```
Now we can get our clusters with kubectl, just as any other Kubernetes kind:
```
> kubectl get clusters
NAME                         CONTACTPERSON
destinationaarhus-techblog   Peter Brøndum
```
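Because the CRD makes clusters a first-class API resource, any Kubernetes client can read them, not only kubectl. Below is a minimal sketch using client-go's dynamic client to list them; the kubeconfig location and the default namespace are assumptions for the example, not part of our actual setup:

```go
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// build a client from the local kubeconfig (assumed to be in the default location)
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := dynamic.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	// the group/version/resource we defined in the CRD above
	gvr := schema.GroupVersionResource{
		Group:    "extreme.bestseller.com",
		Version:  "v1alpha1",
		Resource: "clusters",
	}
	list, err := client.Resource(gvr).Namespace("default").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, c := range list.Items {
		contact, _, _ := unstructured.NestedString(c.Object, "spec", "ContactPerson")
		fmt.Printf("%s (contact: %s)\n", c.GetName(), contact)
	}
}
```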
With our cluster spec and storage in place, it is time for the fun part.
The Admission Controller
The admission controller, in this case a mutating webhook, consists of two elements.
- A MutatingWebhookConfiguration, which defines which resources are subject to mutation and which mutating service to call.
- An admission webhook server, which does the mutation.
First up is the MutatingWebhookConfiguration. We can divide this into two blocks. The first is clientConfig, where we configure which admission webhook service to call (this can be an external service as well). Next is the rules, where we specify that mutation can only happen on CREATE and UPDATE requests to the Kubernetes API, and only on our Cluster resources.
```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: cluster-mutate
webhooks:
  - admissionReviewVersions:
      - v1beta1
    clientConfig:
      # As webhooks can only be called over HTTPS, this should be the actual caBundle:
      # a PEM-encoded CA bundle used to validate the webhook's server certificate
      caBundle: "Ci0tLS0tQk...tLS0K"
      # the internal k8s service to call
      service:
        name: cluster-mutate
        namespace: default
        path: "/mutate"
    failurePolicy: Fail
    name: cluster-mutate.default.svc
    rules:
      - apiGroups:
          - "extreme.bestseller.com"
        apiVersions:
          - v1alpha1
        operations:
          - "CREATE"
          - "UPDATE"
        resources:
          - "clusters"
    sideEffects: None
    timeoutSeconds: 30
```
Kubernetes will only accept a TLS-encrypted endpoint. I will not cover certificate creation in this article, but we are in luck: other people have made simple scripts that can help us, e.g. giantswarm.
The webhook
With this in place, we need to create the actual mutation logic.
Before we deep-dive into the code: I have chosen to write this in Go, as it has a native client for Kubernetes. That being said, you could do this in the language of your choosing. The only requirement is a web server that serves a TLS endpoint and accepts and responds with JSON.
This will be a simplified example, and I have tried to squeeze everything into one file. In essence, what we are aiming for is to:
- Receive a JSON request, in Kubernetes terms an AdmissionReview.
- Do our mutation logic.
- Return a JSON response, again in the format of an AdmissionReview, which tells Kubernetes what to mutate.
Yes, you read that correctly: it is actually Kubernetes that performs the mutation.
To the code!
```go
package main

import (
	"encoding/json"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"time"

	"github.com/gorilla/handlers"
	"github.com/gorilla/mux"
	"k8s.io/api/admission/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ClusterSpec mirrors the CRD spec
type ClusterSpec struct {
	metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

	APIVersion string    `json:"apiVersion"`
	Kind       string    `json:"kind"`
	Status     string    `json:"status"`
	LastRun    time.Time `json:"LastRun,omitempty"`
	Spec       struct {
		ContactPerson string `json:"ContactPerson"`
		NodeCount     int64  `json:"NodeCount"`
		Cloud         string `json:"Cloud"`
	} `json:"spec"`
}

func mutate(w http.ResponseWriter, r *http.Request) {
	// extract the body (in production you would also want to check that it is not empty)
	body, err := ioutil.ReadAll(r.Body)
	if err != nil {
		log.Printf("could not read body: %v", err)
		http.Error(w, fmt.Sprintf("could not read body: %v", err), http.StatusInternalServerError)
		return
	}
	defer r.Body.Close()

	// unmarshal body into an AdmissionReview struct
	arRequest := v1beta1.AdmissionReview{}
	if err := json.Unmarshal(body, &arRequest); err != nil {
		log.Printf("incorrect body: %v", err)
		http.Error(w, fmt.Sprintf("incorrect body: %v", err), http.StatusInternalServerError)
		return
	}

	// unmarshal the embedded object into our cluster spec
	cluster := ClusterSpec{}
	if err := json.Unmarshal(arRequest.Request.Object.Raw, &cluster); err != nil {
		log.Printf("error deserializing cluster: %v", err)
		http.Error(w, fmt.Sprintf("error deserializing cluster: %v", err), http.StatusInternalServerError)
		return
	}

	// Let's mutate! If no contact is defined, I will be the contact
	// (which, in real life, I would quickly regret).
	if cluster.Spec.ContactPerson == "" {
		cluster.Spec.ContactPerson = "Peter Brøndum"
		log.Println("No contact, Mutate me!")
	}

	// response options; note that the response UID must echo the request UID
	pT := v1beta1.PatchTypeJSONPatch
	arResponse := v1beta1.AdmissionReview{
		Response: &v1beta1.AdmissionResponse{
			Allowed:   true,
			UID:       arRequest.Request.UID,
			PatchType: &pT,
			Result: &metav1.Status{
				Message: "success",
			},
		},
	}

	// okay, so this is in truth the actual mutation: as you can see, it is Kubernetes
	// that does the mutation, we just tell it what it should do for us!
	// this is why we use JSONPatch as well.
	p := []map[string]string{}
	p = append(p, map[string]string{
		"op":    "replace",
		"path":  "/spec/ContactPerson",
		"value": cluster.Spec.ContactPerson,
	})

	arResponse.Response.Patch, _ = json.Marshal(p)

	responseBody, err := json.Marshal(arResponse)
	if err != nil {
		log.Printf("can't encode response: %v", err)
		http.Error(w, fmt.Sprintf("can't encode response: %v", err), http.StatusInternalServerError)
		return
	}

	w.WriteHeader(http.StatusOK)
	w.Write(responseBody)
}

func main() {
	fmt.Println("Cluster Contact Mutator has started")

	// define http endpoints and start a TLS server
	router := mux.NewRouter()
	router.HandleFunc("/mutate", mutate)

	loggedRouter := handlers.LoggingHandler(os.Stdout, router)
	log.Fatal(http.ListenAndServeTLS(":443", "./certs/crt.pem", "./certs/key.pem", loggedRouter))
}
```
In short, the example creates a single HTTP endpoint. When called, it will unmarshal the body into our cluster specification (along with default Kubernetes stuff) and check if a Contact Person is present. If not, I, Peter Brøndum, will be set as a contact. Then it will marshal it back to JSON and send the response to Kubernetes. This response is used by Kubernetes to do the actual mutation.
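To sanity-check the handler without a live cluster, we can drive it with Go's httptest package. This is a hypothetical mutate_test.go sitting next to the code above, not part of our production setup:

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"

	"k8s.io/api/admission/v1beta1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
)

func TestMutateSetsDefaultContact(t *testing.T) {
	// a Cluster without a ContactPerson, wrapped in an AdmissionReview request
	raw := []byte(`{"apiVersion":"extreme.bestseller.com/v1alpha1","kind":"Cluster",` +
		`"metadata":{"name":"test"},"spec":{"Cloud":"GCP","NodeCount":3}}`)
	review := v1beta1.AdmissionReview{
		Request: &v1beta1.AdmissionRequest{
			UID:    types.UID("test-uid"),
			Object: runtime.RawExtension{Raw: raw},
		},
	}
	body, _ := json.Marshal(review)

	req := httptest.NewRequest(http.MethodPost, "/mutate", bytes.NewReader(body))
	rec := httptest.NewRecorder()
	mutate(rec, req)

	if rec.Code != http.StatusOK {
		t.Fatalf("expected 200, got %d", rec.Code)
	}

	// the response should carry a JSONPatch that sets the default contact
	var resp v1beta1.AdmissionReview
	if err := json.Unmarshal(rec.Body.Bytes(), &resp); err != nil {
		t.Fatalf("cannot decode response: %v", err)
	}
	var patch []map[string]string
	if err := json.Unmarshal(resp.Response.Patch, &patch); err != nil {
		t.Fatalf("cannot decode patch: %v", err)
	}
	if patch[0]["value"] != "Peter Brøndum" {
		t.Errorf("expected default contact, got %q", patch[0]["value"])
	}
}
```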
Let's see it in action
I have deployed the webhook and the MutatingWebhookConfiguration. Let's prepare a new cluster spec. Notice that we do not add a contact to this cluster!
```yaml
apiVersion: "extreme.bestseller.com/v1alpha1"
kind: Cluster
metadata:
  name: destinationaarhus-techblog02
spec:
  Cloud: GCP
  NodeCount: 3
```
When we apply this spec, we don't see any difference, as long as the webhook responds with a status 200.
```
> kubectl apply -f manifests/cluster02.
cluster.extreme.bestseller.com/destinationaarhus-techblog02 created
```
But when we list the clusters, I am the contact.
```
> kubectl get clusters
NAME                           CONTACTPERSON
destinationaarhus-techblog     Peter Brøndum
destinationaarhus-techblog02   Peter Brøndum
```
It worked! (surprise) But let's check the logs of our webhook.
```
2020/10/28 12:54:31 No contact, Mutate me!
10.244.0.1 - - [28/Oct/2020:12:54:31 +0000] "POST /mutate?timeout=30s HTTP/1.1" 200 214
```
In the above, we see that our webhook was reached when we applied the cluster and that no Contact Person was set. From there, it responded with an AdmissionReview telling Kubernetes to mutate.
Final Words
As you can see from the above examples, it is quite easy to extend Kubernetes' functionality by creating your own custom resources, and even giving those resources custom logic and behaviour is doable. Nor is this limited to custom resources; the same mechanisms can be used to influence other key components in the cluster.
To be fair, there is quite a lot from our setup at BESTSELLER that I did not cover. But the basics of how we keep track of our clusters are there. Instead of assigning me as a contact on each and every cluster, which would be a pain, we call our CI/CD pipeline and mutate the status, amongst other things, on the cluster resources in Kubernetes. This way, when a cluster is changed, our CI/CD will run a bunch of jobs to set up and configure the specific cluster. When finished, the pipeline updates our custom cluster resource in Kubernetes once again and mutates the status.
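To illustrate that last step, here is a minimal sketch of how a pipeline job could flip the status field through the API, again with client-go's dynamic client. The in-cluster config, namespace, cluster name and RBAC permissions are assumptions for the example, not our actual pipeline code:

```go
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

func main() {
	// assumes the job runs in-cluster with permissions to patch cluster resources
	config, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := dynamic.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	gvr := schema.GroupVersionResource{
		Group:    "extreme.bestseller.com",
		Version:  "v1alpha1",
		Resource: "clusters",
	}
	// status is a plain field in our schema (not a subresource), so a JSON patch on
	// the main resource works; "add" sets the field whether or not it already exists
	patch := []byte(`[{"op": "add", "path": "/status", "value": "Active"}]`)
	if _, err := client.Resource(gvr).Namespace("default").Patch(
		context.TODO(), "destinationaarhus-techblog02", types.JSONPatchType, patch, metav1.PatchOptions{},
	); err != nil {
		log.Fatal(err)
	}
}
```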
About the author
Peter Brøndum
My name is Peter Brøndum. I work as a Tech Lead and Scrum Master in a platform engineering team at BESTSELLER. Our main priority is building a development highway, with paved roads, lights and signs, so our colleagues can deliver value even faster. Besides working at BESTSELLER, I am, amongst other things, automating my own home, and yes, that is of course running on Kubernetes as well.