Introduction
I’ve been administering Kubernetes clusters for a while, but I never got the chance to actually build one from scratch until recently.
There are multiple Kubernetes distributions out there and multiple ways to configure the components that make up a Kubernetes cluster. If you don’t want any hassle you could opt for minikube or k3s, but that would be a disservice to your future self: once problems appear later on, you’ll need to go down to the nitty-gritty anyway. The cluster we’ll be building today has the following components and agents:
- A series of Debian 12 machines with swap disabled
- Kubernetes v1.32.x: the kubectl CLI plus the components run as systemd services, all prefixed with “kube”: kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy and the kubelet
- containerd v2.1.x: an OCI-compliant, industry-standard container runtime managed by systemd
- CNI plugins v1.6.x: pod networking built from nftables, a Linux bridge and a cgroupDriver. cgroups are a Linux kernel feature that limits and accounts for the resources (memory, CPU, etc.) a group of processes may use
- etcd v3.6.x: Kubernetes components are themselves stateless and store the cluster state in an etcd instance run as a systemd service
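The cgroup driver mentioned above has to agree on both sides of the runtime boundary. A minimal sketch of the kubelet side, assuming the config lives at /var/lib/kubelet/kubelet-config.yaml (the path is illustrative):

```yaml
# /var/lib/kubelet/kubelet-config.yaml (illustrative path)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# must match the cgroup driver containerd is configured with
cgroupDriver: systemd
```

containerd needs the matching setting on its side (`SystemdCgroup = true` in the runc runtime options of /etc/containerd/config.toml; the exact TOML section name varies by containerd version). If the two disagree, pods fail to start with cgroup errors.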
Installation
Go through the bash scripts and inspect the commands in my GitHub repository:
- 1-download dependencies.sh
- 2-connectivity.sh
- 3-cacerts.sh
- 4-kubeconfig.sh
- 5-encryption.sh
- 6-etcd.sh
- 7-bootstrap control plane.sh
- 8-bootstrap workers.sh
- 9-kubectl on jumphost.sh
- 10-networking.sh
Some important concepts
Pods are ephemeral
A pod is an abstraction layer on top of the container runtime so that it becomes “vendor agnostic”. In practice you don’t deploy pods directly in k8s: you define a Deployment, which spawns a ReplicaSet, and the ReplicaSet starts the pods.
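A minimal Deployment manifest sketching that chain (the name, labels and image are just examples):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-depl          # example name
spec:
  replicas: 2               # the ReplicaSet keeps 2 pods alive
  selector:
    matchLabels:
      app: nginx
  template:                 # pod template the ReplicaSet stamps out
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.27
```

If a pod dies, the ReplicaSet notices the deficit and starts a replacement; you never manage the pods one by one.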
The service abstraction
An svc (or Service for us humans) is used to avoid having static IP addresses as a means of addressing pods. So when a database talks to a web server, it should use the hostname resolution provided by the svc. Since pods are ephemeral, their IP addresses get reassigned when they die, so the solution is to address them by hostname and let the svc manage the Layer 3 IP address.
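A sketch of a Service that gives a stable name to the pods of a Deployment (names and ports are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                 # pods resolve this as "web" (or web.<namespace>.svc)
spec:
  selector:
    app: nginx              # traffic goes to pods carrying this label
  ports:
  - port: 80                # stable port on the Service
    targetPort: 80          # container port on the pods
```

The selector is what decouples the Service from pod churn: whichever pods currently match the label receive the traffic.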
Configuration
A ConfigMap allows you to abstract certain config variables away from the application so that you don’t need to rebuild the image when the configuration changes, e.g. a database URL. However, if you want to map sensitive data, it is better to use Secrets, which utilize the encryption key generated earlier.
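A sketch of the two side by side (names and values are made up; the Secret is what gets encrypted at rest with the key from 5-encryption.sh):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DATABASE_URL: postgres://db:5432/app    # non-sensitive, plain text
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:
  DATABASE_PASSWORD: changeme             # sensitive, encrypted at rest in etcd
```

Pods consume either one as environment variables or mounted files, so swapping a value is an update plus a pod restart, not an image rebuild.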
Persistence
A volume attaches storage to your pod so that you can persist data beyond the pod’s lifetime.
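A sketch of a pod mounting a PersistentVolumeClaim (resource names, image and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi            # example size
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
  - name: postgres
    image: postgres:17
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data   # data survives pod restarts
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc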
Routing
Routing is achieved with an Ingress, with a port forward, or with a NodePort. In a production environment you’ll generally use an Ingress to route traffic from the internet into your cluster.
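A sketch of an Ingress routing a hostname to a Service named web (hostname and names are hypothetical; this also requires an ingress controller running in the cluster, which is not part of the scripts above):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
  - host: example.com          # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web          # Service receiving the traffic
            port:
              number: 80
```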
Workers and Masters
Nodes are the workers of the cluster. Every node needs kube-proxy, a kubelet and a container runtime installed in order to operate. Control planes, or masters, on the other hand have the following components:
- API server: validates requests (a new deployment, for example) and acts as the API gateway for your cluster.
- Scheduler: in the case of a new deployment, it assigns pods to nodes depending on the application requirements and forwards the request to the kubelet, which will actually do the job
- Controller manager: detects state changes like crashes and heals the cluster back to the desired state
- etcd: key-value store that holds your cluster’s brain. As explained earlier, Kubernetes components are stateless, so the cluster’s state is stored in an etcd instance run as a systemd service
Useful commands to start and debug your pods
```shell
root@jumpbox:$ kubectl create deployment nginx-depl --image=nginx
root@jumpbox:$ kubectl edit deployment nginx-depl
root@jumpbox:$ kubectl get pod
root@jumpbox:$ kubectl exec -it {pod-name} -- /bin/bash
root@jumpbox:$ kubectl logs {pod-name}
# if no logs appear:
root@jumpbox:$ kubectl describe pod {pod-name}
# you can also create a deployment with a manifest
root@jumpbox:$ nano nginx-deployment.yaml
root@jumpbox:$ kubectl apply -f nginx-deployment.yaml
root@jumpbox:$ kubectl get pod
root@jumpbox:$ kubectl get deployment
```