
k8s part 1: a primer on kubernetes

Introduction

I’ve been administering Kubernetes clusters, but I never got the chance to actually build one from scratch until recently.

There are multiple Kubernetes distributions out there and multiple ways to configure the components that make up a Kubernetes cluster. If you don’t want any hassle you could opt for minikube or k3s, but that would be a disservice to your future self: once problems appear later on you’ll need to go down to the nitty-gritty anyway. The cluster we’ll be building today has the following components and agents:

  • A series of Debian 12 machines with swap disabled
  • Kubernetes v1.32.x: systemd services prefixed with “kube”, i.e. kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy and kubelet, plus the kubectl CLI
  • containerd v2.1.x: an OCI-compliant, industry-standard container runtime managed by systemd
  • CNI plugins v1.6.x: a combination of nftables, a Linux bridge and a cgroup driver. cgroups are a Linux kernel feature that limits and accounts for the resources (CPU, memory, I/O) a group of processes can use
  • etcd v3.6.x: Kubernetes components are stateless and store the cluster state in etcd, run here as a systemd service
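As an illustration of the cgroup driver mentioned above, the kubelet can be pointed at systemd’s cgroup driver in its configuration file. A minimal sketch (the file path is an assumption, adjust it to your layout):

```yaml
# e.g. /var/lib/kubelet/kubelet-config.yaml (path is an assumption)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd   # must match the cgroup driver containerd is configured with
failSwapOn: true        # refuse to start if swap is still enabled on the host
```

containerd has a matching `SystemdCgroup = true` knob in its TOML config; the two must agree or pods will fail to start.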

Installation

Go through the bash scripts and inspect the commands in my GitHub repository.

Some important concepts

Pods are ephemeral

A pod is an abstraction layer on top of the container runtime so that workloads become “vendor agnostic”. In practice you don’t deploy pods directly in k8s: you define a deployment, which spawns a replicaset, and the replicaset starts the pods.
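To make the deployment → replicaset → pod chain concrete, a minimal deployment manifest might look like this (the names, labels, image tag and replica count are all illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-depl
spec:
  replicas: 2                # the replicaset keeps this many pods alive
  selector:
    matchLabels:
      app: nginx             # must match the pod template labels below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.27  # pin a tag rather than relying on :latest
          ports:
            - containerPort: 80
```

Applying it with `kubectl apply -f` creates the deployment, the deployment creates a replicaset, and the replicaset creates the pods; deleting a pod by hand simply makes the replicaset spawn a replacement.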

The service abstraction

An svc (or service for us humans) is used to avoid having static IP addresses as a means of addressing pods. When a database talks to a web server it should use the hostname resolution provided by the svc. Since pods are ephemeral, their IP addresses would have to be reassigned every time they die, so the solution is to use hostname addressing and let the svc manage the Layer 3 IP address.
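As a sketch, a service fronting a set of web server pods might look like this, assuming the pods carry an `app: nginx` label (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  selector:
    app: nginx        # the svc load-balances over all pods with this label
  ports:
    - port: 80        # the port the svc listens on
      targetPort: 80  # the containerPort traffic is forwarded to
```

Other workloads in the same namespace can then reach the pods at the stable DNS name `nginx-svc` (or `nginx-svc.default.svc.cluster.local` from elsewhere), while the svc tracks whichever pod IPs currently back it.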

Configuration

A configmap allows you to abstract certain config variables away from the application, e.g. a database URL, so that you don’t need to rebuild the image when the configuration changes. However, if you want to map sensitive data it is better to use secrets, which utilize the encryption key generated earlier.
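A hedged sketch of a configmap and a secret for the database URL example (all names and values here are made up):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  database_url: "postgres.default.svc.cluster.local"  # non-sensitive config
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:                      # plain text here; the API server stores it base64-encoded
  database_password: "changeme"
```

A pod references them via `envFrom` or `valueFrom` in its container spec. Note that without encryption at rest configured, secrets in etcd are only base64-encoded, not encrypted.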

Persistence

A volume attaches physical storage to your pod so that you can persist data beyond the pod’s lifecycle.
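A minimal sketch of a persistent volume claim and a pod mounting it (storage size, image and paths are illustrative, and a default StorageClass is assumed to exist):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce       # mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: db
spec:
  containers:
    - name: postgres
      image: postgres:16
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data  # data here survives pod restarts
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc
```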

Routing

Routing is achieved with an ingress, a port forward, or a NodePort service. In a production environment you’ll generally be using an ingress to route traffic from the internet into your cluster.
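A minimal ingress sketch, assuming an ingress controller (e.g. ingress-nginx) is already running in the cluster and an `nginx-svc` service exists (the host and names are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
spec:
  rules:
    - host: example.com          # external hostname routed into the cluster
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-svc  # the svc that fronts the pods
                port:
                  number: 80
```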

Workers and Masters

Nodes are the workers. Every node needs kube-proxy, the kubelet and a container runtime installed in order to operate. Control planes (or masters) on the other hand run the following components:

  • API server: validates requests (a new deployment for example) and is the API gateway for your cluster.
  • Scheduler: in the case of a new deployment it assigns pods to nodes depending on the application requirements and forwards the request to the kubelet, which will actually do the job
  • Controller manager: detects state changes like crashes and heals the state
  • etcd: the key-value store that holds your cluster’s brain. As explained earlier, Kubernetes components are stateless, so the cluster’s state is stored in the etcd systemd service

Useful commands to start and debug your pods

```shell
root@jumpbox:$ kubectl create deployment nginx-depl --image=nginx
root@jumpbox:$ kubectl edit deployment nginx-depl
root@jumpbox:$ kubectl get pod
root@jumpbox:$ kubectl exec -it {pod-name} -- /bin/bash
root@jumpbox:$ kubectl logs {pod-name}
# if no logs appear:
root@jumpbox:$ kubectl describe pod {pod-name}
# you can also create a deployment with a manifest
root@jumpbox:$ nano nginx-deployment.yaml
root@jumpbox:$ kubectl apply -f nginx-deployment.yaml
root@jumpbox:$ kubectl get pod
root@jumpbox:$ kubectl get deployment
```

Kubernetes checklist

Should the service run on kubernetes?

  1. Is high availability needed? You’ll need to dig into a set of questions in order to understand whether Kubernetes’ scale makes sense for your needs. If the app was turned off for maintenance, would you be able to quantify the lost revenue? Are there any opportunity costs you could attribute to the downtime? Or is it brand related? Having software that is always there when you need it strengthens the brand image, but perhaps no one would notice the downtime if the traffic is very low.

  2. Is the traffic very variable? Kubernetes has built-in auto-scaling features that are highly desirable for websites like Amazon that get a sudden surge in usage during Christmas or Black Friday, for example.

  3. Is the application stateless? A stateless application is an ideal candidate for a pod: stateless workloads are first-class citizens in Kubernetes, and since pods are immutable and ephemeral, an app that keeps no local state can be replaced or scaled freely.

  4. Is your team comfortable with Linux, networking, Docker, k8s and observability? K8s can be very finicky to configure, is highly configurable, has high hardware requirements and has no guard rails, which makes it very hard to administer.

  5. Is the app strategic to the company? As developers we tend to want to build everything in-house, but as any system administrator knows, keeping up with backups, updates and security configuration carries a high maintenance cost. If you’re not making any money out of the software you are producing, it’s easier to treat it as a cost center and pay for a SaaS instead.

  6. Is the data running on it strategic for your company? Microsoft does not guarantee backups on Microsoft 365, for example. That data could also get lost one day if there is a fire at a Microsoft datacenter or something else goes wrong. There is also the security factor: we’ve seen countless stories online of architecture flaws that led to Microsoft vulnerabilities, e.g. https://cybersecuritynews.com/microsoft-teams-guest-chat-vulnerability/

This post is licensed under CC BY 4.0 by the author.