K3s, lightweight home Kubernetes cluster
Setting up a high-availability K3s cluster
This is a guide to provision a K3s cluster in a high availability configuration.
The cluster is quick and easy to set up, although if you’re looking for the easiest way to set up the same type of cluster, I’d recommend checking out Techno Tim’s Ansible playbook for K3s, which will bootstrap a cluster in minutes.
Installation
1. Provision virtual machines running Ubuntu Server 22.04.3
Provision the number of virtual machines (or bare-metal servers) that you’d like to use for your cluster, with the minimum number being 3. I use Proxmox to manage my homelab environment, so I’ll be using virtual machines. I’m going to provision 3 virtual machines, which will operate as my control planes and nodes. I’m going to assume you’re comfortable installing Ubuntu Server, and will not go into detail on how to do so.
When creating a high-availability cluster, whether it’s with K3s or Kubernetes, use an odd number of control planes, so that the cluster datastore can still reach a majority (quorum) when one of them fails.
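The odd-number rule follows from majority voting in the embedded etcd datastore that K3s uses when started with --cluster-init. The sketch below is only the arithmetic behind that rule; the function names are made up for illustration.

```shell
# Quorum arithmetic for an etcd-backed control plane (illustrative helper names).

# Number of servers that must be healthy for the cluster to accept writes.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while quorum is still held.
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 4 5; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Going from 3 to 4 servers raises the quorum from 2 to 3 but still tolerates only one failure, which is why an even count adds cost without adding resilience.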
Generally I make my control-planes’ disk size around 30-50 GB, and my worker nodes’ around 20-30 GB. I also make sure to give the control-planes more RAM and CPU than the worker nodes. But if you are planning to use Longhorn for storage, you may want to give the worker nodes more disk space, as Longhorn will use the worker nodes for storage. If you’d like to know the general node requirements for K3s, you can check the official documentation.
2. Update and prepare each server.
This should be done for each server.
Update the freshly installed machines
sudo apt update; sudo apt upgrade -y;
Reconfigure unattended-upgrades
sudo dpkg-reconfigure --priority=low unattended-upgrades
Verify unattended upgrades are enabled
sudo nano /etc/apt/apt.conf.d/20auto-upgrades
Ensure the file looks like this:
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
Disable automatic reboots by editing the file:
sudo nano /etc/apt/apt.conf.d/50unattended-upgrades
and ensuring this line is not commented out (remove the // at the beginning of the line):
Unattended-Upgrade::Automatic-Reboot "false";
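If you’d rather confirm the setting without opening an editor, a small check like the following could work. It’s a sketch: the function name is made up, and it only looks for an uncommented Automatic-Reboot "false" line in the file edited above.

```shell
# Succeeds if the given file contains an active (uncommented)
# Automatic-Reboot "false" setting.
auto_reboot_disabled() {
  grep -Eq '^[[:space:]]*Unattended-Upgrade::Automatic-Reboot[[:space:]]+"false";' "$1"
}

# Path used in this guide; set CONF to check a different file.
CONF="${CONF:-/etc/apt/apt.conf.d/50unattended-upgrades}"

if [ -r "$CONF" ] && auto_reboot_disabled "$CONF"; then
  echo "automatic reboots are disabled"
else
  echo "no active Automatic-Reboot \"false\" line found in $CONF"
fi
```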
- Ensure the server has a static IP address
For the steps below, replace *.yaml with the file that contains your network configuration. If you’re unsure, you can check the contents of the directory with ls /etc/netplan/. Once you’ve made the changes, you can test them with sudo netplan try. If the changes are successful, you can apply them with sudo netplan apply.
sudo nano /etc/netplan/*.yaml
Ensure the file looks similar to this:
network:
  ethernets:
    ens18:
      addresses:
        - 192.168.56.150/24
      nameservers:
        addresses:
          - 1.1.1.1
          - 1.0.0.1
        search: []
      routes:
        - to: default
          via: 192.168.56.1
  version: 2
If using LVM for your storage, you may want to resize the root partition to use the entire disk. You can do this with the following commands:
sudo lvm
lvscan
Find your root partition, and resize it with:
lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv
Replace /dev/ubuntu-vg/ubuntu-lv with the path to your root partition. Then exit the LVM shell:
exit
Then resize the filesystem, using the same path you passed to lvextend:
sudo resize2fs /dev/ubuntu-vg/ubuntu-lv
Set the hostname (optional)
sudo hostnamectl set-hostname k3s-control-plane-1
Set the timezone
sudo timedatectl set-timezone America/Chicago
Replace America/Chicago with your timezone.
Set up the firewall (optional, but recommended)
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw enable
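With incoming traffic denied by default and only SSH allowed, the nodes will not be able to reach each other’s K3s ports. The K3s documentation lists 6443/tcp (Kubernetes API), 2379-2380/tcp (embedded etcd, between servers only), 8472/udp (flannel VXLAN) and 10250/tcp (kubelet metrics) as the inbound ports for a setup like this one. The sketch below only prints candidate ufw rules for review rather than running them. The function name and peer IPs are placeholders, and your network may need more than this (for example, access to 6443 from your workstation).

```shell
# Print (do not run) ufw rules that let one cluster peer reach this node.
# $1 = peer IP, $2 = "server" if this node is a control plane, otherwise "agent".
print_k3s_ufw_rules() {
  peer="$1"
  role="$2"
  echo "sudo ufw allow from $peer to any port 10250 proto tcp"   # kubelet metrics
  echo "sudo ufw allow from $peer to any port 8472 proto udp"    # flannel VXLAN
  if [ "$role" = "server" ]; then
    echo "sudo ufw allow from $peer to any port 6443 proto tcp"       # Kubernetes API
    echo "sudo ufw allow from $peer to any port 2379:2380 proto tcp"  # embedded etcd
  fi
}

# Placeholder peer addresses; replace with the IPs of your other nodes.
for peer in 192.168.56.151 192.168.56.152; do
  print_k3s_ufw_rules "$peer" server
done
```

Review the printed lines and run only the ones that apply to the node you’re on. The etcd rule is only needed for peers that are themselves control plane servers.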
Set up fail2ban (optional, but recommended)
sudo apt install fail2ban -y
sudo cp /etc/fail2ban/fail2ban.{conf,local}
sudo cp /etc/fail2ban/jail.{conf,local}
Set the backend to systemd
sudo nano /etc/fail2ban/jail.local
backend = systemd
Enable and start fail2ban
sudo systemctl enable fail2ban
sudo systemctl start fail2ban
3. Prepare the control plane servers for K3s
We are going to be using KubeVIP to create a virtual IP for the control plane servers. This will allow us to use a single IP to access the control plane, and if one of the control plane servers goes down, the virtual IP will move to another server. This is a requirement for a high-availability cluster.
Install the Docker Engine
I won’t go into detail on how to install Docker in this guide, as it’s well documented on their website. I recommend using the official Docker documentation to install Docker.
Create the K3s manifests folder
sudo mkdir -p /var/lib/rancher/k3s/server/manifests/
Download the KubeVIP RBAC manifest
sudo curl https://kube-vip.io/manifests/rbac.yaml > kube-vip-rbac.yaml
Move the file to the K3s manifests folder
sudo mv kube-vip-rbac.yaml /var/lib/rancher/k3s/server/manifests
Generate the DaemonSet manifest for KubeVIP
It’s a good idea to check the official KubeVIP documentation for the latest instructions for generating the DaemonSet manifest. We are using the ARP method for this guide.
Export the IP address intended for the virtual IP (use an IP address that’s not in use on your network, and is within the same subnet as the control plane servers)
export VIP=192.168.1.45
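Because the virtual IP has to sit in the same subnet as the control plane servers, it can be worth checking that before generating the manifest. This is a minimal sketch in plain shell: the helper names and the NODE_IP value are placeholders, and it assumes you know your prefix length (a /24 here).

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if both addresses fall in the same network for the given prefix length.
# $1 = first address, $2 = second address, $3 = prefix length (e.g. 24).
same_subnet() {
  a=$(ip_to_int "$1")
  b=$(ip_to_int "$2")
  mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
  [ $(( a & mask )) -eq $(( b & mask )) ]
}

NODE_IP=192.168.1.40   # placeholder: address of one control plane server
VIP=192.168.1.45       # the value exported in the step above

if same_subnet "$NODE_IP" "$VIP" 24; then
  echo "$VIP is in the same /24 as $NODE_IP"
else
  echo "$VIP is NOT in the same /24 as $NODE_IP"
fi
```

This only checks the subnet. Whether the address is actually free on your network still has to be confirmed separately, for example against your DHCP server’s leases.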
Set the INTERFACE name to the name of the interface on the control plane(s) which will announce the VIP. In many Linux distributions this can be found with the ip a command.
export INTERFACE=ens18
Get the latest version of the kube-vip release by parsing the GitHub API. This step requires that jq and curl are installed.
KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | jq -r ".[0].name")
Create the kube-vip alias
alias kube-vip="sudo ctr image pull ghcr.io/kube-vip/kube-vip:$KVVERSION; sudo ctr run --rm --net-host ghcr.io/kube-vip/kube-vip:$KVVERSION vip /kube-vip"
I’ve added sudo to the alias command before each instance of ctr, as I have not configured my user to run docker commands without sudo. If you have, you can remove the sudo from the alias command.
Generate the DaemonSet manifest
kube-vip manifest daemonset \
    --interface $INTERFACE \
    --address $VIP \
    --inCluster \
    --taint \
    --controlplane \
    --services \
    --arp \
    --leaderElection > kube-vip-daemonset.yaml
I’ve written the output to a file called kube-vip-daemonset.yaml by adding > kube-vip-daemonset.yaml, but you can name it whatever you’d like.
Move the file to the K3s manifests folder
sudo mv kube-vip-daemonset.yaml /var/lib/rancher/k3s/server/manifests
4. Install K3s on the control plane servers
Run the K3s install script on the first control plane server
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={VERSION} K3S_TOKEN={TOKEN} sh -s - server --flannel-iface={INTERFACE} --disable servicelb --disable traefik --node-taint node-role.kubernetes.io/master=true:NoSchedule --tls-san {KUBE_VIP_IP} --cluster-init --write-kubeconfig-mode 644 --node-ip {NODE_IP} --kube-controller-manager-arg bind-address=0.0.0.0 --kube-proxy-arg metrics-bind-address=0.0.0.0 --kube-scheduler-arg bind-address=0.0.0.0 --etcd-expose-metrics true --kubelet-arg containerd=/run/k3s/containerd/containerd.sock
- Replace {VERSION} with the version of K3s you’d like to install. You can find the latest version on the K3s GitHub releases page; the version should be in the format v1.29.1+k3s1.
- Replace {TOKEN} with a random string of characters; it cannot contain any special characters. This token will be used to join the control plane servers to the cluster. You can use openssl rand -hex 64 to generate a random string.
- -s tells sh to read the install script from standard input, so that the arguments that follow are passed to the script.
- server is used to install K3s as a server.
- --flannel-iface={INTERFACE} is the interface that flannel will use for the overlay network. Replace {INTERFACE} with the name of the interface on your server.
- --disable servicelb and --disable traefik are used because we are using KubeVIP for the virtual IP, and we will be using MetalLB for the load balancer.
- --node-taint node-role.kubernetes.io/master=true:NoSchedule is used to prevent workloads from being scheduled on the control plane servers. If you’d like to schedule workloads on the control plane servers, you can remove this flag.
- --tls-san {KUBE_VIP_IP} adds the virtual IP to the API server certificate. The value should be the IP address of the virtual IP set by KubeVIP.
- --cluster-init is used to initialize the cluster on the first control plane server.
- --write-kubeconfig-mode 644 is used to set the permissions on the kubeconfig file.
- --node-ip {NODE_IP} is the IP address of the server. Replace {NODE_IP} with the IP address of the server.
- --kube-controller-manager-arg bind-address=0.0.0.0 is used to bind the controller manager to all interfaces, so that its metrics endpoint is reachable from other hosts.
- --kube-proxy-arg metrics-bind-address=0.0.0.0 is used to bind the kube-proxy metrics endpoint to all interfaces. This is required for metrics to be exposed.
- --kube-scheduler-arg bind-address=0.0.0.0 is used to bind the scheduler to all interfaces, so that its metrics endpoint is reachable from other hosts.
- --etcd-expose-metrics true is used to expose etcd metrics.
- --kubelet-arg containerd=/run/k3s/containerd/containerd.sock points the kubelet at the containerd socket that ships with K3s.
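Before pasting values into the install command, you could sanity-check the version string and token against the rules described above. The following is a sketch with made-up function names. The version pattern only covers plain release tags such as v1.29.1+k3s1, and the token check simply enforces "letters and digits only".

```shell
# Succeeds if the string looks like a K3s release tag such as v1.29.1+k3s1.
valid_k3s_version() {
  echo "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+\+k3s[0-9]+$'
}

# Succeeds if the token is non-empty and contains only letters and digits.
valid_token() {
  echo "$1" | grep -Eq '^[A-Za-z0-9]+$'
}

# Generate a hex token with openssl as suggested above; if openssl is not
# installed, fall back to reading 64 random bytes from /dev/urandom.
if command -v openssl >/dev/null 2>&1; then
  TOKEN=$(openssl rand -hex 64)
else
  TOKEN=$(od -An -tx1 -N64 /dev/urandom | tr -d ' \n')
fi
VERSION=v1.29.1+k3s1   # example value from this guide

valid_k3s_version "$VERSION" && echo "version format ok"
valid_token "$TOKEN" && echo "token ok (${#TOKEN} characters)"
```

Store the generated token somewhere safe, since the same value is needed on every server and worker node that joins later.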
Verify the K3s server is running
sudo k3s kubectl get nodes
You should see the control plane server listed as a node.
Run the K3s install script on the remaining control plane servers with the same token, so the command will look like this:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={VERSION} K3S_TOKEN={TOKEN} sh -s - server --flannel-iface={INTERFACE} --disable servicelb --disable traefik --node-taint node-role.kubernetes.io/master=true:NoSchedule --tls-san {KUBE_VIP_IP} --write-kubeconfig-mode 644 --node-ip {NODE_IP} --server https://{KUBE_VIP_IP}:6443 --kube-controller-manager-arg bind-address=0.0.0.0 --kube-proxy-arg metrics-bind-address=0.0.0.0 --kube-scheduler-arg bind-address=0.0.0.0 --etcd-expose-metrics true --kubelet-arg containerd=/run/k3s/containerd/containerd.sock
- Remove the --cluster-init flag, as the remaining control plane servers will join the cluster, not initialize it.
- --node-ip {NODE_IP} is the IP address of the server. Replace {NODE_IP} with the IP address of the control plane server you are running the command on.
- Add --server https://{KUBE_VIP_IP}:6443 to the command, replacing {KUBE_VIP_IP} with the IP address of the KubeVIP virtual IP.
- --tls-san {KUBE_VIP_IP} is kept in the command above, but you can remove it, as it’s not needed for the remaining control plane servers.
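The difference between the first server and the joining servers comes down to one flag. The sketch below prints both argument lists side by side without installing anything. The function name and the IP values are placeholders, and the list is trimmed to the flags that identify the node or differ between the two cases.

```shell
# Print the K3s server arguments for the first ("init") or a joining ("join") server.
# INTERFACE, KUBE_VIP_IP and NODE_IP are read from the environment.
k3s_server_args() {
  mode="$1"
  args="server --flannel-iface=$INTERFACE --disable servicelb --disable traefik"
  args="$args --tls-san $KUBE_VIP_IP --write-kubeconfig-mode 644 --node-ip $NODE_IP"
  case "$mode" in
    init) args="$args --cluster-init" ;;
    join) args="$args --server https://$KUBE_VIP_IP:6443" ;;
    *) echo "usage: k3s_server_args init|join" >&2; return 1 ;;
  esac
  echo "$args"
}

# Placeholder values; replace with your own.
INTERFACE=ens18
KUBE_VIP_IP=192.168.1.45
NODE_IP=192.168.1.41

echo "first server:   $(k3s_server_args init)"
echo "joining server: $(k3s_server_args join)"
```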
Verify the other control plane servers have joined the cluster by running this command on the first control-plane.
sudo k3s kubectl get nodes
You should see the control plane servers listed.
Copy the kubeconfig file from the first control plane server to your local machine
sudo cat /etc/rancher/k3s/k3s.yaml
Copy the contents of the file to your local machine into a file located at ~/.kube/config
nano ~/.kube/config
Before saving the file, replace the server value with the virtual IP address of KubeVIP that you set earlier, changing it from https://127.0.0.1:6443 to https://{KUBE_VIP_IP}:6443.
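If you’d rather not edit the server line by hand, the same substitution can be done with sed. This is a sketch: the function name is made up, and it assumes the copied file still contains the default https://127.0.0.1:6443 address.

```shell
# Rewrite the API server address in a kubeconfig copied from a K3s server.
# $1 = source kubeconfig, $2 = virtual IP; the result is written to stdout.
point_kubeconfig_at_vip() {
  sed "s#https://127\.0\.0\.1:6443#https://$2:6443#" "$1"
}

# Example (file names are illustrative):
#   point_kubeconfig_at_vip k3s.yaml 192.168.1.45 > ~/.kube/config
#   chmod 600 ~/.kube/config
```

Writing to stdout keeps the original copy untouched, so you can compare the two files before replacing an existing ~/.kube/config.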
5. Connect the worker nodes to the cluster
Run the K3s install script on the worker nodes
If you plan to use Longhorn storage you can install the dependencies now.
sudo apt install nfs-common open-iscsi -y; sudo systemctl enable open-iscsi --now
Then continue with the K3s install script
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION={VERSION} K3S_TOKEN={TOKEN} K3S_URL=https://{KUBE_VIP_IP}:6443 sh -s - --node-ip {NODE_IP} --flannel-iface={INTERFACE}
- Replace {VERSION} with the version of K3s you installed on the control plane servers.
- Replace {TOKEN} with the token you used to join the control plane servers to the cluster.
- Replace {KUBE_VIP_IP} with the IP address of the virtual IP set by KubeVIP.
- Replace {NODE_IP} with the IP address of the worker node.
- Replace {INTERFACE} with the name of the interface on the worker node.
Verify the worker nodes have joined the cluster by running the first command below on the first control-plane, or the second one with kubectl on your local machine with the copied kubeconfig.
sudo k3s kubectl get nodes
kubectl get nodes
You should see the worker nodes listed.
6. Install MetalLB
MetalLB is a load balancer that will allow us to expose services to the network from our on-premises cluster. We will use MetalLB to expose any services we’d like to access from our network.
Apply the MetalLB manifest
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.14.3/config/manifests/metallb-native.yaml
You can find the latest version of MetalLB and its install manifest in the official documentation.
Define an address pool for MetalLB to use for assigning IP Addresses. This should be a set of IP Addresses that is outside of your DHCP server scope so other devices on your network don’t accidentally get assigned to any of these addresses. You can check the official documentation for more information on how to do this: Configuring MetalLB
nano metallb_pool.yaml
Insert the following, and modify as needed to suit your network
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.9.1-192.168.9.5
Apply the manifest
kubectl apply -f metallb_pool.yaml
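Since the pool must stay outside your DHCP scope, a quick overlap check can catch a mistake before MetalLB hands out a conflicting address. This is a sketch under assumptions: the helper names are made up, and the DHCP range shown is a hypothetical scope, not one taken from this guide.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if range A ($1-$2) and range B ($3-$4) share at least one address.
ranges_overlap() {
  a_start=$(ip_to_int "$1"); a_end=$(ip_to_int "$2")
  b_start=$(ip_to_int "$3"); b_end=$(ip_to_int "$4")
  [ "$a_start" -le "$b_end" ] && [ "$b_start" -le "$a_end" ]
}

# Pool from the manifest above, and a hypothetical DHCP scope on the same network.
if ranges_overlap 192.168.9.1 192.168.9.5 192.168.9.100 192.168.9.200; then
  echo "MetalLB pool overlaps the DHCP scope"
else
  echo "MetalLB pool is outside the DHCP scope"
fi
```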
Now we will create an L2Advertisement so we can use these IP addresses on our network. We are setting up Layer 2 mode, which is the easiest to set up, but you can use BGP if you’d like. You can check the official documentation for more information on how to do this: Configuring MetalLB
nano metallb_l2.yaml
Insert the following
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system
Apply the manifest
kubectl apply -f metallb_l2.yaml
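An L2Advertisement without a spec advertises every pool. If you later add more pools, you may want to tie the advertisement to a specific one through the spec.ipAddressPools field. The sketch below prints both resources as a single manifest. The function name and the "-l2" suffix are my own choices, and the pool name and range are the example values from above.

```shell
# Print an IPAddressPool and an L2Advertisement limited to that pool.
# $1 = pool name, $2 = address range (both are example values you choose).
render_metallb_config() {
  cat <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: $1
  namespace: metallb-system
spec:
  addresses:
    - $2
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: $1-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - $1
EOF
}

render_metallb_config first-pool 192.168.9.1-192.168.9.5
```

The output can be redirected to a file, or piped straight into kubectl apply -f - once you’re happy with it.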
7. Install Cert-Manager
Cert-Manager is a Kubernetes add-on to automate the management and issuance of TLS certificates from various issuing sources. It will ensure that our services are secure by providing them with TLS certificates. I’ll be using the official documentation to install Cert-Manager, and using Helm to install it. If you don’t have Helm installed, you can install it by following the official documentation.
Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io --force-update
Update the Helm repositories
helm repo update
Install Cert-Manager
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.14.3 \
  --set installCRDs=true
8. Install Longhorn
Longhorn is a distributed block storage system for Kubernetes. It’s a great way to provide persistent storage for your workloads. I’ll be using the official documentation to install Longhorn, and using Helm to install it. If you don’t have Helm installed, you can install it by following the official documentation.
Add the Longhorn Helm repository
helm repo add longhorn https://charts.longhorn.io
Update the Helm repositories
helm repo update
Install Longhorn
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace --version 1.6.0
9. Install Prometheus for cluster monitoring
Prometheus is a monitoring and alerting toolkit that is used to monitor the health of your cluster. We’ll be installing the community Helm chart for Prometheus. For this section, I’d recommend checking out Techno Tim’s guide on installing Prometheus with Helm, as it’s a great guide that I’ve used in the past. You can find it here.
Conclusion
You should now have a high-availability K3s cluster that is ready to use. You can now deploy workloads to the cluster, and expose them to your network using MetalLB, and secure them with Cert-Manager. You can also use Longhorn to provide persistent storage for your workloads, and monitor the health of your cluster with Prometheus.
Thanks for reading, and I hope this guide was helpful to you.