# Kubernetes cluster certificate management

In Kubernetes, certificates expire after 1 year if the cluster has not been upgraded within that time. The following instructions are for vanilla Kubernetes deployed with kubeadm. If you are running another Kubernetes distribution (Rancher, OpenShift, etc.), follow that distribution's documentation, as the process may differ.

MetalSoft strongly recommends that the certificates are monitored from an external source.
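As a minimal sketch of such external monitoring, the API server's serving certificate can be checked over the network with `openssl`. The endpoint, port, and alert threshold below are assumptions to adapt to your environment, and the `date -d` call assumes GNU coreutils:

```
#!/usr/bin/env bash
# Hypothetical external expiry check: warn when the API server
# certificate has fewer than THRESHOLD_DAYS of validity left.
APISERVER="k8s-cp1.example.com:6443"   # assumed endpoint; 6443 is the kubeadm default port
THRESHOLD_DAYS=30                      # assumed alerting threshold

# Fetch the serving certificate and extract its notAfter date.
end_date=$(echo | openssl s_client -connect "$APISERVER" 2>/dev/null \
  | openssl x509 -noout -enddate | cut -d= -f2)

end_epoch=$(date -d "$end_date" +%s)
days_left=$(( (end_epoch - $(date +%s)) / 86400 ))

if [ "$days_left" -lt "$THRESHOLD_DAYS" ]; then
  echo "WARNING: API server certificate expires in $days_left days"
  exit 1
fi
echo "OK: API server certificate valid for another $days_left days"
```

Running a check like this from a host outside the cluster (for example via cron or an existing monitoring system) keeps it working even after the cluster's own credentials have expired.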
## Checking if certificates have expired

If the certificates expire, certain tasks will fail. As an example, running the following on one of the Kubernetes nodes:

```
kubectl get pods -A
```

will result in the following error:

```
Unable to connect to the server: x509: certificate has expired or is not yet valid: current time 2022-03-23T14:32:50Z is after 2022-03-22T23:03:22Z
```

To check the certificates, run the following on the first node in the cluster:

```
kubeadm certs check-expiration
```

If the certificates have expired, you will receive output similar to this:

```
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Mar 22, 2022 23:03 UTC                                           no
apiserver                  Mar 22, 2022 23:03 UTC                   ca                      no
apiserver-etcd-client      Mar 22, 2022 23:03 UTC                   etcd-ca                 no
apiserver-kubelet-client   Mar 22, 2022 23:03 UTC                   ca                      no
controller-manager.conf    Mar 22, 2022 23:03 UTC                                           no
etcd-healthcheck-client    Mar 22, 2022 23:03 UTC                   etcd-ca                 no
etcd-peer                  Mar 22, 2022 23:03 UTC                   etcd-ca                 no
etcd-server                Mar 22, 2022 23:03 UTC                   etcd-ca                 no
front-proxy-client         Mar 22, 2022 23:03 UTC                   front-proxy-ca          no
scheduler.conf             Mar 22, 2022 23:03 UTC                                           no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Mar 20, 2031 23:03 UTC   8y              no
etcd-ca                 Mar 20, 2031 23:03 UTC   8y              no
front-proxy-ca          Mar 20, 2031 23:03 UTC   8y              no
```

## Renewing Kubernetes certificates

To renew the certificates, issue the following command on **all control-plane nodes**, otherwise etcd will fail:

```
kubeadm certs renew all
```

You should receive output similar to this:

```
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
```

Restart kubelet on all nodes:

```
systemctl daemon-reload && systemctl restart kubelet
```

Once this is complete, copy the renewed `admin.conf` from `/etc/kubernetes` over the existing `~/.kube/config` file:

```
cp /etc/kubernetes/admin.conf /root/.kube/config
```

The Kubernetes pods will then need to be restarted using the following procedure.

**Warning:** "Static Pods are managed by the local kubelet and not by the API Server, thus kubectl cannot be used to delete and restart them. To restart a static Pod you can temporarily remove its manifest file from `/etc/kubernetes/manifests/` and wait for 20 seconds (see the `fileCheckFrequency` value in the KubeletConfiguration struct). The kubelet will terminate the Pod if it's no longer in the manifest directory. You can then move the file back and after another `fileCheckFrequency` period, the kubelet will recreate the Pod and the certificate renewal for the component can complete."

```
mkdir -p /etc/kubernetes/_bak_manifests && mv /etc/kubernetes/manifests/* /etc/kubernetes/_bak_manifests/ && sleep 61 && mv /etc/kubernetes/_bak_manifests/* /etc/kubernetes/manifests/
```

## Check pods and certificates

Then check that the pods are all in the Running state:

```
kubectl get pod
```

You should receive output similar to this:

```
NAME                                  READY   STATUS    RESTARTS   AGE
auth-microservice-76d58f8666-6f24b    1/1     Running   0          55d
config-microservice-6cb6749b5-7h49r   1/1     Running   0          6d17h
controller-d96445fbf-pvz4s            1/1     Running   0          55d
couchdb-5cf5f9c6b4-xz9j5              1/1     Running   0          75d
event-microservice-5994c7f59d-zj4n7   1/1     Running   1          6d17h
gateway-api-74fffd489c-q5rpj          1/1     Running   2          6d17h
kafka-5bbb5b6b54-9wz7g                1/1     Running   0          294d
metal-cloud-ui-75b67dbdb9-2km7z       1/1     Running   0          55d
mysql-86f84d5f7b-49rs9                1/1     Running   0          55d
pdns-564dc7f7f4-wnwq8                 1/1     Running   0          55d
redis-5488cf8cb6-mnrhb                1/1     Running   0          55d
repo-76cc854495-m4nk7                 1/1     Running   0          55d
traefik-poc-67db8598c-svs2p           1/1     Running   0          71d
zookeeper-78cbb9749-cp25s             1/1     Running   0          55d
```

Check that the certificates have been renewed by issuing this command:

```
kubeadm certs check-expiration
```

This should produce output similar to the below. The expiry date should now be 1 year in the future:

```
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Mar 23, 2023 19:27 UTC   364d                                    no
apiserver                  Mar 23, 2023 19:27 UTC   364d            ca                      no
apiserver-etcd-client      Mar 23, 2023 19:27 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Mar 23, 2023 19:27 UTC   364d            ca                      no
controller-manager.conf    Mar 23, 2023 19:27 UTC   364d                                    no
etcd-healthcheck-client    Mar 23, 2023 19:27 UTC   364d            etcd-ca                 no
etcd-peer                  Mar 23, 2023 19:27 UTC   364d            etcd-ca                 no
etcd-server                Mar 23, 2023 19:27 UTC   364d            etcd-ca                 no
front-proxy-client         Mar 23, 2023 19:27 UTC   364d            front-proxy-ca          no
scheduler.conf             Mar 23, 2023 19:27 UTC   364d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Mar 20, 2031 23:03 UTC   8y              no
etcd-ca                 Mar 20, 2031 23:03 UTC   8y              no
front-proxy-ca          Mar 20, 2031 23:03 UTC   8y              no
```

**If using Calico as the CNI, ensure you also restart/recreate its node and controller pods**
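A minimal sketch of that restart, assuming a manifest-based Calico install with the default resource names in `kube-system` (operator-based installs place these in the `calico-system` namespace instead):

```
# Assumed names for a manifest-based Calico install; adjust the
# namespace and resource names to match your deployment.
kubectl -n kube-system rollout restart daemonset/calico-node
kubectl -n kube-system rollout restart deployment/calico-kube-controllers
```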