I try to init a kubernetes master node running on a Debian GNU/Linux 11 (bullseye) system with kubeadm version 1.25.4-00.
I followed the official guideline on kubernetes.io. I installed containerd
and have set SystemdCgroup = true
in /etc/containerd/config.toml
.
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
runtime_engine = ""
runtime_root = ""
privileged_without_host_devices = false
base_runtime_spec = ""
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
containerd seems to be fine:
$ sudo systemctl status containerd
● containerd.service - containerd container runtime
Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2022-11-21 08:12:35 UTC; 1min 7s ago
Docs: https://containerd.io
Main PID: 7897 (containerd)
Tasks: 8
Memory: 10.5M
CPU: 470ms
CGroup: /system.slice/containerd.service
└─7897 /usr/bin/containerd
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.900148031Z" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.900245191Z" level=info msg=serving... address=/run/containerd/containerd.sock
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.900338622Z" level=info msg="containerd successfully booted in 0.046780s"
Nov 21 08:12:35 master-1 systemd[1]: Started containerd container runtime.
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.909836633Z" level=info msg="Start subscribing containerd event"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.909931756Z" level=info msg="Start recovering state"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910044670Z" level=info msg="Start event monitor"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910056885Z" level=info msg="Start snapshots syncer"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910069145Z" level=info msg="Start cni network conf syncer"
Nov 21 08:12:35 master-1 containerd[7897]: time="2022-11-21T08:12:35.910079607Z" level=info msg="Start streaming server"
....
When I run kubeadm init the system hangs and timeouts after 4 minutes:
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 -v=9
There seems to be no firewall issue and kubeadm seems to detect the containerd and the cgroups correctly:
I1121 08:16:46.935270 8096 initconfiguration.go:117] detected and using CRI socket: unix:///var/run/containerd/containerd.sock
I1121 08:16:46.935936 8096 interface.go:432] Looking for default routes with IPv4 addresses
I1121 08:16:46.936037 8096 interface.go:437] Default route transits interface "eth0"
I1121 08:16:46.936268 8096 interface.go:209] Interface eth0 is up
I1121 08:16:46.936427 8096 interface.go:257] Interface "eth0" has 3 addresses :[x.x.y.y/32 .........::1/64 ......../64].
I1121 08:16:46.936525 8096 interface.go:224] Checking addr x.x.y.y/32.
I1121 08:16:46.936596 8096 interface.go:231] IP found x.x.y.y
I1121 08:16:46.936616 8096 interface.go:263] Found valid IPv4 address x.x.y.y for interface "eth0".
I1121 08:16:46.936710 8096 interface.go:443] Found active IP x.x.y.y
I1121 08:16:46.936803 8096 kubelet.go:218] the value of KubeletConfiguration.cgroupDriver is empty; setting it to "systemd"
I1121 08:16:46.948350 8096 version.go:186] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
I1121 08:16:47.327247 8096 version.go:255] remote version is much newer: v1.25.4; falling back to: stable-1.24
I1121 08:16:47.327368 8096 version.go:186] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.25.txt
[init] Using Kubernetes version: v1.25.4
[preflight] Running pre-flight checks
I1121 08:16:47.716620 8096 checks.go:570] validating Kubernetes and kubeadm version
I1121 08:16:47.716770 8096 checks.go:170] validating if the firewall is enabled and active
I1121 08:16:47.731470 8096 checks.go:205] validating availability of port 6443
I1121 08:16:47.732017 8096 checks.go:205] validating availability of port 10259
....
The following warning shows up when waiting for the kubelet to boot. This message is shown until the timeout after 4 minutes:
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I1121 08:17:12.320743 8096 round_trippers.go:466] curl -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.25.4 (linux/amd64) kubernetes/fdc7750" 'https://x.x.y.y:6443/healthz?timeout=10s'
I1121 08:17:12.321047 8096 round_trippers.go:508] HTTP Trace: Dial to tcp:x.x.y.y:6443 failed: dial tcp x.x.y.y:6443: connect: connection refused
I1121 08:17:12.321112 8096 round_trippers.go:553] GET https://x.x.y.y:6443/healthz?timeout=10s in 0 milliseconds
I1121 08:17:12.321157 8096 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 0 ms TLSHandshake 0 ms Duration 0 ms
I1121 08:17:12.321209 8096 round_trippers.go:577] Response Headers:
I1121 08:17:12.821526 8096 round_trippers.go:466] curl -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.25.4 (linux/amd64) kubernetes/fdc7750" 'https://x.x.y.y:6443/healthz?timeout=10s'
I1121 08:17:12.821882 8096 round_trippers.go:508] HTTP Trace: Dial to tcp:x.x.y.y:6443 failed: dial tcp x.x.y.y:6443: connect: connection refused
.....
Checking the kublet status shows:
$ sudo systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2022-11-21 08:17:12 UTC; 4min 30s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 8228 (kubelet)
Tasks: 14 (limit: 4556)
Memory: 52.0M
CPU: 6.246s
CGroup: /system.slice/kubelet.service
└─8228 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-ru>
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.526642 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.626872 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.727919 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.829055 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.930002 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:42 master-1 kubelet[8228]: E1121 08:21:42.959961 8228 eviction_manager.go:254] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"master->
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.029432 8228 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady >
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.030749 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.130874 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:21:43 master-1 kubelet[8228]: E1121 08:21:43.231537 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Checking the journalctl shows:
$ sudo journalctl -xeu kubelet
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.585238 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.685464 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.786279 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.887211 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:37 master-1 kubelet[8228]: E1121 08:22:37.987526 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:38 master-1 kubelet[8228]: E1121 08:22:38.045350 8228 kubelet.go:2349] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady >
Nov 21 08:22:38 master-1 kubelet[8228]: E1121 08:22:38.088201 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
....
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.500610 8228 controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://x.x.y.y:6443/apis/coordin>
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.512026 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.613041 8228 kubelet.go:2424] "Error getting node" err="node \"master-1\" not found"
Nov 21 08:22:40 master-1 kubelet[8228]: I1121 08:22:40.700243 8228 kubelet_node_status.go:70] "Attempting to register node" node="master-1"
Nov 21 08:22:40 master-1 kubelet[8228]: E1121 08:22:40.701021 8228 kubelet_node_status.go:92] "Unable to register node with API server" err="Post \"https://x.x.y.y:6443/api/v1/node>
...
.....
How can I figure out the root of this issue? The log files did not really provide me any useful hints.
Note: If I install CRI-O instead of containerd kubeadm works like a charm.