This is where the investigation started: CoreDNS couldn't stay up for more than a couple of seconds, giving the following errors:
$ kubectl get pods --all-namespaces
NAMESPACE       NAME                                          READY   STATUS             RESTARTS      AGE
ingress-nginx   ingress-nginx-controller-8xcl9                1/1     Running            0             11h
ingress-nginx   ingress-nginx-controller-hwhvk                1/1     Running            0             11h
ingress-nginx   ingress-nginx-controller-xqdqx                1/1     Running            2 (10h ago)   11h
kube-system     calico-kube-controllers-684bcfdc59-cr7hr      1/1     Running            0             11h
kube-system     calico-node-62p58                             1/1     Running            2 (10h ago)   11h
kube-system     calico-node-btvdh                             1/1     Running            0             11h
kube-system     calico-node-q5bkr                             1/1     Running            0             11h
kube-system     coredns-8474476ff8-dnt6b                      0/1     CrashLoopBackOff   1 (3s ago)    5s
kube-system     coredns-8474476ff8-ftcbx                      0/1     Error              1 (2s ago)    5s
kube-system     dns-autoscaler-5ffdc7f89d-4tshm               1/1     Running            2 (10h ago)   11h
kube-system     kube-apiserver-hyzio                          1/1     Running            4 (10h ago)   11h
kube-system     kube-controller-manager-hyzio                 1/1     Running            4 (10h ago)   11h
kube-system     kube-proxy-2d8ls                              1/1     Running            0             11h
kube-system     kube-proxy-c6c4l                              1/1     Running            4 (10h ago)   11h
kube-system     kube-proxy-nzqdd                              1/1     Running            0             11h
kube-system     kube-scheduler-hyzio                          1/1     Running            5 (10h ago)   11h
kube-system     kubernetes-dashboard-548847967d-66dwz         1/1     Running            0             11h
kube-system     kubernetes-metrics-scraper-6d49f96c97-r6dz2   1/1     Running            0             11h
kube-system     nginx-proxy-dyzio                             1/1     Running            0             11h
kube-system     nginx-proxy-zyzio                             1/1     Running            0             11h
kube-system     nodelocaldns-g9wxh                            1/1     Running            0             11h
kube-system     nodelocaldns-j2qc9                            1/1     Running            4 (10h ago)   11h
kube-system     nodelocaldns-vk84j                            1/1     Running            0             11h
kube-system     registry-j5prk                                1/1     Running            0             11h
kube-system     registry-proxy-5wbhq                          1/1     Running            0             11h
kube-system     registry-proxy-77lqd                          1/1     Running            0             11h
kube-system     registry-proxy-s45p4                          1/1     Running            2 (10h ago)   11h
Running kubectl describe on that pod didn't add much to the picture:
Events:
Type     Reason     Age                From               Message
----     ------     ----               ----               -------
Normal   Scheduled  67s                default-scheduler  Successfully assigned kube-system/coredns-8474476ff8-dnt6b to zyzio
Normal   Pulled     25s (x4 over 68s)  kubelet            Container image "k8s.gcr.io/coredns/coredns:v1.8.0" already present on machine
Normal   Created    25s (x4 over 68s)  kubelet            Created container coredns
Normal   Started    25s (x4 over 68s)  kubelet            Started container coredns
Warning  BackOff    6s (x11 over 66s)  kubelet            Back-off restarting failed container
But viewing the logs did:
$ kubectl logs coredns-8474476ff8-dnt6b -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 5b233a0166923d642fdbca0794b712ab
CoreDNS-1.8.0
linux/amd64, go1.15.3, 054c9ae
[FATAL] plugin/loop: Loop (127.0.0.1:49048 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 2906344495550081187.9117452939332601176."
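That FATAL comes from CoreDNS's loop plugin. The coredns ConfigMap isn't shown above, so the sketch below is the usual kubeadm/kubespray default Corefile rather than my exact config; the relevant part is the forward . /etc/resolv.conf line, which sends non-cluster queries to whatever nameservers kubelet put into the pod's resolv.conf:

.:53 {
    errors
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    # Non-cluster names are forwarded to the nameservers from the resolv.conf
    # that kubelet hands to the pod; with systemd-resolved that is 127.0.0.53.
    forward . /etc/resolv.conf
    cache 30
    # The loop plugin sends a random probe query (the HINFO one in the log)
    # upstream; if that probe comes back to CoreDNS itself, it logs the FATAL
    # error above and exits.
    loop
    reload
}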
It's great that the troubleshooting documentation was linked! I started browsing that page and discovered that my /etc/resolv.conf did indeed contain the problematic local IP: nameserver 127.0.0.53. I also found the real DNS IPs in /run/systemd/resolve/resolv.conf.
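For reference, this is roughly what the two files look like on a host running systemd-resolved (the upstream addresses below are placeholders, not my real nameservers):

$ cat /etc/resolv.conf
nameserver 127.0.0.53

$ cat /run/systemd/resolve/resolv.conf
# placeholder addresses standing in for the two real upstream servers
nameserver 192.0.2.1
nameserver 192.0.2.2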
The question now is how to perform the action described in the troubleshooting documentation, which says:
Add the following to your kubelet config yaml: resolvConf: <path-to-your-real-resolv-conf-file> (or via command line flag --resolv-conf deprecated in 1.10). Your "real" resolv.conf is the one that contains the actual IPs of your upstream servers, and no local/loopback address. This flag tells kubelet to pass an alternate resolv.conf to Pods. For systems using systemd-resolved, /run/systemd/resolve/resolv.conf is typically the location of the "real" resolv.conf, although this can be different depending on your distribution.
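If I understand correctly, the change itself would be a single resolvConf entry in the kubelet's KubeletConfiguration, something like this sketch (where exactly this file lives in my setup is what I'm unsure about):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Point kubelet at the resolv.conf that lists the real upstream servers,
# instead of the systemd-resolved stub file.
resolvConf: /run/systemd/resolve/resolv.conf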
So, the questions are:
- where to find (or create) the mentioned kubelet config yaml,
- at what level I should specify the resolvConf value, and
- whether it can accept multiple values: I have two nameservers defined, so should they be given as separate entries or as an array?