Pod coredns stuck in ContainerCreating state with Weave on k8s

First of all, let me thank you for this amazing guide. I'm very new to Kubernetes, and having a guide like this to follow helps a lot when trying to set up my first cluster!

That said, I'm having some issues creating deployments: two pods are never created and remain stuck in the `ContainerCreating` state.

[root@master ~]# kubectl get nodes
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   25h   v1.24.0
node1    Ready    <none>          24h   v1.24.0
node2    Ready    <none>          24h   v1.24.0


[root@master ~]# kubectl cluster-info
Kubernetes control plane is running at https://192.168.3.200:6443
CoreDNS is running at https://192.168.3.200:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

The problem:

[root@master ~]# kubectl get all --all-namespaces
NAMESPACE     NAME                                 READY   STATUS              RESTARTS        AGE
kube-system   pod/coredns-6d4b75cb6d-v5pvk         0/1     ContainerCreating   0               114m
kube-system   pod/coredns-7599c5f99f-q6nwq         0/1     ContainerCreating   0               114m
kube-system   pod/coredns-7599c5f99f-sg4wn         0/1     ContainerCreating   0               114m
kube-system   pod/etcd-master                      1/1     Running             3 (3h26m ago)   25h
kube-system   pod/kube-apiserver-master            1/1     Running             3 (3h26m ago)   25h
kube-system   pod/kube-controller-manager-master   1/1     Running             3 (3h26m ago)   25h
kube-system   pod/kube-proxy-ftxzx                 1/1     Running             2 (3h11m ago)   24h
kube-system   pod/kube-proxy-pcl8q                 1/1     Running             3 (3h26m ago)   25h
kube-system   pod/kube-proxy-q7dpw                 1/1     Running             2 (3h23m ago)   24h
kube-system   pod/kube-scheduler-master            1/1     Running             3 (3h26m ago)   25h
kube-system   pod/weave-net-2p47z                  2/2     Running             5 (3h23m ago)   24h
kube-system   pod/weave-net-k5529                  2/2     Running             4 (3h11m ago)   24h
kube-system   pod/weave-net-tq4bs                  2/2     Running             7 (3h26m ago)   25h

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  25h
kube-system   service/kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   25h

NAMESPACE     NAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/kube-proxy   3         3         3       3            3           kubernetes.io/os=linux   25h
kube-system   daemonset.apps/weave-net    3         3         3       3            3           <none>                   25h

NAMESPACE     NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/coredns   0/2     2            0           25h

NAMESPACE     NAME                                 DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/coredns-6d4b75cb6d   1         1         0       25h
kube-system   replicaset.apps/coredns-7599c5f99f   2         2         0       116m

Note that the first three pods, from coredns, fail to start.

[root@master ~]# kubectl get events
LAST SEEN   TYPE      REASON                   OBJECT                             MESSAGE
93m         Warning   FailedCreatePodSandBox   pod/nginx-deploy-99976564d-s4shk   (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "fd79c77289f42b3cb0eb0be997a02a42f9595df061deb6e2d3678ab00afb5f67": failed to find network info for sandbox "fd79c77289f42b3cb0eb0be997a02a42f9595df061deb6e2d3678ab00afb5f67"

    [root@master ~]# kubectl describe pod coredns-6d4b75cb6d-v5pvk -n kube-system
    Name:                 coredns-6d4b75cb6d-v5pvk
    Namespace:            kube-system
    Priority:             2000000000
    Priority Class Name:  system-cluster-critical
    Node:                 node2/192.168.3.202
    Start Time:           Thu, 12 May 2022 19:45:58 +0000
    Labels:               k8s-app=kube-dns
                          pod-template-hash=6d4b75cb6d
    Annotations:          <none>
    Status:               Pending
    IP:
    IPs:                  <none>
    Controlled By:        ReplicaSet/coredns-6d4b75cb6d
    Containers:
      coredns:
        Container ID:
        Image:         k8s.gcr.io/coredns/coredns:v1.8.6
        Image ID:
        Ports:         53/UDP, 53/TCP, 9153/TCP
        Host Ports:    0/UDP, 0/TCP, 0/TCP
        Args:
          -conf
          /etc/coredns/Corefile
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Limits:
          memory:  170Mi
        Requests:
          cpu:        100m
          memory:     70Mi
        Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
        Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
        Environment:  <none>
        Mounts:
          /etc/coredns from config-volume (ro)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4bpvz (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      config-volume:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      coredns
        Optional:  false
      kube-api-access-4bpvz:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              kubernetes.io/os=linux
    Tolerations:                 CriticalAddonsOnly op=Exists
                                 node-role.kubernetes.io/control-plane:NoSchedule
                                 node-role.kubernetes.io/master:NoSchedule
                                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason                  Age                   From     Message
      ----     ------                  ----                  ----     -------
      Warning  FailedCreatePodSandBox  93s (x393 over 124m)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7d0f8f4b3dbf2dffcf1a8c01b41368e16b1f80bc97ff3faa611c1fd52c0f6967": failed to find network info for sandbox "7d0f8f4b3dbf2dffcf1a8c01b41368e16b1f80bc97ff3faa611c1fd52c0f6967"

Versions:

    [root@master ~]# docker --version
    Docker version 20.10.15, build fd82621
    
    
    [root@master ~]# kubelet --version
    Kubernetes v1.24.0

    [root@master ~]# kubeadm version
    kubeadm version: &version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:44:24Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}

I have no idea where to go from here. I googled keywords like "rpc error weave k8s" and "Failed to create pod sandbox: rpc error", but none of the results I found solved my problem. Some of them mentioned Weave Net; could that be the cause? Maybe I got something wrong, but I'm fairly sure I followed the instructions closely.

Any help would be greatly appreciated!

You'll want to check the logs for those `weave-net` pods, focusing especially on the one on `node2/192.168.3.202`; you'll also want to check the `kubelet` logs on that node to make sure everything is going OK (usually: `journalctl -u kubelet`).
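
A minimal sketch of the two checks suggested in this comment. It assumes the stock Weave Net manifest, in which the pods carry the label `name=weave-net` and run containers named `weave` and `weave-npc`; verify both with `kubectl describe` if your manifest differs. The function names `weave_pod_on_node` and `weave_logs` are hypothetical helpers introduced here for illustration.

```shell
#!/usr/bin/env bash

# Print the name of the weave-net pod scheduled on the given node.
# Assumption: the pods are labelled name=weave-net (stock manifest).
weave_pod_on_node() {
  kubectl get pods -n kube-system -l name=weave-net \
    --field-selector "spec.nodeName=$1" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Show recent logs from both containers of that pod.
# Assumption: the containers are named "weave" and "weave-npc".
weave_logs() {
  local pod
  pod="$(weave_pod_on_node "$1")" || return 1
  kubectl logs -n kube-system "$pod" -c weave --tail=100
  kubectl logs -n kube-system "$pod" -c weave-npc --tail=100
}

# Usage, from a machine with cluster access:
#   weave_logs node2
#
# Then, on node2 itself, look at the kubelet logs:
#   journalctl -u kubelet --since "1 hour ago" --no-pager
```

Looking the pod up by node avoids guessing which of the three `weave-net-*` pods runs on `node2`.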
Clebson
The latest error is: prober_manager.go:255] "Failed to trigger a manual run" probe="Readiness"
Clebson
It doesn't seem to be related to that specific pod. Also, when I run `kubectl logs --namespace=kube-system -l k8s-app=kube-dns`, it returns: Error from server (BadRequest): container "coredns" in pod "coredns-7599c5f99f-95fgx" is waiting to start: ContainerCreating
We're not interested in what coredns thinks, as those failures are downstream from the real problem: your CNI on those nodes is inoperative, but that could be for a seemingly infinite number of reasons. I'm also going to bet $1 that this is not the only error in your kubelet logs, and that you didn't register the importance of my request to investigate why weave is mad. Regrettably, Stack Exchange sites are terrible for back-and-forth debugging, as only you can read the logs on your machines.
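
One quick way to narrow down an inoperative CNI is to confirm that the affected node has a network configuration and plugin binaries at all. The sketch below assumes the conventional locations `/etc/cni/net.d` and `/opt/cni/bin`; your container runtime may be configured to use other paths. `check_cni` is a hypothetical helper, and a passing result only rules out missing files, not a misbehaving plugin.

```shell
#!/usr/bin/env bash

# Report whether a CNI config file and at least one plugin binary exist.
# Arguments (both optional, defaults are the conventional paths):
#   $1 = CNI config directory, $2 = CNI plugin directory
check_cni() {
  local conf_dir="${1:-/etc/cni/net.d}"
  local bin_dir="${2:-/opt/cni/bin}"
  local status=0

  # Look for *.conf or *.conflist directly inside the config directory.
  if [ -z "$(find "$conf_dir" -mindepth 1 -maxdepth 1 \
        \( -name '*.conf' -o -name '*.conflist' \) 2>/dev/null)" ]; then
    echo "no CNI config found in $conf_dir"
    status=1
  fi

  # Any entry in the plugin directory counts (files or symlinks).
  if [ -z "$(find "$bin_dir" -mindepth 1 -maxdepth 1 2>/dev/null)" ]; then
    echo "no CNI plugins found in $bin_dir"
    status=1
  fi

  return "$status"
}

# Usage, on the affected node (for example node2):
#   check_cni && echo "CNI config and plugins are present"
```

If either message appears on a node where pods are stuck, the weave-net pod logs on that node are the next place to look.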