Score:1

Setting up k8s on bare metal with Calico

I'm trying to set up a k8s cluster for learning and testing before I move on to a production system.

I've set up my k8s cluster on bare metal on Debian 11.

After the install, I can run:

$ kubectl get nodes -A
NAME   STATUS   ROLES           AGE   VERSION
km1    Ready    control-plane   22m   v1.26.2
kw1    Ready    worker          21m   v1.26.2

That looks good to me. However, when I run:

$ kubectl get pods -A
NAMESPACE     NAME                                      READY   STATUS                  RESTARTS        AGE
kube-system   calico-kube-controllers-57b57c56f-rp47v   1/1     Running                 0               12m
kube-system   calico-node-m4bsl                         0/1     Init:CrashLoopBackOff   6 (2m54s ago)   8m39s
kube-system   calico-node-tzcp7                         1/1     Running                 0               12m
kube-system   coredns-787d4945fb-cldh2                  1/1     Running                 0               12m
kube-system   coredns-787d4945fb-pcpx8                  1/1     Running                 0               12m
kube-system   etcd-km1                                  1/1     Running                 44              13m
kube-system   kube-apiserver-km1                        1/1     Running                 46              13m
kube-system   kube-controller-manager-km1               1/1     Running                 41              13m
kube-system   kube-proxy-c7m6b                          1/1     Running                 0               12m
kube-system   kube-proxy-sx4hj                          1/1     Running                 0               12m
kube-system   kube-scheduler-km1                        1/1     Running                 41              13m

I see calico-node-m4bsl is not working.

  1. Is this a problem?
  2. Am I doing something wrong to cause this?

Here's some background in case it helps you answer my dilemma:

I got my calico.yaml via:

$ curl -fLO https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml

The only change I made to the file was uncommenting and setting the CALICO_IPV4POOL_CIDR variable:

    - name: CALICO_IPV4POOL_CIDR
      value: "10.2.0.0/16"
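
In case it's useful, this is how the value actually carried by the running DaemonSet can be checked (just a sketch, assuming the default calico-node DaemonSet name from the manifest):

$ kubectl -n kube-system get daemonset calico-node -o yaml | grep -A1 CALICO_IPV4POOL_CIDR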

I initialized my cluster like this:

$ sudo kubeadm init --control-plane-endpoint=km1.lan:6443 --pod-network-cidr=10.2.0.0/16

Here's the full describe output for the failing pod:

$ kubectl describe pods -n kube-system calico-node-m4bsl
Name:                 calico-node-m4bsl
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      calico-node
Node:                 kw1/192.168.56.60
Start Time:           Thu, 09 Mar 2023 13:45:09 -0600
Labels:               controller-revision-hash=9889897b6
                      k8s-app=calico-node
                      pod-template-generation=1
Annotations:          <none>
Status:               Pending
IP:                   192.168.56.60
IPs:
  IP:           192.168.56.60
Controlled By:  DaemonSet/calico-node
Init Containers:
  upgrade-ipam:
    Container ID:  containerd://49d885579623eb69e01288cbfbac8ee06e6a168819764fced9d4a83eba4443c7
    Image:         docker.io/calico/cni:v3.25.0
    Image ID:      docker.io/calico/cni@sha256:a38d53cb8688944eafede2f0eadc478b1b403cefeff7953da57fe9cd2d65e977
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/cni/bin/calico-ipam
      -upgrade
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 09 Mar 2023 13:45:10 -0600
      Finished:     Thu, 09 Mar 2023 13:45:10 -0600
    Ready:          True
    Restart Count:  0
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      KUBERNETES_NODE_NAME:        (v1:spec.nodeName)
      CALICO_NETWORKING_BACKEND:  <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
    Mounts:
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/lib/cni/networks from host-local-net-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-knm4l (ro)
  install-cni:
    Container ID:  containerd://91271557309b31affd3adc56c8d7ee57c560036d67f787b9e09645926a720b44
    Image:         docker.io/calico/cni:v3.25.0
    Image ID:      docker.io/calico/cni@sha256:a38d53cb8688944eafede2f0eadc478b1b403cefeff7953da57fe9cd2d65e977
    Port:          <none>
    Host Port:     <none>
    Command:
      /opt/cni/bin/install
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 09 Mar 2023 14:06:17 -0600
      Finished:     Thu, 09 Mar 2023 14:06:18 -0600
    Ready:          False
    Restart Count:  9
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      CNI_CONF_NAME:         10-calico.conflist
      CNI_NETWORK_CONFIG:    <set to the key 'cni_network_config' of config map 'calico-config'>  Optional: false
      KUBERNETES_NODE_NAME:   (v1:spec.nodeName)
      CNI_MTU:               <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      SLEEP:                 false
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-knm4l (ro)
  mount-bpffs:
    Container ID:
    Image:         docker.io/calico/node:v3.25.0
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      calico-node
      -init
      -best-effort
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /nodeproc from nodeproc (ro)
      /sys/fs from sys-fs (rw)
      /var/run/calico from var-run-calico (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-knm4l (ro)
Containers:
  calico-node:
    Container ID:
    Image:          docker.io/calico/node:v3.25.0
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      250m
    Liveness:   exec [/bin/calico-node -felix-live -bird-live] delay=10s timeout=10s period=10s #success=1 #failure=6
    Readiness:  exec [/bin/calico-node -felix-ready -bird-ready] delay=0s timeout=10s period=10s #success=1 #failure=3
    Environment Variables from:
      kubernetes-services-endpoint  ConfigMap  Optional: true
    Environment:
      DATASTORE_TYPE:                     kubernetes
      WAIT_FOR_DATASTORE:                 true
      NODENAME:                            (v1:spec.nodeName)
      CALICO_NETWORKING_BACKEND:          <set to the key 'calico_backend' of config map 'calico-config'>  Optional: false
      CLUSTER_TYPE:                       k8s,bgp
      IP:                                 autodetect
      CALICO_IPV4POOL_IPIP:               Always
      CALICO_IPV4POOL_VXLAN:              Never
      CALICO_IPV6POOL_VXLAN:              Never
      FELIX_IPINIPMTU:                    <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      FELIX_VXLANMTU:                     <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      FELIX_WIREGUARDMTU:                 <set to the key 'veth_mtu' of config map 'calico-config'>  Optional: false
      CALICO_IPV4POOL_CIDR:               10.100.0.0/16
      CALICO_DISABLE_FILE_LOGGING:        true
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  ACCEPT
      FELIX_IPV6SUPPORT:                  false
      FELIX_HEALTHENABLED:                true
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /sys/fs/bpf from bpffs (rw)
      /var/lib/calico from var-lib-calico (rw)
      /var/log/calico/cni from cni-log-dir (ro)
      /var/run/calico from var-run-calico (rw)
      /var/run/nodeagent from policysync (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-knm4l (ro)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  var-run-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/calico
    HostPathType:
  var-lib-calico:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/calico
    HostPathType:
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  sys-fs:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/
    HostPathType:  DirectoryOrCreate
  bpffs:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/bpf
    HostPathType:  Directory
  nodeproc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:
  cni-log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/calico/cni
    HostPathType:
  host-local-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/cni/networks
    HostPathType:
  policysync:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/nodeagent
    HostPathType:  DirectoryOrCreate
  kube-api-access-knm4l:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 :NoSchedule op=Exists
                             :NoExecute op=Exists
                             CriticalAddonsOnly op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  23m                   default-scheduler  Successfully assigned kube-system/calico-node-m4bsl to kw1
  Normal   Pulled     23m                   kubelet            Container image "docker.io/calico/cni:v3.25.0" already present on machine
  Normal   Created    23m                   kubelet            Created container upgrade-ipam
  Normal   Started    23m                   kubelet            Started container upgrade-ipam
  Normal   Pulled     22m (x5 over 23m)     kubelet            Container image "docker.io/calico/cni:v3.25.0" already present on machine
  Normal   Created    22m (x5 over 23m)     kubelet            Created container install-cni
  Normal   Started    22m (x5 over 23m)     kubelet            Started container install-cni
  Warning  BackOff    3m50s (x94 over 23m)  kubelet            Back-off restarting failed container install-cni in pod calico-node-m4bsl_kube-system(80c1a06b-7522-4df6-8c5e-e7e1beb41cd0)
SYN
Please share the output of `kubectl logs -n kube-system -c install-cni -p calico-node-m4bsl` (the logs from that failing container, before it last restarted); it should give us a clue as to why it's not completing successfully.
Veera Nagireddy
Hello @posop, feel free to update the status of the question. Does the answer below help resolve your issue? I am happy to help if you have any further queries.
Score:0

I suspect the root cause is that the kubelet is launching multiple instances of portmap, which prevents the install-cni container from copying that executable into the volume shared with calico-node and completing the installation. It appears that the kubelet and Calico are racing for access to the same executable file, i.e. /home/kubernetes/bin/portmap. Refer to Support hostPort for details.
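
As a quick check (a sketch; it assumes the CNI plugins live in the default /opt/cni/bin on your nodes rather than /home/kubernetes/bin), you can look on kw1 for processes holding the portmap binary:

$ pgrep -af portmap
$ sudo lsof /opt/cni/bin/portmap

If something is still using the binary while install-cni tries to overwrite it, that would match the race described above.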

As you described (the Calico init container fails), the calico-node pod is unable to recover. As a result, user workloads that depend on network policy also fail to start.

Modify the calico-node DaemonSet in kube-system by setting UPDATE_CNI_BINARIES="false" in the DaemonSet YAML, like below:

    env:
      - name: UPDATE_CNI_BINARIES
        value: "false"
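
If you prefer not to edit the DaemonSet by hand, a JSON patch along these lines should append the variable (a sketch; it assumes install-cni is the second init container, index 1, which matches the stock v3.25.0 manifest and the describe output above):

$ kubectl -n kube-system patch daemonset calico-node --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/initContainers/1/env/-","value":{"name":"UPDATE_CNI_BINARIES","value":"false"}}]'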

Also check whether you might be facing a temporary resource overload due to an activity spike. Try increasing periodSeconds or timeoutSeconds on the probes to give the container enough time to respond.
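
For example (only a sketch; the exact values are up to you), the calico-node container's liveness probe from the manifest could be relaxed like this:

    livenessProbe:
      exec:
        command:
          - /bin/calico-node
          - -felix-live
          - -bird-live
      initialDelaySeconds: 10
      periodSeconds: 30     # manifest default is 10
      timeoutSeconds: 30    # manifest default is 10
      failureThreshold: 6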
