I used kubeadm to deploy a bare-metal cluster with one control plane node and one worker node on the same LAN. After initializing the cluster (kubeadm init on the control plane and kubeadm join on the worker node), I installed Calico via Helm. The calico-node and calico-kube-controllers pods never reach the Ready state. However, they seem to be functioning correctly, and if I manually run the commands that the liveness and readiness probes execute, I get the expected success response. I may have a Calico-specific problem, but my immediate question is: what could cause this behavior with the readiness probes?
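For reference, the setup was roughly the following; the pod CIDR, the join token and hash, and the Helm release name here are placeholders rather than my exact values:

# On the control plane node
sudo kubeadm init --pod-network-cidr=192.168.0.0/16

# On the worker node, using the join command printed by kubeadm init
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>

# Calico installed via the Tigera operator Helm chart
helm repo add projectcalico https://docs.tigera.io/calico/charts
helm install calico projectcalico/tigera-operator --namespace tigera-operator --create-namespace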
The output of kubectl describe pod -n calico-system calico-node-xxxx:
Events:
  Type     Reason     Age               From     Message
  ----     ------     ----              ----     -------
  Warning  Unhealthy  5s (x7 over 43s)  kubelet  Readiness probe errored: rpc error: code = Unknown desc = command error: EOF, stdout: , stderr: , exit code -1
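If it helps with diagnosis, the kubelet log on the worker node should show these probe failures with more surrounding context than the truncated event message; this assumes kubelet runs as a systemd unit:

# On the node hosting the pod
journalctl -u kubelet --since "30 min ago" | grep -i "probe"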
The probe configuration from the calico-node-xxxx pod's YAML:
readinessProbe:
  exec:
    command:
    - /bin/calico-node
    - -felix-ready
  failureThreshold: 3
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
livenessProbe:
  failureThreshold: 3
  httpGet:
    host: localhost
    path: /liveness
    port: 9099
    scheme: HTTP
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 10
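From what I've read, the kubelet does not run exec probes the way kubectl exec does: kubectl exec is streamed through the API server and the kubelet, while the kubelet invokes probe commands directly against the container runtime over the CRI (an ExecSync call, if I understand correctly). As a sketch of how to test closer to that path, this runs the probe command through crictl on the worker node; the container ID is a placeholder, and the endpoint assumes containerd's default socket:

# Find the calico-node container ID via the CRI, not the API server
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps --name calico-node

# Run the probe command through the runtime, closer to the kubelet's path
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock exec <container-id> /bin/calico-node -felix-ready
echo "$?"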
When I try kubectl exec -n calico-system calico-node-xxxx -- /bin/calico-node -felix-ready && echo "$?", I can see that the exit code is 0, a success. Likewise, curl localhost:9099/liveness returns a 200 and the expected response. This is true even if I run the commands within a second of the pods being created, so I doubt it has to do with failureThreshold, timeoutSeconds, etc. My understanding of how the kubelet actually invokes the exec command for readiness probes is still shaky beyond the sketch above, so an explanation of how that path could fail with an EOF while kubectl exec succeeds would point me in the right direction?
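One caveat about my own test above: because of the &&, the echo "$?" only runs when the probe command succeeds, so a failing probe would print nothing rather than a non-zero code. Separating the two avoids that ambiguity, since kubectl exec propagates the container command's exit code:

kubectl exec -n calico-system calico-node-xxxx -- /bin/calico-node -felix-ready
# Prints the exit code whether the probe command succeeded or failed
echo "$?"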
Thanks.