I installed a Kubernetes 1.23.3 cluster on four Raspberry Pis running Raspberry Pi OS 11 (Bullseye) arm64, mostly by following this guide.
The gist of it is that the control plane was created with this command:
kubeadm init --token={some_token} --kubernetes-version=v1.23.3 --pod-network-cidr=10.1.0.0/16 --service-cidr=10.11.0.0/16 --control-plane-endpoint=10.0.4.16 --node-name=rpi-1-1
I then created my own kube-verify namespace, put a deployment of the echo-server into it, and created a service for it.
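For reference, the resources were created roughly like this (the echo-server image name is a placeholder for whatever image the guide uses; the exact flags may have differed slightly):
$ kubectl create namespace kube-verify
$ kubectl create deployment echo-server --image=jmalloc/echo-server -n kube-verify
$ kubectl expose deployment echo-server --port=8080 -n kube-verify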
However, I cannot reach the service's cluster IP from any of the nodes: requests simply time out, while requests directly to the pod's IP work fine. Why is that?
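To illustrate, this is the kind of test I run from one of the nodes (the pod IP below is just a placeholder; the service IP and port are the ones shown further down):
$ curl http://10.11.213.180:8080   # service cluster IP -- times out
$ curl http://10.1.0.x:8080        # pod IP (placeholder) -- answers fine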
I suspect my kube-proxy is not working as it should. Below is what I have investigated so far.
$ kubectl get services -n kube-verify -o=wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
echo-server ClusterIP 10.11.213.180 <none> 8080/TCP 24h app=echo-server
$ kubectl get pods -n kube-system -o=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-64897985d-47gpr 1/1 Running 1 (69m ago) 41h 10.1.0.5 rpi-1-1 <none> <none>
coredns-64897985d-nf55w 1/1 Running 1 (69m ago) 41h 10.1.0.4 rpi-1-1 <none> <none>
etcd-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-apiserver-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-controller-manager-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-flannel-ds-5467m 1/1 Running 1 (69m ago) 28h 10.0.4.17 rpi-1-2 <none> <none>
kube-flannel-ds-7wpvz 1/1 Running 1 (69m ago) 28h 10.0.4.18 rpi-1-3 <none> <none>
kube-flannel-ds-9chxk 1/1 Running 1 (69m ago) 28h 10.0.4.19 rpi-1-4 <none> <none>
kube-flannel-ds-x5rvx 1/1 Running 1 (69m ago) 29h 10.0.4.16 rpi-1-1 <none> <none>
kube-proxy-8bbjn 1/1 Running 1 (69m ago) 28h 10.0.4.17 rpi-1-2 <none> <none>
kube-proxy-dw45d 1/1 Running 1 (69m ago) 28h 10.0.4.18 rpi-1-3 <none> <none>
kube-proxy-gkkxq 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-proxy-ntl5w 1/1 Running 1 (69m ago) 28h 10.0.4.19 rpi-1-4 <none> <none>
kube-scheduler-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
$ kubectl logs kube-proxy-gkkxq -n kube-system
I0220 13:52:02.281289 1 node.go:163] Successfully retrieved node IP: 10.0.4.16
I0220 13:52:02.281535 1 server_others.go:138] "Detected node IP" address="10.0.4.16"
I0220 13:52:02.281610 1 server_others.go:561] "Unknown proxy mode, assuming iptables proxy" proxyMode=""
I0220 13:52:02.604880 1 server_others.go:206] "Using iptables Proxier"
I0220 13:52:02.604966 1 server_others.go:213] "kube-proxy running in dual-stack mode" ipFamily=IPv4
I0220 13:52:02.605026 1 server_others.go:214] "Creating dualStackProxier for iptables"
I0220 13:52:02.605151 1 server_others.go:491] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6"
I0220 13:52:02.606905 1 server.go:656] "Version info" version="v1.23.3"
W0220 13:52:02.614777 1 sysinfo.go:203] Nodes topology is not available, providing CPU topology
I0220 13:52:02.619535 1 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=131072
I0220 13:52:02.620869 1 conntrack.go:100] "Set sysctl" entry="net/netfilter/nf_conntrack_tcp_timeout_close_wait" value=3600
I0220 13:52:02.660947 1 config.go:317] "Starting service config controller"
I0220 13:52:02.661015 1 shared_informer.go:240] Waiting for caches to sync for service config
I0220 13:52:02.662669 1 config.go:226] "Starting endpoint slice config controller"
I0220 13:52:02.662726 1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0220 13:52:02.762734 1 shared_informer.go:247] Caches are synced for service config
I0220 13:52:02.762834 1 shared_informer.go:247] Caches are synced for endpoint slice config
What I notice here is the warning Nodes topology is not available, so I dug into the kube-proxy config some more, but nothing stands out to me.
If there is indeed an issue with the node topology in my cluster, please point me towards some resources on how to troubleshoot it, as I could not find anything meaningful based on this error message.
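In case it is relevant: as far as I understand, this warning comes from cAdvisor reading the NUMA topology from sysfs, so this is the kind of thing I would check on the nodes (I am not sure this is actually the right place to look):
$ ls /sys/devices/system/node/
$ ls /sys/devices/system/cpu/cpu0/topology/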
$ kubectl describe configmap kube-proxy -n kube-system
Name: kube-proxy
Namespace: kube-system
Labels: app=kube-proxy
Annotations: kubeadm.kubernetes.io/component-config.hash: sha256:edce433d45f2ed3a58ee400690184ad033594e8275fdbf52e9c8c852caa7124d
Data
====
config.conf:
----
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
bindAddressHardFail: false
clientConnection:
  acceptContentTypes: ""
  burst: 0
  contentType: ""
  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
  qps: 0
clusterCIDR: 10.1.0.0/16
configSyncPeriod: 0s
conntrack:
  maxPerCore: null
  min: null
  tcpCloseWaitTimeout: null
  tcpEstablishedTimeout: null
detectLocalMode: ""
enableProfiling: false
healthzBindAddress: ""
hostnameOverride: ""
iptables:
  masqueradeAll: false
  masqueradeBit: null
  minSyncPeriod: 0s
  syncPeriod: 0s
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: ""
  strictARP: false
  syncPeriod: 0s
  tcpFinTimeout: 0s
  tcpTimeout: 0s
  udpTimeout: 0s
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: ""
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
showHiddenMetricsForVersion: ""
udpIdleTimeout: 0s
winkernel:
  enableDSR: false
  networkName: ""
  sourceVip: ""
kubeconfig.conf:
----
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server: https://10.0.4.16:6443
  name: default
contexts:
- context:
    cluster: default
    namespace: default
    user: default
  name: default
current-context: default
users:
- name: default
  user:
    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
BinaryData
====
Events: <none>
$ kubectl -n kube-system exec kube-proxy-gkkxq cat /var/lib/kube-proxy/kubeconfig.conf
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server: https://10.0.4.16:6443
  name: default
contexts:
- context:
    cluster: default
    namespace: default
    user: default
  name: default
current-context: default
users:
- name: default
  user:
    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
The mode defaults to iptables, as the logs above confirm.
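If it helps with the diagnosis, I can also dump the NAT rules that kube-proxy programs; I would expect the service's cluster IP to appear in the KUBE-SERVICES chain, e.g. with something like this (output omitted here):
$ sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.11.213.180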
I also have IP forwarding enabled on all nodes.
$ sudo sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1