We run our applications in a Kubernetes (1.11) cluster installed with kops. It is our DEV/QA cluster, inherited from an employee who is no longer with the company.
Mostly everything works fine, but sometimes after a deployment
some of the pods return connection refused errors. We noticed because Nginx was reporting 502 errors from the backend.
Sometimes a pod starts working again on its own and then fails again later. Restarting the pod resolves the issue.
It then works fine until the next deployment, when the issue happens again.
We compared the syslog with that of another cluster, but everything looks similar.
tcpdump output for the pod's IP:
11:06:47.387766 IP 100.96.13.22.57778 > 100.96.12.137.http-alt: Flags [S], seq 1515889791, win 26883, options [mss 8961,sackOK,TS val 132113284 ecr 0,nop,wscale 9], length 0
11:06:47.387775 IP 100.96.13.22.57778 > 100.96.12.137.http-alt: Flags [S], seq 1515889791, win 26883, options [mss 8961,sackOK,TS val 132113284 ecr 0,nop,wscale 9], length 0
11:06:47.387777 IP 100.96.13.22.57778 > 100.96.12.137.http-alt: Flags [S], seq 1515889791, win 26883, options [mss 8961,sackOK,TS val 132113284 ecr 0,nop,wscale 9], length 0
11:06:47.387781 IP 100.96.12.137.http-alt > 100.96.13.22.57778: Flags [R.], seq 0, ack 1515889792, win 0, length 0
11:06:47.387781 IP 100.96.12.137.http-alt > 100.96.13.22.57778: Flags [R.], seq 0, ack 1, win 0, length 0
11:06:47.387785 IP 100.96.12.137.http-alt > 100.96.13.22.57778: Flags [R.], seq 0, ack 1, win 0, length 0
As seen in the capture, the ingress-nginx pod (100.96.13.22) tries to connect to the webapp pod (100.96.12.137), but the connection is immediately reset.
Our investigation:
We first read up on how Kubernetes networking works (bridge networking, VETH pairs):
https://medium.com/practo-engineering/networking-with-kubernetes-1-3db116ad3c98
https://stackoverflow.com/questions/37860936/find-out-which-network-interface-belongs-to-docker-container
https://www.digitalocean.com/community/tutorials/how-to-inspect-kubernetes-networking#finding-and-entering-pod-network-namespaces
According to these, the pod's interface is connected through a VETH pair to the node's bridge interface.
Any traffic to or from the pod goes through this bridge (cbr0 in our case).
Troubleshooting:
Get the affected pod's container ID:
docker ps
Get the container's process ID:
docker inspect --format '{{ .State.Pid }}' container-ID
Get the pod's network details:
nsenter -t container-pid -n ip addr
output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0@if380852: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
link/ether 0a:58:64:60:0d:27 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 100.96.13.39/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::e4cd:81ff:fe96:2914/64 scope link
valid_lft forever preferred_lft forever
eth0@if380852 is the pod's network interface.
380852 is the interface index of its VETH peer on the node.
0a:58:64:60:0d:27 is the pod's MAC address.
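The peer index can also be pulled out of the ip addr output with a small filter instead of reading it by eye. This is only a sketch: peer_ifindex is a hypothetical helper name, and it assumes the pod's interface is called eth0, as in the output above.

```shell
# Hypothetical helper (not part of any tool): reads `ip addr` output on stdin
# and prints the N from the "eth0@ifN" line, i.e. the index of the peer
# interface on the node. Assumes the pod interface is named eth0.
peer_ifindex() {
  sed -n 's/^[0-9][0-9]*: eth0@if\([0-9][0-9]*\):.*/\1/p'
}

# Intended use on the node (container-pid is the PID from docker inspect):
#   nsenter -t container-pid -n ip addr | peer_ifindex
```

Lines that do not describe eth0 produce no output, so the loopback and address lines are ignored.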
Get the details of the pod's VETH peer on the node:
ip addr | grep 380852
output:
380852: vethd49cda8b@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue master cbr0 state UP group default
Here vethd49cda8b is the name of the pod's host-side VETH interface.
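A plain grep for the index can also match unrelated lines that happen to contain the same digits. A slightly stricter lookup, sketched below, matches only on the leading interface index. The function name host_veth_for_index is made up for illustration, and the filter assumes the "INDEX: NAME@ifN: ..." line format shown in the output above.

```shell
# Hypothetical helper: reads `ip addr` or `ip -o link` output on stdin and
# prints the name of the interface whose index equals $1, without the
# "@ifN" peer suffix.
host_veth_for_index() {
  awk -F': ' -v idx="$1" '$1 == idx { sub(/@.*/, "", $2); print $2 }'
}

# Intended use on the node:
#   ip -o link | host_veth_for_index 380852
```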
Next we checked the bridge.
Get the bridge MAC table entry for the pod's MAC address:
brctl showmacs cbr0 | grep 0a:58:64:60:0d:27
output:
27 0a:58:64:60:0d:27 no 1.44
Here 27 is the bridge port number recorded for this MAC address.
Check which VETH interface is attached to that port:
brctl showstp cbr0 | grep "(27)"
output:
veth488082e8 (27)
Port 27 belongs to a different VETH interface. The expected output would be the pod's VETH interface with the port number from the bridge MAC table:
vethd49cda8b (27)
Get the actual port of the pod's VETH interface:
brctl showstp cbr0 | grep vethd49cda8b
output:
vethd49cda8b (52)
So traffic to the pod is lost because the bridge MAC table maps its MAC address to the wrong port.
The bridge MAC table should show port 52 for the container's MAC address, but it shows 27.
After some time the table shows the correct port on its own, which resolves our connection issue.
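To catch the mismatch without comparing the two brctl outputs by hand, the check could be scripted roughly as follows. This is a sketch under assumptions: the three function names are hypothetical, and the parsing relies on the brctl showmacs and brctl showstp output looking like the samples above (port number in the first column of showmacs, "NAME (PORT)" lines in showstp).

```shell
# Hypothetical helper: stdin is `brctl showmacs <bridge>` output, $1 is a MAC.
# Prints the port number the bridge MAC table has recorded for that MAC.
fdb_port_for_mac() {
  awk -v mac="$1" 'tolower($2) == tolower(mac) { print $1 }'
}

# Hypothetical helper: stdin is `brctl showstp <bridge>` output, $1 is an
# interface name. Prints the port number the interface is attached to.
stp_port_for_iface() {
  awk -v ifc="$1" '$1 == ifc && $2 ~ /^\([0-9]+\)$/ { gsub(/[()]/, "", $2); print $2 }'
}

# Compares the port from the MAC table ($1) with the interface's port ($2).
check_mac_port() {
  if [ "$1" = "$2" ]; then
    echo "ok"
  else
    echo "mismatch: MAC table says port $1, interface is port $2"
  fi
}

# Intended use on the node, with the values from this post:
#   fdb=$(brctl showmacs cbr0 | fdb_port_for_mac 0a:58:64:60:0d:27)
#   real=$(brctl showstp cbr0 | stp_port_for_iface vethd49cda8b)
#   check_mac_port "$fdb" "$real"
```

Run in a loop during a deployment, this would at least show when the entry goes wrong and when it corrects itself.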
Is it possible to find out what updates the bridge MAC table, and how can we avoid incorrect updates?
How can we troubleshoot this further?
Has anyone faced similar issues?
Thanks in advance, we really appreciate any help.