Score:0

K8s NodeLocal DNS pod times out connecting to CoreDNS after upgrading the base OS to Ubuntu 20.04 (ConnectX-4 card)


Team,

I have a Mellanox ConnectX-4 NIC on a k8s worker node that hosts a nodeLocalDns pod. The nodeLocalDns pod times out when it tries to connect to the coreDns service on the k8s cluster.

The same setup works on Ubuntu 18.04.

The failing versions are:

k8s v1.13.5 Baremetal
Ubuntu 20.04.4 LTS
kernel 5.4.0-100-generic
docker://19.3.13

The versions below work well:

k8s v1.13.5 Baremetal
Ubuntu 18.04.2 LTS
kernel 4.15.0-45-generic
docker://18.9.2

Any hints on how I can debug this? I am finding no clues in the logs.

The error below is from the nodeLocalDns pod logs.

A: dial tcp 100.60.3.4:53: i/o timeout

The address above is the coreDns service. It is pingable from the nodeLocalDns pod, but connections to the DNS port (53) time out.
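
One way to reproduce the symptom outside the pod logs is to try a plain TCP connect to port 53 from the affected node or pod. The sketch below is only an illustration: `tcp_port_open` is a hypothetical helper name, and it assumes `bash` and coreutils `timeout` are available wherever it runs.

```shell
# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within the timeout (default 3 seconds), non-zero otherwise.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
tcp_port_open() {
    host="$1"
    port="$2"
    secs="${3:-3}"
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example, using the coreDns service address from the error above:
#   if tcp_port_open 100.60.3.4 53; then echo "TCP 53 reachable"; else echo "TCP 53 failed"; fi
```

If ping succeeds while this check times out, the problem is specific to TCP traffic rather than basic reachability.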

kkopczak
Which version of Kubernetes did you use and how did you set up the cluster? Did you use bare metal installation or some cloud provider?
AhmFM
Added to the description: `Baremetal k8s 1.13.5`
Score:0

It was an interoperability issue that we fixed by disabling TX checksum offload on the node's NIC. After running the command below, pod networking started working. This happened only with the Mellanox ConnectX-4; it was not observed with ConnectX-5.

ethtool -K ens1 rx on tx off
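
To confirm that the change took effect, the offload state can be read back with `ethtool -k` (lowercase `-k` shows features, uppercase `-K` sets them). The sketch below is a minimal illustration; `tx_checksum_state` is a hypothetical helper name, and `ens1` is simply the interface name used in the command above.

```shell
# Hypothetical helper (not part of the original fix): read `ethtool -k`
# output on stdin and print the state of the top-level tx-checksumming feature.
tx_checksum_state() {
    awk -F': ' '/^tx-checksumming:/ { print $2; exit }'
}

# Example on the node, assuming the interface is ens1:
#   ethtool -k ens1 | tx_checksum_state    # state before the change
#   ethtool -K ens1 rx on tx off           # the command from the answer
#   ethtool -k ens1 | tx_checksum_state    # state after the change
```

Settings applied with `ethtool -K` typically do not survive a reboot unless the network configuration reapplies them.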