Rancher Server Setup
- Rancher version: 2.6.3
- Installation option (Docker install/Helm Chart): Helm Chart, Kubernetes v1.21.6 and RKE1
Information about the Cluster
- Kubernetes version: v1.20.15-rancher1-2
- Cluster Type (Local/Downstream): Downstream
- If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): RKE Custom (3 nodes on-prem + 1 node on Azure)
User Information
- What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin
Describe the bug
To illustrate the inter-pod communication problem, consider these three dcgm-exporter pods, which collect and expose GPU metrics:
URL1- http://10.42.0.79:9400/metrics -> Pod 10.42.0.79 running on node-1-on-prem
URL2- http://10.42.2.77:9400/metrics -> Pod 10.42.2.77 running on node-2-on-prem
URL3- http://10.42.4.54:9400/metrics -> Pod 10.42.4.54 running on node-3-azure
On the node-1-on-prem Linux shell:
curl URL1 and URL2 succeed; curl URL3 fails
On the node-2-on-prem Linux shell:
curl URL1 and URL2 succeed; curl URL3 fails
On the node-3-azure Linux shell:
curl URL1 and URL2 fail; curl URL3 succeeds
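Since Canal is configured with the flannel VXLAN backend (see the CNI configuration below), this pod-to-pod traffic is encapsulated in UDP between nodes. A minimal capture sketch for narrowing down where the cross-site packets stop, assuming flannel's default VXLAN port 8472 on Linux; `<azure-node-ip>` is a placeholder for node-3-azure's address:

```sh
# On node-1-on-prem: watch for flannel VXLAN traffic (UDP 8472 by default on
# Linux) to and from the Azure node while a failing curl runs.
# <azure-node-ip> is a placeholder for node-3-azure's 10.208.2.0/24 address.
sudo tcpdump -ni any udp port 8472 and host <azure-node-ip>

# In a second shell, retry the failing request (URL3, the Azure-hosted pod);
# the capture shows whether encapsulated packets leave this node and whether
# any replies come back across the VPN.
curl -sv --max-time 5 http://10.42.4.54:9400/metrics
```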
Reproduce
- The on-prem subnet is 10.133.100.0/24 and the Azure subnet is 10.208.2.0/24
- The Azure virtual network and the local network are connected by a site-to-site VPN
- Node-to-node connections succeed and there are no port restrictions in Azure or on-prem
- IPv4 port forwarding is enabled on all nodes (see the verification sketch after this list)
- Downstream cluster container network interface (CNI) configuration:
  network:
    mtu: 0
    options:
      flannel_backend_type: vxlan
    plugin: canal
- Adding the Azure node to the cluster is flawless and all pods come up
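For reference, a minimal sketch for double-checking the forwarding and MTU preconditions from a node shell, assuming Linux nodes; `<azure-node-ip>` is again a placeholder for node-3-azure's address in the 10.208.2.0/24 subnet:

```sh
# Confirm IPv4 forwarding is actually on (expected output: net.ipv4.ip_forward = 1)
sysctl net.ipv4.ip_forward

# Probe the largest payload that crosses the VPN unfragmented. VXLAN adds
# roughly 50 bytes of overhead, so a VPN path MTU below what the overlay
# assumes can drop encapsulated pod traffic even while plain node-to-node
# traffic (like this ping) keeps working.
ping -M do -s 1472 -c 3 <azure-node-ip>   # 1472 payload + 28 header bytes = 1500
```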
Result
- curl requests between pods across the VPN fail in both directions: the on-prem nodes cannot reach the Azure-hosted pod (URL3), and the Azure node cannot reach the on-prem pods (URL1, URL2)
Expected Result
- Successful inter-pod communication and display of GPU metrics
How can we get these pods to communicate properly?
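Once a fix is in place, a smoke test along these lines could confirm cross-site pod connectivity from inside the cluster network (the pod name and the hostname label value are illustrative assumptions):

```sh
# Run a throwaway pod pinned to the Azure node, assuming its hostname label
# is "node-3-azure", then fetch an on-prem dcgm-exporter pod's metrics from it.
kubectl run nettest --image=busybox --restart=Never \
  --overrides='{"spec":{"nodeSelector":{"kubernetes.io/hostname":"node-3-azure"}}}' \
  -- sleep 3600
kubectl wait --for=condition=Ready pod/nettest --timeout=60s
kubectl exec nettest -- wget -qO- -T 5 http://10.42.0.79:9400/metrics
kubectl delete pod nettest
```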
Thanks in advance for your support.