We are using ConnectX-5 100GbE Ethernet cards in our servers, which are connected to each other through a Mellanox switch, and we are running the Weave Net CNI plugin on our Kubernetes cluster. When we test host-to-host with the iperf tool using the following commands, we get nearly the full 100 Gbps:
# server host
host1 $ iperf -s -P8
# client host
host2 $ iperf -c <host_ip> -P8
Result: 98.8 Gbps transfer speed
We also get the same result when we run the same tool and command in two Docker containers, one on each host:
# server host
host1 $ docker run -it -p 5001:5001 ubuntu:latest-with-iperf iperf -s -P8
# client host
host2 $ docker run -it -p 5001:5001 ubuntu:latest-with-iperf iperf -c <host_ip> -P8
Result: 98.8 Gbps transfer speed
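For completeness, ubuntu:latest-with-iperf is a custom image we built ourselves; it is roughly equivalent to the following sketch (the actual Dockerfile may differ):
# build a minimal iperf image on top of stock Ubuntu (sketch)
docker build -t ubuntu:latest-with-iperf - <<'EOF'
FROM ubuntu:latest
RUN apt-get update && apt-get install -y iperf && rm -rf /var/lib/apt/lists/*
EOF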
But when we create two deployments on the same hosts (host1, host2) with the same image and run the same test through the Service IP (we created a Kubernetes Service using the YAML below), which redirects traffic to the server pod, we get only 2 Gbps. We also ran the same test using the pod's cluster IP and the Service's cluster domain, but the results are the same.
kubectl create deployment iperf-server --image=ubuntu:latest-with-iperf # we then add node affinity (host1) and a containerPort to the generated YAML; see the sketch after the Service manifest
kubectl create deployment iperf-client --image=ubuntu:latest-with-iperf # we then add node affinity (host2) and a containerPort to the generated YAML
kind: Service
apiVersion: v1
metadata:
  name: iperf-server
  namespace: default
spec:
  ports:
    - name: iperf
      protocol: TCP
      port: 5001
      targetPort: 5001
  selector:
    name: iperf-server
  clusterIP: 10.104.10.230
  type: ClusterIP
  sessionAffinity: None
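For reference, the server deployment after our edits looks roughly like this, expressed as a single apply (a sketch: it assumes the node's kubernetes.io/hostname label equals "host1", that the pod labels mirror the Service selector above, and that the server container runs iperf directly; the client deployment is the same with host2 and no iperf command):
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iperf-server
spec:
  replicas: 1
  selector:
    matchLabels:
      name: iperf-server
  template:
    metadata:
      labels:
        name: iperf-server              # must match the Service selector above
    spec:
      affinity:
        nodeAffinity:                   # pin the server pod to host1
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values: ["host1"]
      containers:
        - name: iperf
          image: ubuntu:latest-with-iperf
          command: ["iperf", "-s", "-P8"]   # run the iperf server
          ports:
            - containerPort: 5001
EOF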
TL;DR
The scenarios we tested (the pod-level tests are sketched after the list):
- host1 (Ubuntu 20.04, Mellanox driver installed) <--------> host2 (Ubuntu 20.04, Mellanox driver installed) = 98.8 Gbps
- container1-on-host1 <--------> container2-on-host2 = 98.8 Gbps
- Pod1-on-host1 <--------> Pod2-on-host2 (using the pod cluster IP) = 2 Gbps
- Pod1-on-host1 <--------> Pod2-on-host2 (using the Service cluster IP) = 2 Gbps
- Pod1-on-host1 <--------> Pod2-on-host2 (using the Service cluster domain) = 2 Gbps
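The pod-level tests were run roughly as follows (a sketch: it assumes the client pod is kept alive so we can exec into it, the name=iperf-server label from the Service selector, and the default cluster.local DNS suffix):
# pod cluster IP: look up the server pod's IP, then run the client against it
$ kubectl get pod -l name=iperf-server -o wide     # the pod IP is in the IP column
$ kubectl exec -it deploy/iperf-client -- iperf -c <pod_ip> -P8
# Service cluster IP (from the manifest above)
$ kubectl exec -it deploy/iperf-client -- iperf -c 10.104.10.230 -P8
# Service cluster domain
$ kubectl exec -it deploy/iperf-client -- iperf -c iperf-server.default.svc.cluster.local -P8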
We need to get close to the full 100 Gbps on pod-to-pod communication. What could be causing this bottleneck?
Update 1:
- When I check htop inside the pods during the iperf test, all 112 CPU cores are visible and none of them comes anywhere near saturation.
- When I add the hostNetwork: true key to the deployments, the pods can reach up to 100 Gbps of bandwidth.
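With hostNetwork: true the pods share the node's network namespace, so traffic bypasses the Weave Net overlay entirely. The change is roughly this (a sketch, applied to both deployments as a strategic merge patch):
$ kubectl patch deployment iperf-server --patch '
spec:
  template:
    spec:
      hostNetwork: true    # pod uses the node network stack instead of the overlay
'
$ kubectl patch deployment iperf-client --patch '
spec:
  template:
    spec:
      hostNetwork: true
'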