Server

Why can I only get <1G on my 10G network?

jerryrig

1/2/23, 10:58 PM

I have 4 CentOS 7 boxes with Supermicro 10GBASE-T NICs plugged into a Netgear ProSafe XS712T switch with Cat8 cables. The switch is at default settings and shows the NICs at 10G Full. The NICs are configured as follows:

[root@VH11 ~]# ethtool ens1f0
Settings for ens1f0:
        Supported ports: [ TP ]
        Supported link modes:   100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

There are ONLY 10G NICs plugged into the switch.

File transfers never exceed 1G, as reported by rsync, scp and iftop when transferring a single 20G file. When I test server > switch > server with iperf, it reports 9.38 Gbits/sec, but I only get about 10% of that on file transfers with rsync or scp.

What am I doing wrong here?

Thanks in advance for your time.

Added info, for the 1G network segment:

[root@VH14 ~]# time scp bigfile root@172.16.75.9:/home
root@172.16.75.9's password:
bigfile                                       100% 4494MB 110.1MB/s   00:40

real    0m46.657s
user    0m18.975s
sys     0m4.646s

For the 10G network segment:

[root@VH14 ~]# time scp bigfile root@10.0.75.9:/home/bf3
root@10.0.75.9's password:
bigfile                                       100% 4494MB 112.3MB/s   00:40

real    0m45.693s
user    0m34.643s
sys     0m8.440s

The 172. and 10. networks are on different switches. The 10G switch has no uplink and only talks to the servers. So, although iperf says I get about 10G, the scp results are essentially the same on both subnets.
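To compare the scp figures with iperf's Gbits/sec, the rates need converting. A minimal sketch, assuming scp's "MB" means 2^20 bytes; the function name `mbps_to_gbit` is made up for illustration:

```shell
# Convert a rate in MB/s (as printed by scp) to Gbit/s.
# Assumption: 1 MB = 2^20 bytes, 1 Gbit = 10^9 bits.
mbps_to_gbit() {
    awk -v r="$1" 'BEGIN { printf "%.2f\n", r * 1048576 * 8 / 1e9 }'
}

mbps_to_gbit 110.1   # scp rate on the 1G segment
mbps_to_gbit 112.3   # scp rate on the 10G segment
```

Under that assumption the two runs come out at 0.92 and 0.94 Gbit/s, i.e. both segments deliver practically the same throughput, roughly a tenth of the 9.38 Gbits/sec iperf reports.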

I don't think disk i/o is my problem:

[root@VH14 ~]# hdparm -t /dev/md126

/dev/md126:
 Timing buffered disk reads: 4150 MB in  3.00 seconds = 1382.80 MB/sec
[root@VH14 ~]# hdparm -T /dev/md126

/dev/md126:
 Timing cached reads:   19798 MB in  1.99 seconds = 9945.27 MB/sec

Further info: MTU on the 10G NICs is 9124. CPUs are Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

networking

10gbethernet

netgear

Michael Hampton

1/2/23, 11:12 PM

Are your old boxes able to keep up with the encryption overhead?

jerryrig

1/3/23, 12:49 AM

The servers have 56 cores and 256G RAM. htop shows a load average of .66 with 4 transfers running. I connected NIC to NIC, eliminating the switch, and got the same results on file transfers.

Michael Hampton

1/3/23, 1:05 AM

So not old boxes, just old OS. Is it up to date? Have you tried an HTTP transfer? Are you aware of TCP slow start? How does the transfer speed vary throughout the download?

jerryrig

1/3/23, 1:54 AM

Servers are fully updated. I don't have an HTTP server installed, though. Transfer speed is constant after the first 10 seconds or so. I updated the post with more info. BTW: CentOS 7 is still alive. I'll be moving us to Oracle Linux 8 (RHEL 8) before long.

Nikita Kipriyanov

1/3/23, 5:39 AM

Did you enable jumbo frames? What is the interrupt load on the system (at least, what's in the "si" field of top when you run the tests)? Also, note that 56 cores say nothing about the performance of each core; this CPU might be a huge collection of slow workers, none of which can run fast enough alone, though together they can beat a giant. ssh and rsync are single-threaded AFAIK, so they don't benefit from multiple cores. E.g. try not 4 but 56 simultaneous transfers, to keep the CPU really busy.
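One rough way to check the single-core theory is to compare CPU time with wall-clock time in the `time scp` output posted in the question. A minimal sketch; `cpu_share` is a made-up helper name, and the result is only a hint because `real` includes the password prompt and `user`/`sys` add up scp plus its ssh child process:

```shell
# Share of wall-clock time spent on CPU: (user + sys) / real.
# Arguments: real, user, sys (all in seconds, as printed by `time`).
cpu_share() {
    awk -v r="$1" -v u="$2" -v s="$3" \
        'BEGIN { printf "%.0f%%\n", (u + s) / r * 100 }'
}

cpu_share 46.657 18.975 4.646   # 1G segment
cpu_share 45.693 34.643 8.440   # 10G segment
```

With the posted numbers this gives 51% for the 1G run and 94% for the 10G run. A value close to 100% would be consistent with the transfer being limited by one busy core rather than by the network.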

Nikita Kipriyanov

1/3/23, 6:31 AM

@vidarlo he mentioned that iperf says 10G

vidarlo

1/3/23, 6:36 AM

@NikitaKipriyanov Indeed! I missed it. Sorry!

jerryrig

1/3/23, 1:50 PM

MTU on the 10G NICs is 9124. CPUs are Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

jerryrig

1/3/23, 2:53 PM

Interesting: after creating an NFS mount between 2 servers with directly connected 10G NICs, iftop shows 7.6Gb when I cp a file to the NFS mount, while rsync shows 3.44Gb. So at least I am getting somewhere with NFS. I believe there must be some tunable settings in the sysctl hierarchy that will fix this. I haven't found them yet, though.

Nikita Kipriyanov

1/6/23, 11:37 AM

And again, what does /proc/interrupts say during the transfer?
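A minimal sketch of one way to pull that out, assuming the usual /proc/interrupts layout (a header row with one column per CPU, then one row per IRQ ending in the device name). `irq_totals` is a made-up helper; `ens1f0` is the interface name from the question:

```shell
# Print "<irq> <name> <total across CPUs>" for every IRQ row that
# mentions the given device. Reads /proc/interrupts-style text on stdin.
irq_totals() {
    awk -v dev="$1" '
        NR == 1 { ncpu = NF; next }      # header row: one column per CPU
        index($0, dev) {
            total = 0
            for (i = 2; i <= ncpu + 1; i++) total += $i
            print $1, $NF, total
        }'
}

# Usage during a transfer (take two snapshots and compare them):
#   irq_totals ens1f0 < /proc/interrupts > /tmp/irq.before
#   sleep 10
#   irq_totals ens1f0 < /proc/interrupts > /tmp/irq.after
#   diff /tmp/irq.before /tmp/irq.after
```

If only one of the NIC's queues grows between snapshots, or all of its interrupts land on a single CPU column in the raw file, that would point at interrupt handling being concentrated on one core.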