Why can I only get <1G on my 10G network?

I have 4 CentOS 7 boxes with Supermicro 10000BaseT NICs plugged into a Netgear ProSafe XS712T switch with Cat8 cables. The switch is at all default settings, but shows the NICs at 10G Full. The NICs are configured as follows:

[root@VH11 ~]# ethtool ens1f0
Settings for ens1f0:
        Supported ports: [ TP ]
        Supported link modes:   100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

There are ONLY 10G NICs plugged into the switch.

I can only get transfer speeds of less than 1G on file transfers, as reported by rsync, scp and iftop, when transferring a single 20G file. When I test server > switch > server with iperf, it reports 9.38 Gbits/sec, but I only get about 10% of that on file transfers with rsync or scp.

What am I doing wrong here?

Thanks in advance for your time.

Added info: for the 1G network segment:

[root@VH14 ~]# time scp bigfile [email protected]:/home
[email protected]'s password:
bigfile                                       100% 4494MB 110.1MB/s   00:40

real    0m46.657s
user    0m18.975s
sys     0m4.646s

For the 10G network segment:

[root@VH14 ~]# time scp bigfile [email protected]:/home/bf3
[email protected]'s password:
bigfile                                       100% 4494MB 112.3MB/s   00:40

real    0m45.693s
user    0m34.643s
sys     0m8.440s

The 172. and 10. are on different switches. The 10G switch has no uplink and only communicates with servers. So, although iperf says I get about 10G, the transfer results are essentially the same on both subnets.
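
To put the scp figures in the same units as the iperf result, here is a rough Python sketch that uses only the numbers from the two runs above. The names `RUNS`, `to_gbit_per_s`, `cpu_share` and `summarize` are made up for illustration, and it assumes scp's "MB" means 2**20 bytes. Note that `real` also includes the time spent typing the password, so the CPU share is if anything an underestimate.

```python
# Figures copied from the two `time scp` runs in the question.
RUNS = {
    "1G segment":  {"mb_per_s": 110.1, "real": 46.657, "user": 18.975, "sys": 4.646},
    "10G segment": {"mb_per_s": 112.3, "real": 45.693, "user": 34.643, "sys": 8.440},
}


def to_gbit_per_s(mb_per_s, bytes_per_mb=2**20):
    """Convert a MB/s rate to Gbit/s.

    Assumption: scp's "MB" is 2**20 bytes; pass 10**6 for decimal megabytes.
    """
    return mb_per_s * bytes_per_mb * 8 / 1e9


def cpu_share(run):
    """CPU seconds used on the local (sending) side per wall-clock second."""
    return (run["user"] + run["sys"]) / run["real"]


def summarize(runs=RUNS):
    """Return {segment: (Gbit/s, CPU share)} for each run."""
    return {
        name: (to_gbit_per_s(r["mb_per_s"]), cpu_share(r))
        for name, r in runs.items()
    }


for name, (gbit, share) in summarize().items():
    print(f"{name}: {gbit:.2f} Gbit/s, CPU time equal to {share:.0%} of one core")
```

With these inputs the 10G run works out to about 0.94 Gbit/s while using CPU time equal to roughly 94% of one core on the sending side, against roughly 51% on the 1G run. That would fit a single-core CPU limit rather than a network limit, but it is only an inference from the `time` output.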

I don't think disk i/o is my problem:

[root@VH14 ~]# hdparm -t /dev/md126

/dev/md126:
 Timing buffered disk reads: 4150 MB in  3.00 seconds = 1382.80 MB/sec
[root@VH14 ~]# hdparm -T /dev/md126

/dev/md126:
 Timing cached reads:   19798 MB in  1.99 seconds = 9945.27 MB/sec

Further info: MTU on the 10G NICs is 9124. CPUs are Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

Michael Hampton:
Are your old boxes able to keep up with the encryption overhead?
jerryrig:
The servers have 56 cores and 256G RAM. Htop load average of .66 with 4 transfers running. I connected NIC to NIC, eliminating the switch, and got same results on file transfers.
Michael Hampton:
So not old boxes, just old OS. Is it up to date? Have you tried an HTTP transfer? Are you aware of TCP slow start? How does the transfer speed vary throughout the download?
jerryrig:
Servers are fully updated. Don't have http installed though. Transfer speed is constant after the first 10 seconds or so. I updated the post with more info. BTW: CentOS 7 is still alive. I'll be moving us to Oracle Linux 8 (RHEL8) before long.
Nikita Kipriyanov:
Did you enable jumbo frames? What is the interrupt load on the system (at least, what is in the "si" field of top when you run the tests)? Also, note that 56 cores say nothing about the performance of each core; this CPU might be a huge collection of slow workers, none of which can run fast enough alone, though together they can beat a giant. ssh and rsync are single-threaded afaik, so they don't benefit from multiple cores. E.g. try not 4 but 56 simultaneous transfers, to keep the CPU really busy.
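
To look at the "si" figure per core rather than as a system-wide average, a minimal Python sketch along these lines could sample `/proc/stat` twice and compute the softirq share of each CPU. The helper names (`parse_cpu_line`, `softirq_share`, `read_cpu_lines`, `sample`) are hypothetical, and it assumes the usual Linux field order on the `cpu` lines (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time

# Assumed field order of the "cpu"/"cpuN" lines in /proc/stat on Linux.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn one 'cpuN ...' line into a dict of tick counters."""
    parts = line.split()
    # parts[0] is the label ("cpu" or "cpuN"); the counters follow it.
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def softirq_share(before, after):
    """Fraction of elapsed CPU ticks spent in softirq between two samples."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    return delta["softirq"] / total if total else 0.0


def read_cpu_lines(path="/proc/stat"):
    """Read all 'cpu' and 'cpuN' lines, keyed by their label."""
    with open(path) as f:
        return {
            line.split()[0]: parse_cpu_line(line)
            for line in f
            if line.startswith("cpu")
        }


def sample(interval=1.0):
    """Softirq share per CPU label over `interval` seconds."""
    first = read_cpu_lines()
    time.sleep(interval)
    second = read_cpu_lines()
    return {
        name: softirq_share(first[name], second[name])
        for name in first
        if name in second
    }
```

Calling `sample()` while a transfer is running and sorting the result by value would show whether one core is carrying most of the softirq work even though the overall load average looks low.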
Nikita Kipriyanov:
@vidarlo he mentioned that iperf says 10G
vidarlo:
@NikitaKipriyanov Indeed! I missed it. Sorry!
jerryrig:
MTU on the 10G NICs is 9124. CPUs are Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
jerryrig:
Interesting: after creating an NFS mount between 2 servers with directly connected 10G NICs, iftop shows 7.6Gb when copying a file to the NFS mount with cp, while rsync shows 3.44Gb. So at least I am getting somewhere with NFS. I believe there must be some tunable settings in the sysctl hierarchy that will fix this. I haven't found them yet, though.
Nikita Kipriyanov:
And again, what does /proc/interrupts say during the transfer?
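
One way to answer that is to snapshot `/proc/interrupts` before and during a transfer and diff the per-CPU counters for the NIC. The Python sketch below is only an illustration: `irq_counts` and `irq_delta` are hypothetical helpers, and it assumes the NIC's interrupt rows contain the interface name `ens1f0` from the question (the exact queue naming depends on the driver).

```python
def irq_counts(text, name):
    """Sum per-CPU interrupt counts for rows of /proc/interrupts mentioning `name`."""
    lines = text.splitlines()
    # The header row lists the CPUs: "CPU0 CPU1 ...".
    ncpu = len(lines[0].split())
    totals = [0] * ncpu
    for line in lines[1:]:
        if name not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ label ("45:"); the next ncpu fields are counters.
        counts = fields[1:1 + ncpu]
        if len(counts) == ncpu and all(c.isdigit() for c in counts):
            for i, c in enumerate(counts):
                totals[i] += int(c)
    return totals


def irq_delta(before, after, name):
    """Per-CPU increase in interrupts for `name` between two snapshots."""
    return [
        b - a
        for a, b in zip(irq_counts(before, name), irq_counts(after, name))
    ]


def read_interrupts(path="/proc/interrupts"):
    """Return the current contents of /proc/interrupts as text."""
    with open(path) as f:
        return f.read()
```

Taking `before = read_interrupts()`, waiting a few seconds during the transfer, then calling `irq_delta(before, read_interrupts(), "ens1f0")` gives one number per CPU. If nearly all of the increase lands on a single CPU, the NIC's interrupts for that flow are not being spread across cores.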