Fresh install of Rocky 8, networking is not working correctly

I am "upgrading" a server from CentOS 7 to Rocky 8. The server is a 1U Supermicro SYS-1029U-TRT that works as part of an HPC cluster and has two Ethernet interfaces and one InfiniBand interface. One of the Ethernet interfaces is for the HPC network; the other is used for the server room network and Internet access. After bringing up a VM copy of the CentOS server, I started a fresh install of Rocky 8. I reused the existing partition table and mdadm RAID configuration and formatted each partition. After the install and the initial setup of the network interfaces, the server is exceptionally slow when handling any network traffic through the "external" interface. This issue was never evident under CentOS, and it has multiple symptoms.

  • DNS queries do not complete. This is most visible when pinging a host on the local network or when downloading a file from the Internet or a local web server via curl or wget.
  • Pings to and from the server, using IP addresses only, either fail outright or start working only after the first few packets (usually about 4) are lost.
  • SSH connections to the server mostly fail. Some attempts reach a password prompt, but the login never completes.

I have attempted a number of troubleshooting steps with no apparent fix yet.

  • I verified that the IP settings, routing table, and resolv.conf were all correct.
  • I disconnected both of the HPC network interfaces. I also tried with the interfaces connected but deactivated, with no configuration, and with them connected and configured.
  • I verified that the Ethernet drivers were correct for the hardware. The system includes two 10Gbps Intel X540-AT2 interfaces, which use the kernel's ixgbe driver. I also downloaded and installed the latest version of Intel's driver.
  • I verified that the switch port is properly configured, including the VLAN and MTU settings.
  • I tested the other two interfaces via ping both to and from the server and both showed no issues.
  • I disconnected the interface from the usual switch and used a new cable to connect it to a nearby switch on the same VLAN.

None of those steps changed anything. I'm out of ideas and am looking for other possible reasons why this is occurring. If any further information is needed, I'll gladly add it.

A previous issue, never reported while CentOS 7 was installed, is that an SSH connection would sometimes "pause" for up to a minute before being usable again. This is similar enough to the current issues to make me think this is a hardware issue.

Here is the output of `ip a` and `ip route` to show how things are configured. Also, when configuring the connections in nmtui, I enabled the "Never use this network for default route", "Ignore automatically obtained routes", and "Ignore automatically obtained DNS parameters" settings on the eno2 and ib0 connections. None of these settings are enabled on the eno1 connection.

[root@hostname ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ac:1f:6b:c9:b3:6e brd ff:ff:ff:ff:ff:ff
    altname enp24s0f0
    inet 10.0.21.150/22 brd 10.0.23.255 scope global noprefixroute eno1
       valid_lft forever preferred_lft forever
3: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ac:1f:6b:c9:b3:6f brd ff:ff:ff:ff:ff:ff
    altname enp24s0f1
    inet 10.33.0.110/22 brd 10.33.3.255 scope global noprefixroute eno2
       valid_lft forever preferred_lft forever
4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP group default qlen 256
    link/infiniband 00:00:01:20:fe:80:00:00:00:00:00:00:0c:42:a1:03:00:c0:af:08 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 10.33.4.110/22 brd 10.33.7.255 scope global noprefixroute ib0
       valid_lft forever preferred_lft forever
[root@hostname ~]# ip route
default via 10.0.20.1 dev eno1 proto static metric 100
10.0.20.0/22 dev eno1 proto kernel scope link src 10.0.21.150 metric 100
10.33.0.0/22 dev eno2 proto kernel scope link src 10.33.0.110 metric 101
10.33.4.0/22 dev ib0 proto kernel scope link src 10.33.4.110 metric 150

Edit1: Added more info, the CentOS issue.
Edit2: Added requested ip command outputs and some nmtui settings.

vidarlo:
What's the output of `ip route` and `ip a`?
Answer:

This turned out to be a genuine networking issue: conflicting MAC addresses.
When I created the VM copy, I duplicated the MAC address of the hardware interface, intending to change it once the VM was verified as working. Then I forgot to change it.

Removing the duplicated MAC address from the VM and allowing it to be randomized solved the issue.
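
For anyone who hits similar symptoms, a duplicated MAC can sometimes be spotted from another machine on the same VLAN by looking for one link-layer address that is claimed by more than one IP in the neighbour table. The following is only a rough sketch, not something I ran during this incident. The function name `find_dup_macs` is made up for illustration, and it assumes the usual `ip neigh show` line layout (`IP dev IFACE lladdr MAC STATE`). It will not catch the case where both hosts also share the same IP, or where one of them is missing from that machine's neighbour table.

```shell
#!/usr/bin/env bash
# find_dup_macs (hypothetical helper): read `ip neigh show` style lines on
# stdin and print every MAC address that appears with more than one IP.
find_dup_macs() {
    awk '
        {
            ip = $1; mac = ""
            # Locate the field that follows the "lladdr" keyword, if present.
            for (i = 2; i < NF; i++) if ($i == "lladdr") mac = tolower($(i + 1))
            # Skip entries without a MAC (e.g. FAILED) and repeated IP/MAC pairs.
            if (mac == "" || seen[mac SUBSEP ip]++) next
            count[mac]++
            ips[mac] = (mac in ips) ? ips[mac] " " ip : ip
        }
        END { for (m in count) if (count[m] > 1) print m ": " ips[m] }
    '
}

# Typical use, run on a different host in the same broadcast domain:
#   ip neigh show | find_dup_macs
```

An empty result does not rule out a conflict. The switch's MAC address table, where the same address moves back and forth between two ports, is likely the more reliable place to look if you have access to it.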
