
Can't ping PetaLinux machine if both NICs on same subnet and unplug eth0


I have a PetaLinux machine (embedded Linux on a Xilinx Zynq device, Debian fork, kernel 4.19). If the two NICs are on different subnets, I can disconnect one NIC and the other continues working. But if they are on the same subnet, disconnecting eth0 renders both unreachable. (Disconnecting eth1 is fine.) Also, if the addresses are acquired by DHCP, unplugging eth0 is fine as well.

Now, I understand that by default Linux is lenient about which NIC it responds through for any given message (the weak host model), and this is something I dealt with a few years ago, in another article.

Unfortunately, that solution doesn't seem to work on the version of Linux we have here. Is there something new we have to do with more recent kernels?

Thanks in advance.

Update

"ip route" while both cables are plugged in:

default via 192.168.1.1 dev eth0
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.195
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.196

"ip route" when eth0 is unplugged (failure case):

default via 192.168.1.1 dev eth0 linkdown
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.195 linkdown
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.196

"ip route" when eth1 is unplugged (works fine):

default via 192.168.1.1 dev eth0
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.195
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.196 linkdown

To reiterate, we are unable to ping 192.168.1.196 when the NIC for 192.168.1.195 is unplugged.
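
One way to confirm which interface the kernel would pick for replies sourced from 192.168.1.196 is to ask it directly with `ip route get`. The following is a minimal sketch, not part of the original setup: `egress_dev` is a hypothetical helper name, and 192.168.1.50 is a placeholder for whichever host is sending the pings.

```shell
#!/bin/sh
# Print the egress interface named in `ip route get` output read from stdin.
# egress_dev is a hypothetical helper, introduced only for this illustration.
egress_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage on the target (192.168.1.50 is a placeholder peer address):
#   ip route get 192.168.1.50 from 192.168.1.196 | egress_dev
```

If this reports eth0 while eth0 is unplugged, that would suggest replies from 192.168.1.196 are being routed into the dead link, which is consistent with the first matching 192.168.1.0/24 route in the failure-case table above.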

What does the output of `ip route` look like before and after you unplug the eth0 cable? (Assuming we only care about IPv4 for now.)
@hardillb Thanks. I went ahead and added that info to the question.
A.B
I'm sure there is more than one method to "fix" this (policy routing, bonding ...), but it would help to know why you willingly put two links using the same IP LAN in the same Ethernet LAN (aka broadcast domain) despite knowing there will be trouble. Is that for redundancy, for bandwidth, for something else? Are there different services on the two IP addresses? As you link references, are your UDP applications able to use IP_PKTINFO properly, or to bind to multiple different addresses instead of 0.0.0.0 (which is another method to handle the reply source IP problem for UDP)? etc.
@A.B Thanks for the response. We ourselves only put both NICs on the same subnet for testing. Why our customers want to do it, we don't really know, but they insist on it. We're working on adding a feature that allows switching to bonding mode, but knowing their behavior patterns, they're going to insist that bonding and non-bonding both work. We didn't know there would be trouble, and this doesn't seem like such an edge case that it should break so easily. All the services are available from both NICs, and the NICs don't serve to increase bandwidth.
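
For reference, the policy routing mentioned in the comments could look roughly like the sketch below. It is a dry run that only prints commands for review. The addresses and gateway are taken from the `ip route` output in the question; the table IDs 100 and 101, the `policy_cmds` helper name, and the choice of ARP sysctls are assumptions, and nothing in this thread confirms that it resolves the failure on kernel 4.19.

```shell
#!/bin/sh
# Dry-run sketch of source-based policy routing: prints commands, runs nothing.
# Table IDs 100/101 are arbitrary assumptions; addresses come from the question.
GW=192.168.1.1
NET=192.168.1.0/24

# policy_cmds DEV ADDR TABLE: emit a per-interface table and a source rule.
policy_cmds() {
    dev=$1; addr=$2; table=$3
    echo "ip route replace $NET dev $dev src $addr table $table"
    echo "ip route replace default via $GW dev $dev table $table"
    echo "ip rule add from $addr lookup $table"
}

policy_cmds eth0 192.168.1.195 100
policy_cmds eth1 192.168.1.196 101

# Assumed ARP tuning so each NIC answers only for its own address.
echo "sysctl -w net.ipv4.conf.all.arp_ignore=1"
echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
```

The idea is that traffic sourced from 192.168.1.196 consults table 101, which only knows about eth1, so it no longer falls through to the eth0 route in the main table. After reviewing the printed lines, they could be run as root on the target.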