One of my server's physical interfaces, eno3, is part of a bridge, br0. One LXD machine is also attached to this bridge (nictype=bridged, parent=br0).
Sometimes ARP stops functioning completely.
In a capture of the traffic on br0, I see many ARP requests from the server and the VM, but no replies at all.
This only affects IPv4.
Restarting the switch attached to eno3 doesn't help.
Bringing br0 down and up again with 'ip link' doesn't help.
Bringing eno3 down and up again with 'ip link' does fix the issue and everything works after that!
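For reference, the workaround that does help amounts to the following. This is only a minimal sketch: bounce_link is a hypothetical helper name, and both the IP_CMD override and the one-second pause are my own assumptions, added so the sequence can be dry-run without root.

```shell
# Hypothetical helper: take an interface down and bring it back up.
# IP_CMD is an assumed override (defaults to the real `ip` binary) so the
# sequence can be previewed with IP_CMD=echo instead of being executed.
bounce_link() {
    ifc="${1:?usage: bounce_link IFACE}"
    ip_cmd="${IP_CMD:-ip}"
    # Stop at the first failing step so the link is not left half-configured silently.
    "$ip_cmd" link set dev "$ifc" down &&
        sleep 1 &&
        "$ip_cmd" link set dev "$ifc" up
}
```

Running `IP_CMD=echo bounce_link eno3` prints the two `link set` invocations instead of executing them; as root, `bounce_link eno3` performs the actual down/up cycle.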
$ uname -a
Linux xxx 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ lspci -nn
05:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
06:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
The NICs use the igb driver.
The following command output was captured while the system was in the failed state.
# ip -br link
lo UNKNOWN 00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
eno1 UP 0c:c4:7a:7b:8f:28 <BROADCAST,MULTICAST,UP,LOWER_UP>
eno2 DOWN 0c:c4:7a:7b:8f:29 <BROADCAST,MULTICAST>
eno3 UP 0c:c4:7a:7b:8f:2a <BROADCAST,MULTICAST,UP,LOWER_UP>
br0 UP 0c:c4:7a:7b:8f:2a <BROADCAST,MULTICAST,UP,LOWER_UP>
sec@eno3 UP 0c:c4:7a:7b:8f:2a <BROADCAST,MULTICAST,UP,LOWER_UP>
guest@eno3 UP 0c:c4:7a:7b:8f:2a <BROADCAST,MULTICAST,UP,LOWER_UP>
iot@eno3 UP 0c:c4:7a:7b:8f:2a <BROADCAST,MULTICAST,UP,LOWER_UP>
veth4961922d@if10 UP ce:1a:33:2f:83:e4 <BROADCAST,MULTICAST,UP,LOWER_UP>
vpnnet0@if12 UP 56:e8:ff:f8:fe:0b <BROADCAST,MULTICAST,UP,LOWER_UP>
# ip -4 -br address
lo UNKNOWN 127.0.0.1/8
eno1 UP 192.168.178.2/24
br0 UP 192.168.1.1/24
sec@eno3 UP 192.168.30.1/24
guest@eno3 UP 192.168.10.1/24
iot@eno3 UP 192.168.20.1/24
vpnnet0@if12 UP 192.168.2.1/24
# ip route
default via 192.168.178.1 dev eno1 proto static
192.168.1.0/24 dev br0 proto kernel scope link src 192.168.1.1
192.168.2.0/24 dev vpnnet0 proto kernel scope link src 192.168.2.1
192.168.10.0/24 dev guest proto kernel scope link src 192.168.10.1
192.168.20.0/24 dev iot proto kernel scope link src 192.168.20.1
192.168.30.0/24 dev sec proto kernel scope link src 192.168.30.1
192.168.50.0/24 via 192.168.2.2 dev vpnnet0 proto static
192.168.178.0/24 dev eno1 proto kernel scope link src 192.168.178.2
# ip -4 neigh
192.168.1.125 dev br0 FAILED
192.168.1.148 dev br0 FAILED
192.168.10.171 dev guest FAILED
192.168.178.227 dev eno1 FAILED
192.168.1.95 dev br0 lladdr 74:ac:b9:66:a9:7b STALE
192.168.20.106 dev iot lladdr b8:27:eb:50:68:5f STALE
192.168.1.6 dev br0 lladdr 00:16:3e:92:18:81 DELAY
192.168.178.42 dev eno1 FAILED
192.168.10.181 dev guest FAILED
192.168.1.94 dev br0 lladdr d8:07:b6:88:42:1e STALE
192.168.178.23 dev eno1 FAILED
192.168.1.127 dev br0 FAILED
192.168.1.150 dev br0 FAILED
192.168.1.101 dev br0 FAILED
192.168.20.50 dev iot FAILED
192.168.10.100 dev guest lladdr 80:2a:a8:99:83:a0 STALE
192.168.1.212 dev br0 FAILED
192.168.10.157 dev guest lladdr 74:ac:b9:66:a9:7b STALE
192.168.1.245 dev br0 FAILED
192.168.30.10 dev sec INCOMPLETE
192.168.20.51 dev iot FAILED
192.168.178.27 dev eno1 FAILED
192.168.10.111 dev guest lladdr 78:8a:20:4b:9a:c8 STALE
192.168.178.17 dev eno1 FAILED
192.168.1.70 dev br0 FAILED
192.168.1.226 dev br0 FAILED
192.168.1.9 dev br0 INCOMPLETE
192.168.1.2 dev br0 FAILED
192.168.1.108 dev br0 lladdr 00:17:c8:b6:e7:85 STALE
192.168.1.8 dev br0 FAILED
192.168.10.140 dev guest FAILED
192.168.1.97 dev br0 lladdr 14:59:c0:55:18:0a STALE
192.168.1.111 dev br0 FAILED
192.168.1.96 dev br0 FAILED
192.168.1.211 dev br0 FAILED
192.168.1.10 dev br0 FAILED
192.168.1.99 dev br0 lladdr 80:2a:a8:99:83:a0 STALE
192.168.10.173 dev guest FAILED
192.168.20.111 dev iot FAILED
192.168.1.5 dev br0 FAILED
192.168.2.2 dev vpnnet0 lladdr 00:16:3e:7c:69:e0 REACHABLE
192.168.1.60 dev br0 FAILED
192.168.1.161 dev br0 FAILED
192.168.178.1 dev eno1 lladdr 48:d3:43:a0:58:58 REACHABLE
192.168.1.219 dev br0 FAILED
192.168.1.119 dev br0 FAILED
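Since the neighbour table above is long, it may help to tally the entries per interface and state. The following is a minimal sketch, assuming the `ip -4 neigh` format shown above, where the device name follows the `dev` keyword and the state is the last field on each line. summarize_neigh is a hypothetical helper name, not part of iproute2.

```shell
# Hypothetical helper: count neighbour entries per (device, state) pair.
# Reads `ip -4 neigh` style lines on stdin and prints "DEV STATE COUNT".
summarize_neigh() {
    awk '
        {
            # Find the device name: the field right after the "dev" keyword.
            dev = ""
            for (i = 1; i < NF; i++) if ($i == "dev") dev = $(i + 1)
            # Assumption: the NUD state is the last field on the line.
            if (dev != "") count[dev " " $NF]++
        }
        END { for (k in count) print k, count[k] }
    ' | LC_ALL=C sort   # sort for a stable, readable order
}
```

Usage would be `ip -4 neigh | summarize_neigh`, which prints one line per interface and state with the number of matching entries.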
Any idea as to what is going on?
EDIT: Updated with command output captured while the system was in the failed state.
Thanks in advance!