Unexpected Linux neighboring subsystem behaviour when working with a bridge

This is a question about the Linux neighboring subsystem and virtual bridges (Linux is new to me, sorry for that).
The test was performed on Ubuntu 21 (output of uname -a):
Linux dlw 5.11.0-16-generic #17-Ubuntu SMP Wed Apr 14 20:12:43 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Initial Config

The Topology diagram:

        ||=============================================||  
        ||---------------|                      |------||  
PC2 --- ||enx00e04c369b80|                      |wlp1s0||--- PC4    
        ||---------------|                      |------|| 
        ||                  PC1(UBUNTU 21)             ||       
        ||---------------|                             ||
PC3 --- ||enx8800669997d7|                             ||
        ||---------------|                             ||
        ||=============================================||

All cards in PC1 are configured with static IPs by the following commands:
ifconfig enx00e04c369b80 192.168.0.31 netmask 255.255.255.0
ifconfig enx8800669997d7 192.168.0.32 netmask 255.255.255.0
ifconfig wlp1s0 192.168.0.33 netmask 255.255.255.0
and “ip_forward” is disabled, as confirmed by reading /proc/sys/net/ipv4/ip_forward:
cat /proc/sys/net/ipv4/ip_forward
0

PC2, PC3 and PC4 are terminals with static IPs and their firewalls disabled.

In brief, the relation between PC1 and each PCx is:

PC1 interface    PC1 static IP   Linked device   Device IP       Device MAC
---------------  --------------  --------------  --------------  -----------------
enx00e04c369b80  192.168.0.31    PC2             192.168.0.17    40:8d:5c:21:db:57
enx8800669997d7  192.168.0.32    PC3             192.168.0.10    50:3e:aa:05:64:f7
wlp1s0           192.168.0.33    PC4             192.168.0.254   00:85:00:07:AA:3A

After bringing them up with “ifconfig xxx up”, the three cards in PC1 are separated into three segments and can only communicate with their directly linked PCx (verified by “ping” with the -I option). In particular, PC2, PC3 and PC4 cannot reach each other.

Purpose & operation

Next, I want to connect PC2 and PC3 through a virtual bridge: add a bridge “br0” on PC1, add the interfaces enx00e04c369b80 and enx8800669997d7 to it, and leave wlp1s0 unchanged.

This is done with:

brctl addbr br0
brctl addif br0 enx00e04c369b80
brctl addif br0 enx8800669997d7
ifconfig enx8800669997d7 0.0.0.0
ifconfig enx00e04c369b80 0.0.0.0
ifconfig br0 192.168.0.30 netmask 255.255.255.0
ifconfig br0 up
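
For comparison, the same steps can be written with iproute2 instead of brctl and ifconfig. This is only a minimal sketch and not what was run in this test: the helper names `run` and `setup_bridge` and the `DRY_RUN` variable are made up for illustration, and by default the script just prints the commands rather than executing them.

```shell
#!/bin/sh
# Sketch only: iproute2 equivalent of the brctl/ifconfig steps above.
# DRY_RUN, run and setup_bridge are illustrative names, not standard tools.

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_bridge BRIDGE ADDRESS/PREFIX PORT...
setup_bridge() {
    bridge=$1
    addr=$2
    shift 2
    run ip link add name "$bridge" type bridge
    for port in "$@"; do
        run ip addr flush dev "$port"                # drop the port's own IP
        run ip link set dev "$port" master "$bridge" # enslave port to bridge
        run ip link set dev "$port" up
    done
    run ip addr add "$addr" dev "$bridge"
    run ip link set dev "$bridge" up
}

setup_bridge br0 192.168.0.30/24 enx00e04c369b80 enx8800669997d7
```

Running it with `DRY_RUN=0` as root would apply the commands for real; that path is untested here.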

This results in the following configuration on PC1:

root@dlw:/home/dlw# ifconfig
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.30  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::a4d0:5bff:fe4a:8f76  prefixlen 64  scopeid 0x20<link>
        ether a6:d0:5b:4a:8f:76  txqueuelen 1000  (Ethernet)
        RX packets 440  bytes 20303 (20.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 41  bytes 5530 (5.5 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enx00e04c369b80: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:e0:4c:36:9b:80  txqueuelen 1000  (Ethernet)
        RX packets 684  bytes 34203 (34.2 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 100  bytes 8901 (8.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enx8800669997d7: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 88:00:66:99:97:d7  txqueuelen 1000  (Ethernet)
        RX packets 298  bytes 35266 (35.2 KB)
        RX errors 0  dropped 4  overruns 0  frame 0
        TX packets 487  bytes 32918 (32.9 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 5235  bytes 423013 (423.0 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5235  bytes 423013 (423.0 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

wlp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.33  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::ea0b:12c0:2ebf:b5c6  prefixlen 64  scopeid 0x20<link>
        ether 84:5c:f3:52:98:60  txqueuelen 1000  (Ethernet)
        RX packets 23040  bytes 22183801 (22.1 MB)
        RX errors 0  dropped 53  overruns 0  frame 0
        TX packets 8987  bytes 994399 (994.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
root@dlw:/home/dlw# brctl show
bridge name bridge id       STP enabled interfaces
br0     8000.a6d05b4a8f76   no      enx00e04c369b80
                            enx8800669997d7
root@dlw:/home/dlw# ls /sys/class/net/br0/brif/
enx00e04c369b80  enx8800669997d7

The bridge br0 works normally: PC2 and PC3 can ping each other.

Question

But my question is: PC2 can reach PC1’s wlp1s0 (by ping),

ping 192.168.0.33
PING 192.168.0.33 (192.168.0.33) 56(84) bytes of data.
64 bytes from 192.168.0.33: icmp_seq=1 ttl=64 time=1.05 ms
64 bytes from 192.168.0.33: icmp_seq=2 ttl=64 time=1.05 ms

however, the real responder is br0 (check the MAC address), not wlp1s0. This is verified by packets captured on PC2 using Wireshark:

23  2021-09-30 10:56:17.231255  Tp-LinkT_05:64:f7   Broadcast   ARP 42      Who has 192.168.0.33? Tell 192.168.0.10
0000   ff ff ff ff ff ff 50 3e aa 05 64 f7 08 06 00 01
0010   08 00 06 04 00 01 50 3e aa 05 64 f7 c0 a8 00 0a
0020   00 00 00 00 00 00 c0 a8 00 21
24  2021-09-30 10:56:17.231999  a6:d0:5b:4a:8f:76   Tp-LinkT_05:64:f7   ARP 60      192.168.0.33 is at a6:d0:5b:4a:8f:76
0000   50 3e aa 05 64 f7 a6 d0 5b 4a 8f 76 08 06 00 01
0010   08 00 06 04 00 02 a6 d0 5b 4a 8f 76 c0 a8 00 21
0020   50 3e aa 05 64 f7 c0 a8 00 0a 55 55 55 55 55 55
0030   55 55 55 55 55 55 55 55 55 55 55 55
25  2021-09-30 10:56:17.232010  192.168.0.10    192.168.0.33    ICMP    74  0xc634 (50740)  Echo (ping) request  id=0x0002, seq=20474/64079, ttl=128 (reply in 26)
0000   a6 d0 5b 4a 8f 76 50 3e aa 05 64 f7 08 00 45 00
0010   00 3c c6 34 00 00 80 01 00 00 c0 a8 00 0a c0 a8
0020   00 21 08 00 fd 5f 00 02 4f fa 61 62 63 64 65 66
0030   67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76
0040   77 61 62 63 64 65 66 67 68 69
26  2021-09-30 10:56:17.232711  192.168.0.33    192.168.0.10    ICMP    74  0x8d96 (36246)  Echo (ping) reply    id=0x0002, seq=20474/64079, ttl=64 (request in 25)
0000   50 3e aa 05 64 f7 a6 d0 5b 4a 8f 76 08 00 45 00
0010   00 3c 8d 96 00 00 40 01 6b af c0 a8 00 21 c0 a8
0020   00 0a 00 00 05 60 00 02 4f fa 61 62 63 64 65 66
0030   67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76
0040   77 61 62 63 64 65 66 67 68 69
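
To make the MAC fields of the ARP reply (frame 24 above) easier to read, the hex bytes can be decoded by offset. The following is a rough sketch; `decode_arp` is an illustrative helper name, and it assumes a plain Ethernet II frame carrying IPv4-over-Ethernet ARP with no VLAN tag.

```shell
#!/bin/sh
# Sketch only: pull the source/sender fields out of an Ethernet+ARP frame
# given as space-separated hex bytes. decode_arp is an illustrative name.

decode_arp() {
    # $1: frame bytes, e.g. "50 3e aa ... 00 0a"
    echo "$1" | awk '
    function mac(start,    i, s) {
        s = $start
        for (i = start + 1; i < start + 6; i++) s = s ":" $i
        return s
    }
    function hex2dec(h,    i, d, n) {
        n = 0
        h = tolower(h)
        for (i = 1; i <= length(h); i++) {
            d = index("0123456789abcdef", substr(h, i, 1)) - 1
            n = n * 16 + d
        }
        return n
    }
    function ip(start) {
        return hex2dec($start) "." hex2dec($(start + 1)) "." \
               hex2dec($(start + 2)) "." hex2dec($(start + 3))
    }
    {
        # awk fields are 1-based, so frame byte offset N is field N+1.
        print "eth_src=" mac(7)          # bytes 6-11: Ethernet source MAC
        print "opcode=" hex2dec($21 $22) # bytes 20-21: 1=request, 2=reply
        print "arp_sender_mac=" mac(23)  # bytes 22-27: ARP sender MAC
        print "arp_sender_ip=" ip(29)    # bytes 28-31: ARP sender IP
    }'
}

# First 42 bytes of frame 24 (the padding bytes are left out).
decode_arp "50 3e aa 05 64 f7 a6 d0 5b 4a 8f 76 08 06 00 01 08 00 06 04 00 02 a6 d0 5b 4a 8f 76 c0 a8 00 21 50 3e aa 05 64 f7 c0 a8 00 0a"
```

For this frame both the Ethernet source and the ARP sender MAC decode to a6:d0:5b:4a:8f:76, which is br0's address in the ifconfig output, while the sender IP is 192.168.0.33.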

On the other hand, capturing ICMP packets on PC1 shows that wlp1s0 receives no ICMP packets but br0 does:

root@dlw:/home/dlw# tcpdump -i wlp1s0 -XXXX icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlp1s0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured

root@dlw:/home/dlw# tcpdump -i br0 -XXXX icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:28:32.496062 IP 192.168.0.10 > dlw: ICMP echo request, id 2, seq 20555, length 40
    0x0000:  a6d0 5b4a 8f76 503e aa05 64f7 0800 4500  ..[J.vP>..d...E.
    0x0010:  003c c665 0000 8001 f2df c0a8 000a c0a8  .<.e............
    0x0020:  0021 0800 fd0e 0002 504b 6162 6364 6566  .!......PKabcdef
    0x0030:  6768 696a 6b6c 6d6e 6f70 7172 7374 7576  ghijklmnopqrstuv
    0x0040:  7761 6263 6465 6667 6869                 wabcdefghi
20:28:32.496109 IP dlw > 192.168.0.10: ICMP echo reply, id 2, seq 20555, length 40
    0x0000:  503e aa05 64f7 a6d0 5b4a 8f76 0800 4500  P>..d...[J.v..E.
    0x0010:  003c e53f 0000 4001 1406 c0a8 0021 c0a8  .<.?..@......!..
    0x0020:  000a 0000 050f 0002 504b 6162 6364 6566  ........PKabcdef
    0x0030:  6768 696a 6b6c 6d6e 6f70 7172 7374 7576  ghijklmnopqrstuv
    0x0040:  7761 6263 6465 6667 6869                 wabcdefghi

Besides, wlp1s0 cannot reach PC2 in the reverse direction:

dlw@dlw:~$ ping 192.168.0.10 -I wlp1s0
PING 192.168.0.10 (192.168.0.10) from 192.168.0.33 wlp1s0: 56(84) bytes of data.
^C
--- 192.168.0.10 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5121ms

dlw@dlw:~$ ping 192.168.0.254 -I wlp1s0
PING 192.168.0.254 (192.168.0.254) from 192.168.0.33 wlp1s0: 56(84) bytes of data.
64 bytes from 192.168.0.254: icmp_seq=1 ttl=64 time=4.48 ms
64 bytes from 192.168.0.254: icmp_seq=2 ttl=64 time=4.81 ms
^C
--- 192.168.0.254 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 4.477/4.645/4.813/0.168 ms
dlw@dlw:~$ ^C

The route table, ARP table and FDB table on PC1 are as follows:

root@dlw:/home/dlw# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 wlp1s0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 br0
192.168.0.0     0.0.0.0         255.255.255.0   U     600    0        0 wlp1s0
root@dlw:/home/dlw# arp -a
? (192.168.0.17) at 40:8d:5c:21:db:57 [ether] on enx00e04c369b80
? (192.168.0.17) at 40:8d:5c:21:db:57 [ether] on br0
? (192.168.0.10) at 50:3e:aa:05:64:f7 [ether] on br0
root@dlw:/home/dlw# brctl showmacs br0
port no mac addr        is local?   ageing timer
  1 00:e0:4c:36:9b:80   yes        0.00
  1 00:e0:4c:36:9b:80   yes        0.00
  1 40:8d:5c:21:db:57   no        65.37
  2 50:3e:aa:05:64:f7   no        24.93
  2 88:00:66:99:97:d7   yes        0.00
  2 88:00:66:99:97:d7   yes        0.00

The ARP and route tables on PC2 are:

arp -a
192.168.0.17          40-8d-5c-21-db-57
192.168.0.30          a6-d0-5b-4a-8f-76
192.168.0.33          a6-d0-5b-4a-8f-76
192.168.0.255         ff-ff-ff-ff-ff-ff
224.0.0.2             01-00-5e-00-00-02
224.0.0.22            01-00-5e-00-00-16
224.0.0.251           01-00-5e-00-00-fb
224.0.0.252           01-00-5e-00-00-fc
255.255.255.255       ff-ff-ff-ff-ff-ff

route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
255.255.255.255 0.0.0.0         255.255.255.255 U     0      0        0 eth0
224.0.0.0       0.0.0.0         240.0.0.0       U     0      0        0 eth0
0.0.0.0         192.168.0.10    255.255.255.255 U     0      0        0 eth0
192.168.0.255   0.0.0.0         255.255.255.255 U     0      0        0 eth0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.0.10    0.0.0.0         255.255.255.255 U     0      0        0 eth0

So the “problem” is a little clearer now:
When PC2 pings 192.168.0.33, it first asks where 192.168.0.33 is with an ARP broadcast, and this packet reaches PC1's neighboring subsystem. Since wlp1s0 does have IP 192.168.0.33, PC1 answers this ARP broadcast on behalf of wlp1s0, but somehow the reply's “Sender MAC” and the Ethernet header's “Source MAC” carry br0's MAC instead of wlp1s0's (this is what confuses me).

After getting the ARP reply, PC2 sends the ICMP message with br0's MAC as the destination MAC and wlp1s0's IP as the destination IP. PC1’s ICMP handler (assuming it works at layer 3) responds, but the routing subsystem picks the lower-metric route “192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 br0” and sends the packet out through br0 (if I understand correctly).
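
The metric tie-break described above can be illustrated against the `route -n` output from PC1. This is a simplified sketch: `lowest_metric_iface` is an illustrative name, and it only compares routes with an identical destination and netmask, whereas a real kernel lookup also applies longest-prefix matching and policy rules.

```shell
#!/bin/sh
# Sketch only: given `route -n` output on stdin, print the interface of the
# lowest-metric route for one destination/netmask pair.
# lowest_metric_iface is an illustrative name.

lowest_metric_iface() {
    # $1: destination network, $2: netmask
    awk -v dest="$1" -v mask="$2" '
        $1 == dest && $3 == mask {
            # Column 5 is Metric, column 8 is Iface.
            if (best == "" || $5 + 0 < metric) { metric = $5 + 0; best = $8 }
        }
        END { if (best != "") print best }
    '
}

# Route table copied from PC1 above.
lowest_metric_iface 192.168.0.0 255.255.255.0 <<'EOF'
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 wlp1s0
192.168.0.0     0.0.0.0         255.255.255.0   U     0      0        0 br0
192.168.0.0     0.0.0.0         255.255.255.0   U     600    0        0 wlp1s0
EOF
```

With the table from PC1 this prints br0 (metric 0 beats metric 600). On a live system, `ip route get 192.168.0.10` reports the route the kernel actually selects.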

So my confusion is: why does the Linux neighboring subsystem answer this ARP broadcast even though PC2 and wlp1s0 are not linked at layer 2? And how does the ARP resolution flow go? (Sorry, I am not yet familiar with the Linux neighboring subsystem implementation code.)
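
One setting that appears relevant when investigating this is the per-interface `arp_ignore` sysctl, which controls whether an interface answers ARP requests for local addresses configured on other interfaces. The sketch below only lists the current values; `show_arp_ignore` and its optional directory argument are illustrative, and whether changing the setting alters the behaviour seen in this test has not been verified here.

```shell
#!/bin/sh
# Sketch only: list the arp_ignore setting of every interface.
# show_arp_ignore and its optional root argument are illustrative.

show_arp_ignore() {
    # $1: directory holding the per-interface settings
    #     (defaults to /proc/sys/net/ipv4/conf)
    root=${1:-/proc/sys/net/ipv4/conf}
    for f in "$root"/*/arp_ignore; do
        [ -r "$f" ] || continue   # skip unmatched globs / unreadable files
        iface=$(basename "$(dirname "$f")")
        echo "$iface arp_ignore=$(cat "$f")"
    done
}

show_arp_ignore
```

According to the kernel's ip-sysctl documentation, the default value 0 means "reply for any local target IP address, configured on any interface", while 1 limits replies to addresses configured on the interface that received the request. A value could be changed with, for example, `sysctl -w net.ipv4.conf.all.arp_ignore=1`.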
