This is a question about the Linux neighboring subsystem and virtual bridges (Linux is new to me, sorry for that).
The test was performed on Ubuntu 21 (output of uname -a):
Linux dlw 5.11.0-16-generic #17-Ubuntu SMP Wed Apr 14 20:12:43 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Initial Config
The Topology diagram:
        ||=============================================||
        ||---------------|                      |------||
PC2 --- ||enx00e04c369b80|                      |wlp1s0||--- PC4
        ||---------------|                      |------||
        ||               PC1(UBUNTU 21)                ||
        ||---------------|                             ||
PC3 --- ||enx8800669997d7|                             ||
        ||---------------|                             ||
        ||=============================================||
All NICs in PC1 are configured with static IPs using the following commands:
ifconfig enx00e04c369b80 192.168.0.31 netmask 255.255.255.0
ifconfig enx8800669997d7 192.168.0.32 netmask 255.255.255.0
ifconfig wlp1s0 192.168.0.33 netmask 255.255.255.0
and IP forwarding is disabled, as confirmed by checking /proc/sys/net/ipv4/ip_forward:
cat /proc/sys/net/ipv4/ip_forward
0
PC2, PC3 and PC4 are terminals with static IPs and with their firewalls disabled.
In brief, the relation between PC1 and each PCx is:
-------------------------------------------------------------------------
PC1 interface    static IP     directly linked external device (per port)
---------------  ------------  ------------------------------------------
enx00e04c369b80  192.168.0.31  PC2  192.168.0.17   40:8d:5c:21:db:57
enx8800669997d7  192.168.0.32  PC3  192.168.0.10   50:3e:aa:05:64:f7
wlp1s0           192.168.0.33  PC4  192.168.0.254  00:85:00:07:AA:3A
After bringing them up with “ifconfig xxx up”, the three NICs in PC1 sit on three separate layer-2 segments (all addresses are in 192.168.0.0/24), and each can only communicate with its directly linked PCx (verified by “ping” with the -I option). In particular, PC2, PC3 and PC4 cannot reach each other.
Purpose & operation
Next, I want to connect PC2 and PC3 through a virtual bridge: add a bridge “br0” in PC1, add the interfaces enx00e04c369b80 and enx8800669997d7 to it, and leave wlp1s0 unchanged.
This is done by:
brctl addbr br0
brctl addif br0 enx00e04c369b80
brctl addif br0 enx8800669997d7
ifconfig enx8800669997d7 0.0.0.0
ifconfig enx00e04c369b80 0.0.0.0
ifconfig br0 192.168.0.30 netmask 255.255.255.0
ifconfig br0 up
This results in the following configuration in PC1:
root@dlw:/home/dlw# ifconfig
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.30 netmask 255.255.255.0 broadcast 192.168.0.255
inet6 fe80::a4d0:5bff:fe4a:8f76 prefixlen 64 scopeid 0x20<link>
ether a6:d0:5b:4a:8f:76 txqueuelen 1000 (Ethernet)
RX packets 440 bytes 20303 (20.3 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 41 bytes 5530 (5.5 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enx00e04c369b80: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 00:e0:4c:36:9b:80 txqueuelen 1000 (Ethernet)
RX packets 684 bytes 34203 (34.2 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 100 bytes 8901 (8.9 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enx8800669997d7: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 88:00:66:99:97:d7 txqueuelen 1000 (Ethernet)
RX packets 298 bytes 35266 (35.2 KB)
RX errors 0 dropped 4 overruns 0 frame 0
TX packets 487 bytes 32918 (32.9 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 5235 bytes 423013 (423.0 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 5235 bytes 423013 (423.0 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wlp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.0.33 netmask 255.255.255.0 broadcast 192.168.0.255
inet6 fe80::ea0b:12c0:2ebf:b5c6 prefixlen 64 scopeid 0x20<link>
ether 84:5c:f3:52:98:60 txqueuelen 1000 (Ethernet)
RX packets 23040 bytes 22183801 (22.1 MB)
RX errors 0 dropped 53 overruns 0 frame 0
TX packets 8987 bytes 994399 (994.3 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
root@dlw:/home/dlw# brctl show
bridge name bridge id STP enabled interfaces
br0 8000.a6d05b4a8f76 no enx00e04c369b80
enx8800669997d7
root@dlw:/home/dlw# ls /sys/class/net/br0/brif/
enx00e04c369b80 enx8800669997d7
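For reference, the brctl/ifconfig steps used above can also be written with iproute2. The following is only a sketch, not what was actually run in this test: the helper names `run` and `setup_bridge` and the `DRY_RUN` variable are made up for illustration, and `ip -4 addr flush` is assumed to be an acceptable stand-in for `ifconfig <port> 0.0.0.0`.

```shell
#!/bin/sh
# Sketch: iproute2 counterpart of the brctl/ifconfig bridge setup.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_bridge BRIDGE ADDR/PREFIX PORT...
setup_bridge() {
    br="$1"; addr="$2"; shift 2
    run ip link add name "$br" type bridge          # like: brctl addbr
    for port in "$@"; do
        run ip link set dev "$port" master "$br"    # like: brctl addif
        run ip -4 addr flush dev "$port"            # like: ifconfig <port> 0.0.0.0
    done
    run ip addr add "$addr" dev "$br"               # like: ifconfig br0 <ip> netmask ...
    run ip link set dev "$br" up                    # like: ifconfig br0 up
}

# Example (as root), or prefix with DRY_RUN=1 to only print the commands:
#   setup_bridge br0 192.168.0.30/24 enx00e04c369b80 enx8800669997d7
```

Running it with DRY_RUN=1 first makes it easy to compare the printed commands with the brctl/ifconfig sequence before changing anything on the machine.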
The bridge br0 works normally: PC2 and PC3 can “ping” each other.
Question
My question is this: PC2 can reach PC1’s wlp1s0 (by ping),
ping 192.168.0.33
PING 192.168.0.33 (192.168.0.33) 56(84) bytes of data.
64 bytes from 192.168.0.33: icmp_seq=1 ttl=64 time=1.05 ms
64 bytes from 192.168.0.33: icmp_seq=2 ttl=64 time=1.05 ms
however, the real “responder” is br0 (check the MAC address), not wlp1s0. This is verified by a packet capture taken on PC2 with Wireshark:
23 2021-09-30 10:56:17.231255 Tp-LinkT_05:64:f7 Broadcast ARP 42 Who has 192.168.0.33? Tell 192.168.0.10
0000 ff ff ff ff ff ff 50 3e aa 05 64 f7 08 06 00 01
0010 08 00 06 04 00 01 50 3e aa 05 64 f7 c0 a8 00 0a
0020 00 00 00 00 00 00 c0 a8 00 21
24 2021-09-30 10:56:17.231999 a6:d0:5b:4a:8f:76 Tp-LinkT_05:64:f7 ARP 60 192.168.0.33 is at a6:d0:5b:4a:8f:76
0000 50 3e aa 05 64 f7 a6 d0 5b 4a 8f 76 08 06 00 01
0010 08 00 06 04 00 02 a6 d0 5b 4a 8f 76 c0 a8 00 21
0020 50 3e aa 05 64 f7 c0 a8 00 0a 55 55 55 55 55 55
0030 55 55 55 55 55 55 55 55 55 55 55 55
25 2021-09-30 10:56:17.232010 192.168.0.10 192.168.0.33 ICMP 74 0xc634 (50740) Echo (ping) request id=0x0002, seq=20474/64079, ttl=128 (reply in 26)
0000 a6 d0 5b 4a 8f 76 50 3e aa 05 64 f7 08 00 45 00
0010 00 3c c6 34 00 00 80 01 00 00 c0 a8 00 0a c0 a8
0020 00 21 08 00 fd 5f 00 02 4f fa 61 62 63 64 65 66
0030 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76
0040 77 61 62 63 64 65 66 67 68 69
26 2021-09-30 10:56:17.232711 192.168.0.33 192.168.0.10 ICMP 74 0x8d96 (36246) Echo (ping) reply id=0x0002, seq=20474/64079, ttl=64 (request in 25)
0000 50 3e aa 05 64 f7 a6 d0 5b 4a 8f 76 08 00 45 00
0010 00 3c 8d 96 00 00 40 01 6b af c0 a8 00 21 c0 a8
0020 00 0a 00 00 05 60 00 02 4f fa 61 62 63 64 65 66
0030 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76
0040 77 61 62 63 64 65 66 67 68 69
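To make the “check the MAC address” step concrete, the sender fields of the ARP reply (frame 24) can be pulled straight out of the hex bytes. This is a minimal sketch that assumes an untagged Ethernet frame carrying IPv4-over-Ethernet ARP, given as space-separated hex bytes; the helper name `arp_sender` is made up for illustration.

```shell
#!/bin/sh
# arp_sender FRAME: print "sender-MAC sender-IP" of an Ethernet/ARP frame
# passed as one string of space-separated hex bytes.
arp_sender() {
    # Split the string into one positional parameter per byte.
    set -- $1
    # Skip the Ethernet header (14 bytes) and the fixed ARP fields
    # htype/ptype/hlen/plen/oper (8 bytes).
    shift 22
    mac="$1:$2:$3:$4:$5:$6"       # sender hardware address
    shift 6
    ip="$(printf '%d.%d.%d.%d' "0x$1" "0x$2" "0x$3" "0x$4")"   # sender protocol address
    echo "$mac $ip"
}

# First 42 bytes of frame 24 from the capture (trailing padding omitted).
frame24="50 3e aa 05 64 f7 a6 d0 5b 4a 8f 76 08 06 00 01 08 00 06 04 00 02 a6 d0 5b 4a 8f 76 c0 a8 00 21 50 3e aa 05 64 f7 c0 a8 00 0a"
arp_sender "$frame24"   # prints: a6:d0:5b:4a:8f:76 192.168.0.33
```

The sender MAC is the one shown for br0 in the ifconfig output, while the sender IP is the address configured on wlp1s0.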
On the other hand, capturing ICMP packets on PC1 shows that wlp1s0 receives no ICMP packets but br0 does:
root@dlw:/home/dlw# tcpdump -i wlp1s0 -XXXX icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlp1s0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
root@dlw:/home/dlw# tcpdump -i br0 -XXXX icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:28:32.496062 IP 192.168.0.10 > dlw: ICMP echo request, id 2, seq 20555, length 40
0x0000: a6d0 5b4a 8f76 503e aa05 64f7 0800 4500 ..[J.vP>..d...E.
0x0010: 003c c665 0000 8001 f2df c0a8 000a c0a8 .<.e............
0x0020: 0021 0800 fd0e 0002 504b 6162 6364 6566 .!......PKabcdef
0x0030: 6768 696a 6b6c 6d6e 6f70 7172 7374 7576 ghijklmnopqrstuv
0x0040: 7761 6263 6465 6667 6869 wabcdefghi
20:28:32.496109 IP dlw > 192.168.0.10: ICMP echo reply, id 2, seq 20555, length 40
0x0000: 503e aa05 64f7 a6d0 5b4a 8f76 0800 4500 P>..d...[J.v..E.
0x0010: 003c e53f 0000 4001 1406 c0a8 0021 c0a8 .<.?..@......!..
0x0020: 000a 0000 050f 0002 504b 6162 6364 6566 ........PKabcdef
0x0030: 6768 696a 6b6c 6d6e 6f70 7172 7374 7576 ghijklmnopqrstuv
0x0040: 7761 6263 6465 6667 6869 wabcdefghi
Besides, in the reverse direction, wlp1s0 cannot reach PC2:
dlw@dlw:~$ ping 192.168.0.10 -I wlp1s0
PING 192.168.0.10 (192.168.0.10) from 192.168.0.33 wlp1s0: 56(84) bytes of data.
^C
--- 192.168.0.10 ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5121ms
dlw@dlw:~$ ping 192.168.0.254 -I wlp1s0
PING 192.168.0.254 (192.168.0.254) from 192.168.0.33 wlp1s0: 56(84) bytes of data.
64 bytes from 192.168.0.254: icmp_seq=1 ttl=64 time=4.48 ms
64 bytes from 192.168.0.254: icmp_seq=2 ttl=64 time=4.81 ms
^C
--- 192.168.0.254 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 4.477/4.645/4.813/0.168 ms
dlw@dlw:~$ ^C
The routing table, ARP table and FDB of PC1 are as follows:
root@dlw:/home/dlw# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
169.254.0.0 0.0.0.0 255.255.0.0 U 1000 0 0 wlp1s0
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 br0
192.168.0.0 0.0.0.0 255.255.255.0 U 600 0 0 wlp1s0
root@dlw:/home/dlw# arp -a
? (192.168.0.17) at 40:8d:5c:21:db:57 [ether] on enx00e04c369b80
? (192.168.0.17) at 40:8d:5c:21:db:57 [ether] on br0
? (192.168.0.10) at 50:3e:aa:05:64:f7 [ether] on br0
root@dlw:/home/dlw# brctl showmacs br0
port no mac addr is local? ageing timer
1 00:e0:4c:36:9b:80 yes 0.00
1 00:e0:4c:36:9b:80 yes 0.00
1 40:8d:5c:21:db:57 no 65.37
2 50:3e:aa:05:64:f7 no 24.93
2 88:00:66:99:97:d7 yes 0.00
2 88:00:66:99:97:d7 yes 0.00
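The routing table above holds two entries for 192.168.0.0/24 that differ only in metric and interface. The sketch below picks the lower-metric one from `route -n` style output. It is a deliberately simplified model (exact match on the Destination column only, no longest-prefix matching, no policy rules), and the helper name `lowest_metric_iface` is made up for illustration. What the kernel really chooses for a given destination can be asked directly with `ip route get 192.168.0.10`.

```shell
#!/bin/sh
# lowest_metric_iface DEST: read `route -n` output on stdin and print the
# interface of the lowest-metric entry whose Destination column equals DEST.
# Columns: Destination Gateway Genmask Flags Metric Ref Use Iface
lowest_metric_iface() {
    awk -v dest="$1" '
        $1 == dest && (best == "" || $5 + 0 < min) { min = $5 + 0; best = $8 }
        END { if (best != "") print best }
    '
}

# Example:
#   route -n | lowest_metric_iface 192.168.0.0
```

With the table shown above, this selects br0 (metric 0) over wlp1s0 (metric 600).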
The ARP and routing tables of PC2 are:
arp -a
192.168.0.17 40-8d-5c-21-db-57
192.168.0.30 a6-d0-5b-4a-8f-76
192.168.0.33 a6-d0-5b-4a-8f-76
192.168.0.255 ff-ff-ff-ff-ff-ff
224.0.0.2 01-00-5e-00-00-02
224.0.0.22 01-00-5e-00-00-16
224.0.0.251 01-00-5e-00-00-fb
224.0.0.252 01-00-5e-00-00-fc
255.255.255.255 ff-ff-ff-ff-ff-ff
route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
255.255.255.255 0.0.0.0 255.255.255.255 U 0 0 0 eth0
224.0.0.0 0.0.0.0 240.0.0.0 U 0 0 0 eth0
0.0.0.0 192.168.0.10 255.255.255.255 U 0 0 0 eth0
192.168.0.255 0.0.0.0 255.255.255.255 U 0 0 0 eth0
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.0.10 0.0.0.0 255.255.255.255 U 0 0 0 eth0
So the “problem” is a little clearer now:
When PC2 pings 192.168.0.33, it first asks where 192.168.0.33 is with an ARP broadcast, and this packet reaches PC1’s neighboring subsystem. Since wlp1s0 does have the IP 192.168.0.33, PC1 answers this ARP broadcast with wlp1s0’s MAC, but somehow the “Sender MAC” of this ARP response and the “Source MAC” of its Ethernet header are changed to br0’s MAC (this is what confuses me).
After getting the ARP response, PC2 sends the ICMP message with br0’s MAC as the destination MAC and wlp1s0’s IP as the destination IP. PC1’s ICMP handler (assuming it works at layer 3) responds, but the routing subsystem picks the lower-metric route “192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 br0” and sends the packet out through br0 (if I understand correctly).
So my confusion is: why does the Linux neighboring subsystem answer this ARP broadcast even though PC2 and wlp1s0 are not linked at layer 2? And how does the ARP resolution flow go? (Sorry, I am not yet familiar with the implementation of the Linux neighboring subsystem.)