There's a difference between binding to an interface and binding to an IP address. While the two working cases do bind to an interface, they avoid the problem that ping encounters (to be explained later). Let's start with fixing ping. I reproduced OP's setup to provide illustrations.
The route lookup when only binding to an interface is not:
ip route get from 192.168.1.210
but:
# ip route get oif eno2 to 192.168.100.100
192.168.100.100 dev eno2 src 192.168.1.210 uid 0
cache
Tables 101 and 102 are not involved here, since no local source address is specified in the lookup. Moreover, there is no default route in the main routing table that can reach 192.168.100.100 through eno2. But as the interface was forced to eno2, such a default route is automatically created... without a gateway. The visible symptom is ARP requests emitted from 192.168.1.210 for 192.168.100.100, since the bogus route claims that 192.168.100.100 is directly reachable.
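The missing gateway is easy to overlook in the lookup result above. As a minimal sketch (route_has_gateway is a hypothetical helper name, not part of iproute2), the following shell function reports whether text printed by ip route get contains a via hop. It only inspects text and assumes the output format shown in this answer:

```shell
#!/bin/sh
# route_has_gateway: hypothetical helper, not an iproute2 command.
# Succeeds (exit status 0) when the given "ip route get" output contains
# a "via <gateway>" hop, fails (exit status 1) otherwise.
route_has_gateway() {
    case " $1 " in
        *" via "*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Demo on the gateway-less result quoted above (no root needed).
if route_has_gateway '192.168.100.100 dev eno2 src 192.168.1.210 uid 0'; then
    echo "gateway present"
else
    echo "no gateway: destination treated as directly reachable"
fi
```

On a live system the same check could be fed with real output, for example route_has_gateway "$(ip route get oif eno2 to 192.168.100.100)".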
Had OP also added the (usually useless) additional default route with a higher metric, such as:
ip route add default via 192.168.1.1 dev eno2 metric 101
then:
# ip route get oif eno2 to 192.168.100.100
192.168.100.100 via 192.168.1.1 dev eno2 src 192.168.1.210 uid 0
cache
Now, since there was already a matching default route through eno2, it is selected, with a correct gateway. ping would now work. Routing table 102 is still not involved.
Routing rule selector for bound interface
The correct way to have the route defined in table 102 used is the oif selector in ip rules:
oif NAME
select the outgoing device to match. The outgoing interface is only
available for packets originating from local sockets that are bound to
a device.
Let's use it (and delete the 2nd default route to show it's not needed anymore):
ip route delete default via 192.168.1.1 dev eno2 metric 101
ip rule add oif eno1 lookup 101
ip rule add oif eno2 lookup 102
The lookup will now match and becomes:
# ip route get oif eno2 to 192.168.100.100
192.168.100.100 via 192.168.1.1 dev eno2 table 102 src 192.168.1.210 uid 0
cache
This time, as the selector matched, the correct routing table was used, with a default defined with a gateway.
That's what had to be done.
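To confirm which table answered a lookup without reading the output by eye, one option is to extract the value following the table keyword. This is only a sketch: route_table is a hypothetical helper name, and it assumes that ip route get prints "table N" when a non-main table is used, as in the output above.

```shell
#!/bin/sh
# route_table: hypothetical helper, not an iproute2 command.
# Prints the word following "table" in the given "ip route get" output.
# Returns 1 and prints nothing when no "table" keyword is present.
route_table() {
    # Unquoted on purpose: split the output into whitespace-separated words.
    set -- $1
    while [ "$#" -gt 1 ]; do
        if [ "$1" = table ]; then
            echo "$2"
            return 0
        fi
        shift
    done
    return 1
}

# Demo on the result quoted above (no root needed).
route_table '192.168.100.100 via 192.168.1.1 dev eno2 table 102 src 192.168.1.210 uid 0'
```

With real output this could be called as route_table "$(ip route get oif eno2 to 192.168.100.100)", which should print 102 once the oif rules are in place.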
Note: ping also accepts binding to an IP address (ping -I 192.168.1.210 -c 2 google.com) or even both (ping -I eno2 -I 192.168.1.210 -c 2 google.com). These cases would have worked without additional routing rules, as explained below.
Why did the first two cases work anyway?
(Remove the correction above and...)
As was seen in the previous faulty route resolution:
# ip route get oif eno2 to 192.168.100.100
192.168.100.100 dev eno2 src 192.168.1.210 uid 0
cache
the correct IP source address still gets selected. As soon as TCP has to emit a packet, its route lookup will be presented with the source address 192.168.1.210. This case does match the selector in rule pref 102:
# ip route get from 192.168.1.210 oif eno2 to 192.168.100.100
192.168.100.100 from 192.168.1.210 via 192.168.1.1 dev eno2 table 102 uid 0
cache
Table 102 still got selected by rule pref 102. Once the adequate table is selected, no matter why it was selected, correct routing happens.
For UDP it's a bit more complicated, because it depends on whether the UDP client uses connect(2), which then behaves like the previous TCP case (a source address will be used), or chooses not to use connect(2). For example, this command would fail to be routed correctly without the missing routing rules, because it doesn't use connect(2) and the routing stack will be queried without a source address (i.e. with INADDR_ANY = 0.0.0.0):
echo test | socat udp4-datagram:203.0.113.1:8888,so-bindtodevice=eno2 -
while this one would succeed because it uses connect(2):
echo test | socat udp4:203.0.113.1:8888,so-bindtodevice=eno2 -
Of course both will work fine once the two missing routing rules are added.
One can check that traceroute does use connect(2) by using strace:
...
socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
setsockopt(3, SOL_SOCKET, SO_BINDTODEVICE, "eno2\0", 5) = 0
...
connect(3, {sa_family=AF_INET, sin_port=htons(33434), sin_addr=inet_addr("203.0.113.1")}, 28) = 0
...
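The same check can be scripted instead of read by eye. As a rough sketch (trace_uses_connect is a hypothetical helper name), the function below reads strace output on standard input and succeeds when a connect(2) call appears in it. It simply searches for the text "connect(" and assumes the strace line format shown above:

```shell
#!/bin/sh
# trace_uses_connect: hypothetical helper that reads strace output on stdin
# and succeeds when at least one line contains a connect(2) call.
trace_uses_connect() {
    grep -q 'connect('
}

# Demo on two lines shaped like the excerpt above (no root needed).
printf '%s\n' \
    'socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3' \
    'connect(3, {sa_family=AF_INET, sin_port=htons(33434), sin_addr=inet_addr("203.0.113.1")}, 28) = 0' \
    | trace_uses_connect && echo "connect(2) is used"
```

A real trace could be captured first, for example with strace -o trace.log -e trace=network traceroute -i eno2 203.0.113.1 (the exact traceroute options are an assumption here), and then checked with trace_uses_connect < trace.log.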