Score:2

Ordering of Interface Configuration with systemd-networkd

pl flag

I'm using Ubuntu 20.04, with systemd-networkd and Netplan. I have two physical interfaces (ens3 and ens4) which are configured by DHCP (with reservations, so I always get the same addresses).

Additionally, I have two tunnel devices. These are outside Netplan/networkd control (they're created by Strongswan, but for all intents and purposes they're created manually by running something like ip tunnel add...). Each tunnel device has a route added with ip route to send traffic through it. When initially created, these routes work fine, but systemd-networkd will eventually remove them.

To counter this, I have successfully configured the tunnel devices in systemd-networkd, but the route fails to be created because it is attempted before ens3/ens4 are configured (I see "tunnel1: Could not set route: Invalid prefsrc address. Invalid argument" in syslog). I have confirmed the ordering by switching on debug logging.

I can add the route manually:

ip route add 10.0.32.0/20 dev tunnel1 scope link src 10.0.16.170 metric 100

...which works fine, but will be removed at some later time by systemd-networkd.

The documentation says "All configuration files are collectively sorted and processed in lexical order, regardless of the directories in which they live.", so I had a look for other config files, and found these in /run/systemd/network:

10-netplan-ens3.link
10-netplan-ens3.network
10-netplan-ens4.link
10-netplan-ens4.network

I've tried naming my netdev and network files 99-tunnel1.netdev, zzzz-tunnel1.netdev and so on, and even tried a 00- prefix. No matter what I do, ens3 and ens4 always seem to be configured after the tunnel interfaces, so adding the route always fails.

I have also tried configuring my devices in Netplan. It makes some things tricky, but ultimately has the same problem. Even though it creates files like 10-netplan-tunnel1.network (which are lexically after the ens3/ens4 files), they're still applied in the wrong order by networkd.
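For reference, the merged lexical order of the files can be listed with a small shell sketch like the one below. The helper name list_units is made up for illustration, and the directory list in the usage note is an assumption about where networkd looks on this system. Note that this only shows file ordering: per the systemd.network man page, the first file in lexical order that matches a device is the one applied to it, which is not necessarily the same thing as the order in which interfaces finish being configured.

```shell
#!/bin/sh
# list_units SUFFIX DIR...: print the basenames of all *.SUFFIX files found
# in the given directories as one merged, lexically sorted list.
# (list_units is a hypothetical helper name used only for illustration.)
list_units() {
    suffix=$1
    shift
    for dir in "$@"; do
        for f in "$dir"/*."$suffix"; do
            # Skip the unexpanded glob when a directory has no matching files.
            if [ -e "$f" ]; then
                basename "$f"
            fi
        done
    done | LC_ALL=C sort -u
}
```

It could be called as list_units network /etc/systemd/network /run/systemd/network /lib/systemd/network to see where a file such as 99-tunnel1.network sorts relative to the Netplan-generated ones.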

I'm sure I'm missing something here, but I can't see what. Any ideas?

My tunnel1.netdev looks like this:

[NetDev]
Name=tunnel1
Kind=vti
MTUBytes=1419

[Tunnel]
Remote=1.2.3.4
Local=2.3.4.5
Key=100

...and the .network looks like this:

[Match]
Name=tunnel1

[Link]
RequiredForOnline=no
MTUBytes=1419

[Address]
Address=169.254.102.162/30
Peer=169.254.102.161/30

[Route]
Destination=10.0.32.0/20
PreferredSource=10.0.16.170
Metric=100
Scope=link
us flag
The expectation is that networkd does not touch configuration, including routes, on devices that it has not been told to. But I wonder if it's the kernel which is auto-deleting the route for you? I notice that the route in question has a 'src' specified that is not an IP listed as being associated with the interface. (And, therefore, I would not expect this route to actually function as defined.) What is the behavior you expect from this, given that 10.0.16.170 is not an IP address on the interface?
pl flag
I agree that networkd shouldn't be messing with things it's not configured for. However, yes, the route includes an src which is on a networkd managed interface, so the route would have to come out when the interface is "downed" (but otherwise, the route is necessary, and works fine). The curious thing is that when configuring `ens` and `tunnel` (vti) devices, networkd insists on doing the tunnels first - which I can't imagine would ever be correct, especially not if configured to do them last - and I don't seem to be able to change that behaviour.
Nate T avatar
it flag
Could you write a shell script that configures the route manually, and then set up a cron job to check for the route? If no route is found, the cron job could run the script, or reconfigure in whatever way you choose. I use NetworkManager via nmcli for reconfiguration, but mine is spotty, and this is how I handled it.
pl flag
Thanks for the thought, but I'm not a fan of the cron script solution: if the route is removed just after the script runs, there'll be a minute of downtime until the script runs again, to say nothing of running the script needlessly thousands of times. It's the solution of last resort, and I'd think pulling systemd out would come before it.
Score:1
cn flag

I think we have two problems here:

1/ The removal of your src route might be due to intermittent carrier loss on the ens3/ens4 interfaces. When an interface goes down (even briefly), its IP address is flushed along with the routes that use that address as their source. The address is then reconfigured via DHCP, but the src route that you added manually is not restored. Try creating a config override drop-in, e.g. /etc/systemd/network/10-netplan-ens3.network.d/override.conf:

[Network]
ConfigureWithoutCarrier=true
IgnoreCarrierLoss=true

2/ systemd-networkd processes the .network files in lexical order, but a DHCP-provided IP address only arrives asynchronously, once the lease has been received. networkd does not block the configuration of other interfaces (i.e. your tunnel interface) while waiting for the DHCP response, so the route cannot be added: the src IP does not yet exist at that point in time.

You say that your DHCP setup always gives you the same IP address. Why not specify that same address statically (e.g. addresses: [10.0.16.170/30] in netplan, or whatever the netmask is)? That way networkd should be able to add your PreferredSource= address without problems and reconfigure it after carrier loss.
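To check whether the timing issue from point 2 is what you are hitting, a minimal shell sketch like the following tests whether the source address is configured before touching the route. The function names has_addr and add_tunnel_route are hypothetical; the address, prefix, device and metric are taken from the question. It uses ip route replace rather than ip route add so that re-running it is harmless, and it assumes the usual one-line-per-address output of ip -o -4 addr show, where the fourth field is the address with its prefix length.

```shell
#!/bin/sh
# has_addr ADDR: reads `ip -o -4 addr show` output on stdin and succeeds
# only if ADDR is exactly one of the listed local addresses.
# (has_addr is a hypothetical helper name, not part of iproute2.)
has_addr() {
    awk -v want="$1" '
        { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# add_tunnel_route: install the route from the question, but only once the
# preferred source address actually exists on some interface.
add_tunnel_route() {
    if ip -o -4 addr show | has_addr 10.0.16.170; then
        ip route replace 10.0.32.0/20 dev tunnel1 scope link src 10.0.16.170 metric 100
    else
        echo "10.0.16.170 is not configured yet; the route would be rejected" >&2
        return 1
    fi
}
```

Running add_tunnel_route as root right after boot, and again once ens3/ens4 have their leases, should show whether the missing address is the only thing preventing the route from being added.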

pl flag
Loss of carrier sounds like a very plausible cause of the problem. I hadn't thought of that, and it would indeed cause all the problems you mention. The non-blocking nature of networkd does indeed look to be my problem. I presume there's no way to say "wait until configured" or "don't do this until X is done", but I could indeed use a fixed IP, which would almost certainly solve the problem (although it raises others, namely DNS resolution, but that too is quite easily solvable in my environment). Thanks so much for the ideas, they're really helpful (technically, and for my sanity!)