My setup:
- /etc/ns-shared-resolv.conf is rewritten regularly by a script, with nameserver x.x.x.x
- /etc/netns/ag2/resolv.conf is a symlink to the above (likewise for ag3 and ag4), so that DNS settings are managed centrally from the root netns
- A long-running service runs in the ag2 netns (via ip netns exec ag2 ..., launched from a systemd service)
What happens:
Everything works fine for some arbitrary number of hours. After that, DNS requests fail. Using tcpdump, I can see DNS requests going to "the wrong place": the DNS server in the root /etc/resolv.conf, NOT the netns one.
While DNS is failing, ip netns exec ag2 cat /etc/resolv.conf still shows the correct settings.
If I start a new ip netns exec ag2 bash shell, it gets the "correct" resolv.conf (symlink to /run/systemd/resolve/stub-resolv.conf, which is updated "live" with the contents of ns-shared-resolv.conf)
So it's like after a while, long-running processes get the root resolv.conf?
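Since a fresh ip netns exec shell only shows what a new process sees, one check is to look at /etc/resolv.conf through the already-running process itself. Below is a minimal sketch, assuming Linux /proc and enough privileges (probably root) to read another process's /proc entries. The function names are hypothetical, and the PID would be whatever the long-running service's PID is.

```python
import os
import sys


def parse_nameservers(text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        # Only "nameserver <addr>" lines count; comments and options are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def mount_namespace(pid):
    """Return the mount namespace identifier of a process (a 'mnt:[...]' string)."""
    return os.readlink(f"/proc/{pid}/ns/mnt")


def nameservers_seen_by(pid):
    """Read /etc/resolv.conf through the process's own root, i.e. its own mount view."""
    with open(f"/proc/{pid}/root/etc/resolv.conf") as f:
        return parse_nameservers(f.read())


def resolv_mounts(pid):
    """Return the mountinfo lines of the process that mention resolv.conf."""
    with open(f"/proc/{pid}/mountinfo") as f:
        return [line.rstrip() for line in f if "resolv.conf" in line]


if __name__ == "__main__":
    # Usage: python3 check_resolv.py <pid-of-long-running-service>
    # Falls back to this script's own PID if none is given.
    target = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for label, pid in (("target", target), ("this shell", os.getpid())):
        print(f"{label} (pid {pid})")
        print("  mount namespace:", mount_namespace(pid))
        print("  nameservers:    ", nameservers_seen_by(pid))
        for line in resolv_mounts(pid):
            print("  mount:", line)
```

Running this from a fresh ip netns exec ag2 bash shell, with the service's PID as the argument, would show whether the two processes are in different mount namespaces and whether the service still has a resolv.conf bind mount at all.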
Questions:
Why is this happening, and how can I diagnose why it starts using the "wrong" resolv.conf / DNS server after this random amount of time?
Can I just somehow get Ubuntu's default DNS server (systemd-resolved) working within the netns-es so I don't need to do this craziness?
Edit: like this person! --> https://www.reddit.com/r/linuxquestions/comments/dnh8wq/comment/fo1tbty/?utm_source=share&utm_medium=web2x&context=3