We're seeing some odd behaviour with MAAS (version 2.8 in this case). After deployment, an Ansible role configures systemd-resolved on the machines, and from that point on cloud-init hangs indefinitely during boot on every reboot.
On the console we can see that it tries to reach the 10-0-0-0--25.maas-internal FQDN and complains that it can't resolve the hostname, which can of course only be resolved by the MAAS DNS server itself. Our working theory for now is that, because we set the default DNS servers to 1.1.1.1 and 8.8.8.8 and because IPv6 might come up earlier than the cloud-init-controlled IPv4 address, the MAAS address is not considered for DNS resolution.
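To check the theory on an affected machine, something like the following could be run once the system is up. It is only a minimal sketch: it assumes Python 3 is available on the host, `can_resolve` is just an illustrative helper name, and it asks whatever resolver the system is currently configured with (so it shows the state after boot, not necessarily the state cloud-init saw).

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver can resolve hostname, else False."""
    try:
        # getaddrinfo goes through the normal system resolver path
        # (nsswitch, and systemd-resolved if it is configured).
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # The first name is the one from the console output; localhost is
    # only a control that should resolve on any machine.
    for name in ("10-0-0-0--25.maas-internal", "localhost"):
        state = "resolves" if can_resolve(name) else "does NOT resolve"
        print(name, state)
```

If the maas-internal name does not resolve here either, that would at least be consistent with the MAAS DNS server not being consulted.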
This brings me to some questions:
- Directly editing /etc/cloud/cloud.cfg.d/90_dpkg_local_cloud_config.cfg and 90_dpkg_maas.cfg to replace the endpoint/metadata_url with an FQDN that upstream DNS servers can resolve doesn't seem to have any effect. Are those files overwritten during the PXE boot from MAAS?
- Can I convince/reconfigure MAAS to use its own FQDN instead of the 10-0-0-0--25.maas-internal FQDN?
- Do I need to use resolvconf to make sure the MAAS IP is always the first nameserver in the list?
- Can I configure systemd-resolved to prefer the injected DNS servers on a specific interface over others?
- Is it possible that nftables (which doesn't have an outgoing rule for the MAAS port) is interfering with cloud-init here? When does nftables become active during the boot process?