Score:0

Is a random response from an authoritative DNS server normal?

ng flag

We are trying to set up Let's Encrypt certificate issuance using cert-manager and the dns01 solver. We use Dreamhost as our DNS provider, and we created a glue component that bridges between cert-manager's RFC 2136 support and the Dreamhost API.

We are experiencing an issue where, although the required TXT records are being added, the authoritative DNS servers return them at random: of two requests sent one after another, one returns an answer and the other returns nothing. This causes cert-manager to wait an unpredictable amount of time until, by luck, it is able to confirm the presence of the TXT record, and in some cases it causes the Let's Encrypt domain check to fail.

Is this something to be expected from an authoritative DNS server? Or is it something we should raise with our DNS provider?

Below is an example of such a situation (for now the specific domain is redacted). Two identical queries were sent to the same server one second apart:

dig @ns1.dreamhost.com. _acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. TXT 

; <<>> DiG 9.16.15-Ubuntu <<>> @ns1.dreamhost.com. _acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. TXT
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 38336
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;_acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. IN    TXT

;; AUTHORITY SECTION:
redacted.       300 IN  SOA ns1.dreamhost.com. hostmaster.dreamhost.com. 2022060209 18661 600 1814400 300

;; Query time: 20 msec
;; SERVER: 162.159.26.14#53(162.159.26.14)
;; WHEN: czw cze 02 19:40:17 CEST 2022
;; MSG SIZE  rcvd: 152

dig @ns1.dreamhost.com. _acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. TXT 

; <<>> DiG 9.16.15-Ubuntu <<>> @ns1.dreamhost.com. _acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. TXT
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41439
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;_acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. IN    TXT

;; ANSWER SECTION:
_acme-challenge.keycloak.tenant-a.k8s-dev.redacted.redacted. 300 IN TXT "Uyr1nHC1CRWQcmOWDvObc4RMd-mNhKaE9bbNZTf3L2k"

;; Query time: 16 msec
;; SERVER: 162.159.26.14#53(162.159.26.14)
;; WHEN: czw cze 02 19:40:18 CEST 2022
;; MSG SIZE  rcvd: 144

As you can see:

  • The timestamps of the responses are almost the same
  • One response contains the TXT record, the other does not
  • The same DNS server (162.159.26.14 - ns1.dreamhost.com) is responding, and I believe this is an authoritative server
Appleoddity avatar
ng flag
It is something to expect if there is a replication delay or a TTL that has to expire. How long have you waited? In addition, have you tried querying each DNS server directly to identify the one that does or does not have the record, to confirm that what you suspect is actually true?
Patrick Mevzek avatar
cn flag
@Appleoddity "It is something to expect, if there is a replication delay or a TTL that has to expire." That doesn't come into play on authoritative servers, at least not the TTL part. As for the "replication", it depends on how the set of authoritative nameservers is configured, but any good DNS provider should guarantee loose consistency in a matter of minutes, if not seconds.
Patrick Mevzek avatar
cn flag
"with two requests one after another one returning answer and other returning nothing." Can you show exactly the requests being done? Better, even with the real names involved? Have you checked your DNS configuration in DNSViz?
user1686 avatar
fr flag
Keep in mind that the "same" server could internally be anycasted and/or load-balanced to several independent hosts, some of which could be in sync with Dreamhost's authoritative DNS database while others could be out-of-sync.
AGrzes avatar
ng flag
@user1686 I think that may be the case, especially given that after I checked from a different country (using a cloud VPS) I got a different pattern of answers. Would you mind expanding your comment into an answer? Basically, describing the mechanism by which a seemingly single authoritative DNS server could actually be a group of out-of-sync servers.
Score:1
fr flag

The same DNS server (162.159.26.14 - ns1.dreamhost.com) is responding, and I believe this is an authoritative server

Just like with web servers, a single IP address could be hiding multiple DNS servers answering your requests. Even queries from the same location could be distributed to multiple nodes through a load-balancer or a multipath route. (Sometimes the "ns1" and "ns2" are entirely for show, with both leading to the same pool of servers.)

Yes, ideally all authoritative servers in the cluster should know the exact same data, but depending on how they're implemented, the cluster could have some servers temporarily out-of-sync with Dreamhost's authoritative database for various reasons. For example (entirely hypothetical – I don't know how Dreamhost's systems work), "reload" requests might be spread in time to reduce load on the database.

after I checked from a different country (using a cloud VPS) I got a different pattern of answers

On a broader scale, when querying from different locations, BGP anycast could lead you to entirely different clusters. A good example is either public DNS resolvers or the root servers – there are many instances of 1.1.1.1 around the world, and there are many instances of "f.root-servers.net". If Dreamhost uses anycast to host "ns1" at multiple physical locations (which they probably do, for reduced latency), then it is even more likely that they will be out-of-sync for a short period of time after you make changes. (That's one area where "DNS propagation" isn't a lie.)

Many DNS servers support the special hostname.bind and/or id.server queries to which they reply using their individual names. Try this from different locations:

dig +short @ns1.dreamhost.com hostname.bind. chaos txt
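
A single query only shows whichever node happened to answer that time. As a rough sketch, the query can be repeated and the replies tallied to see how many distinct nodes sit behind one address. This assumes the server answers hostname.bind at all (not every provider does, in which case the output is empty); the function name tally_nodes and the default of 20 repeats are made up for illustration:

```shell
# tally_nodes SERVER [COUNT]
# Send the hostname.bind CHAOS query COUNT times (default 20) and print
# each distinct reply together with how many times it was seen.
tally_nodes() {
    server=$1
    count=${2:-20}
    i=0
    while [ "$i" -lt "$count" ]; do
        # +short prints only the TXT payload; +tries/+time keep one
        # unresponsive node from stalling the whole loop.
        dig +short +tries=1 +time=2 "@$server" hostname.bind. chaos txt
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}
```

For example, `tally_nodes ns1.dreamhost.com 20` run from two different locations should show whether the same or different node names answer. More than one name from a single location would suggest load balancing behind the address.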

But overall, none of the above really changes things – your problem is not much different from having actually separate servers and keeping them in sync. For example, even if you just had ordinary ns1/ns2/ns3 servers using traditional DNS AXFR replication, whenever new data is loaded into the master server it can take a few seconds for it to send out NOTIFYs to replicas and for them to transfer the changes. Resolvers looking at your NS records are completely unaware of this and could randomly select either a server that already has the new data or a server that does not.

So regardless of how your DNS host works and how many servers it has, you should never expect updates to be 100% instant; find out whether the provider publishes the expected time, or just wait an arbitrary 5 or 10 seconds.
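
Because a single positive answer may just be luck, one option is to wait until several consecutive answers from the same nameserver contain the expected value before treating the record as published. The following is only a sketch of that idea: the function name wait_for_txt and its defaults (5 consecutive hits, 60 attempts, 5 seconds between attempts) are arbitrary assumptions, not anything cert-manager or Dreamhost prescribes.

```shell
# wait_for_txt SERVER NAME EXPECTED [NEEDED] [MAX_TRIES] [PAUSE]
# Return 0 once NEEDED consecutive answers from SERVER contain EXPECTED,
# or 1 if that has not happened after MAX_TRIES queries.
wait_for_txt() {
    server=$1 name=$2 expected=$3
    needed=${4:-5} max_tries=${5:-60} pause=${6:-5}
    streak=0 tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if dig +short +norecurse "@$server" "$name" TXT | grep -qF -- "$expected"; then
            streak=$((streak + 1))
        else
            streak=0    # a stale node answered; start counting again
        fi
        [ "$streak" -ge "$needed" ] && return 0
        tries=$((tries + 1))
        sleep "$pause"
    done
    return 1
}
```

It would be called with the challenge name and token, for instance `wait_for_txt ns1.dreamhost.com _acme-challenge.example.com "token-value"`. Note that a streak seen from one location says nothing about the nodes that anycast routes other clients to, so this narrows the window rather than closing it.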

AGrzes avatar
ng flag
I will accept this answer in some time if nothing more on point comes around. One thing is that, in our experience, the DNS responses do not stabilize for many minutes. It is hard to tell exactly, but usually within a few hours DNS verification succeeds, either because all servers return the correct records or because a long enough sequence of correct records is returned by chance. So it is not that I expect it to work instantly, but I would hope for the authoritative servers to sync within, say, 15 minutes.