sendmail 8.14.7 often not trying secondary MX

I've discovered that my sendmail configuration does not always try a secondary MX host if the primary MX does not answer. Sometimes it does, more often it doesn't.

My questions are: 1) how does sendmail decide when to give up on a given MX and try the next one, and 2) how can I debug what is (or is not) happening?

To work on this, I set up the name mytest.freefriends.org (my own domain) with an unroutable 10.x primary MX, and a good secondary:

mytest          IN MX 1         nonesuch.freefriends.org.
mytest          IN MX 10        goodmx.freefriends.org.
nonesuch        IN A            10.10.10.10

In the real cases, the primary MX is a regular host, reachable but intentionally not answering on port 25. Apparently some sysadmins do this to stop some spammers who never try the second MX. (I hesitate to publish the names of the domains doing this, but could provide privately.) I get the same results with my test setup as with the real cases -- sometimes my sendmail gives up on the bad primary and correctly falls over to the secondary, but more often not.
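
To make the expected behaviour concrete, here is a minimal Python sketch of MX failover: try each host in ascending preference order and move on when a connection attempt fails or times out. It is only an illustration of what I expect to happen, not sendmail's actual logic; the function names `tcp_connect` and `first_reachable_mx` and the 30-second default are assumptions made for the example.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple


def tcp_connect(host: str, port: int, timeout: float) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False


def first_reachable_mx(
    mx_records: Iterable[Tuple[int, str]],
    connect: Callable[[str, int, float], bool] = tcp_connect,
    port: int = 25,
    timeout: float = 30.0,
) -> Optional[str]:
    """Try MX hosts in ascending preference order; return the first that answers."""
    for _preference, host in sorted(mx_records):
        if connect(host, port, timeout):
            return host
    return None  # every MX failed; a real MTA would defer the message


if __name__ == "__main__":
    # The test records from this post (hypothetical usage; needs network access).
    records = [(10, "goodmx.freefriends.org"), (1, "nonesuch.freefriends.org")]
    print(first_reachable_mx(records))
```

With the test records above, this tries nonesuch first and, after its timeout, goes on to goodmx, which is the fallback I am not seeing reliably from sendmail.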

I'm using the sendmail 8.14.7 binary that is distributed with CentOS 7, on x86_64. I've customized sendmail.cf in various ways, but nothing that seems remotely relevant except possibly the timeout values, which I'll append below.

I'm sending my test mail to, e.g., [email protected]. The /var/log/maillog entry just shows nonesuch being tried repeatedly until the 5 days are up and it bounces:

Mar 15 18:26:45 tug sendmail[26132]: 22FHPiET026128: to=<[email protected]>, delay=00:01:00, xdelay=00:01:00, mailer=esmtp, pri=293911, relay=nonesuch.freefriends.org. [10.10.10.10], dsn=4.0.0, stat=Deferred: Connection timed out with nonesuch.freefriends.org.

I'm trying to discern what's really happening with:

rm /tmp/f; sendmail -D/tmp/f -d0-99.99 [email protected]

but the voluminous /tmp/f debug output just shows the bad nonesuch MX being tried over and over, although goodmx is found. Here's a little excerpt showing the final attempt on a given queue run:

hostsignature(mytest.freefriends.org.) = nonesuch.freefriends.org.:goodmx.freefriends.org.
...
dropenvelope 0x55db2c276ba0: id=<null>, flags=4405046<INQUEUE,NO_BODY_RETN,DELETE_BCC,GLOBALERRS,METOO,IS_MIME,SPLIT>
sendq=0x55db2e364ab0=<[email protected]>:
        mailer 4 (esmtp), host `mytest.freefriends.org.'
        user `[email protected]', ruser `<null>'
        state=QUEUEUP, next=0x0, alias 0x0, uid 0, gid 0
        flags=80000182<QPRIMARY,QPINGONFAILURE,QPINGONDELAY,QRCPTOK>
        owner=(none), home="(none)", fullname="(none)"
        orcpt="(none)", statmta=nonesuch.freefriends.org., status=4.4.1
        finalrcpt="RFC822; [email protected]"
        rstatus="(none)"
        statdate=Tue Mar 15 18:28:59 2022


====finis: stat 75 e_id=NOQUEUE e_flags=4405046<INQUEUE,NO_BODY_RETN,DELETE_BCC,GLOBALERRS,METOO,IS_MIME,SPLIT>

I have not been able to capture a log of a successful delivery, where it falls back to the secondary. Is there any way to hook into that?
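
One way to at least spot the successful cases after the fact is to group the maillog delivery lines by queue ID and look at which relay each attempt used. The following is a rough Python sketch, assuming log lines shaped like the one quoted above; the regular expression and the name `relays_by_queue_id` are my own and have only been checked against that format.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Matches sendmail delivery lines such as:
#   ... sendmail[26132]: 22FHPiET026128: to=<...>, ..., relay=HOST [IP], ..., stat=...
LINE_RE = re.compile(
    r"sendmail\[\d+\]: (?P<qid>\w+): to=.*?relay=(?P<relay>[^\s,]+).*?stat=(?P<stat>.*)$"
)


def relays_by_queue_id(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each queue ID to the (relay, stat) pairs logged for it, in order."""
    seen: Dict[str, List[Tuple[str, str]]] = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            entry = (match.group("relay"), match.group("stat").strip())
            seen.setdefault(match.group("qid"), []).append(entry)
    return seen


if __name__ == "__main__":
    # Print only the messages for which goodmx was tried at least once.
    with open("/var/log/maillog", errors="replace") as log:
        for qid, attempts in relays_by_queue_id(log).items():
            if any("goodmx" in relay for relay, _stat in attempts):
                print(qid, attempts)
```

This only shows which relay ended up in the log for each attempt; it does not explain why sendmail chose it, so it complements the -d output rather than replacing it.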

I suppose I could work around this with mailertable (or maybe bestmx) entries, but I don't know all the hosts that would need them. Besides, failing over to the secondary MX seems too fundamental an operation (nowadays) not to be working.

I've searched around online, in the bat book, in the sendmail sources (e.g., domain.c), etc., but haven't yet found the handle. If anyone would like to email me about this instead of/as well as replying here, my address is karl (at) freefriends (dot) org.

Sorry for the long message. Thanks in advance for any clues.

# timeouts (many of these)
#O Timeout.initial=5m
O Timeout.connect=30s
O Timeout.aconnect=30s
O Timeout.iconnect=30s
O Timeout.helo=4m
O Timeout.mail=5m
O Timeout.rcpt=10m
O Timeout.datainit=2m
O Timeout.datablock=6m
O Timeout.datafinal=30m
O Timeout.rset=1m
O Timeout.quit=1m
O Timeout.misc=1m
O Timeout.command=5m
O Timeout.ident=0s
#O Timeout.fileopen=60s
#O Timeout.control=2m
O Timeout.queuereturn=5d
#O Timeout.queuereturn.normal=5d
#O Timeout.queuereturn.urgent=2d
#O Timeout.queuereturn.non-urgent=7d
#O Timeout.queuereturn.dsn=5d
O Timeout.queuewarn=2d
#O Timeout.queuewarn.normal=4h
#O Timeout.queuewarn.urgent=1h
#O Timeout.queuewarn.non-urgent=12h
#O Timeout.queuewarn.dsn=4h
#O Timeout.hoststatus=30m
#O Timeout.resolver.retrans=5s
#O Timeout.resolver.retrans.first=5s
#O Timeout.resolver.retrans.normal=5s
#O Timeout.resolver.retry=4
#O Timeout.resolver.retry.first=4
#O Timeout.resolver.retry.normal=4
O Timeout.lhlo=1m
#O Timeout.auth=10m
O Timeout.starttls=2m