sendmail 8.14.7 never trying secondary mx

Karl Berry

7/15/23, 5:37 PM

I've discovered that my sendmail configuration does not always try a secondary MX host if the primary MX does not answer. Sometimes it does, more often it doesn't.

My questions are: 1) How does sendmail decide when to give up on a given MX and try the next one? 2) How can I debug what is (or is not) happening?

To work on this, I set up the name mytest.freefriends.org (my own domain) with an unroutable 10.x primary MX, and a good secondary:

mytest          IN MX 1         nonesuch.freefriends.org.
mytest          IN MX 10        goodmx.freefriends.org.
nonesuch        IN A            10.10.10.10
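For reference, the order in which a client is expected to try these records can be sketched as below. This is a minimal illustration of the general RFC 5321 rule (lowest preference first, equal preferences shuffled), not sendmail's actual code; the function name `mx_try_order` and the `(preference, hostname)` tuple layout are made up for the example.

```python
import random
from itertools import groupby


def mx_try_order(records, rng=random):
    """Return hostnames in the order a client would normally try them.

    records: iterable of (preference, hostname) pairs (hypothetical layout).
    Lower preference values come first; hosts sharing a preference are
    shuffled among themselves.
    """
    ordered = []
    by_pref = sorted(records, key=lambda r: r[0])
    for _, group in groupby(by_pref, key=lambda r: r[0]):
        hosts = [host for _, host in group]
        rng.shuffle(hosts)  # only matters when several hosts share a preference
        ordered.extend(hosts)
    return ordered


# The two MX records from the test zone above.
records = [
    (10, "goodmx.freefriends.org."),
    (1, "nonesuch.freefriends.org."),
]
print(mx_try_order(records))
```

With the test zone, nonesuch (preference 1) should always be attempted before goodmx (preference 10), so every delivery attempt pays for a failed connection to nonesuch before goodmx can be reached.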

In the real cases, the primary MX is a regular host, reachable but intentionally not answering on port 25. Apparently some sysadmins do this to stop some spammers who never try the second MX. (I hesitate to publish the names of the domains doing this, but could provide them privately.) I get the same results with my test setup as with the real cases -- sometimes my sendmail gives up on the bad primary and correctly fails over to the secondary, but more often not.
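Since a host that silently drops packets and a host that actively refuses the connection fail in different ways, it may help to check which one a given MX does. The following is a rough sketch using only the Python standard library; `probe_smtp` is a hypothetical helper, and the 30-second default simply mirrors the Timeout.connect value listed at the end of this post.

```python
import socket


def probe_smtp(host, port=25, timeout=30.0, connect=socket.create_connection):
    """Classify how a TCP connection attempt to host:port ends.

    Returns "open", "timeout", "refused", or "error: ...".
    The connect parameter exists only so the function can be exercised
    without touching the network.
    """
    try:
        sock = connect((host, port), timeout=timeout)
    except socket.timeout:
        # No answer at all within the timeout (e.g. packets dropped).
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on the port.
        return "refused"
    except OSError as exc:
        # Anything else: no route, name resolution failure, and so on.
        return f"error: {exc}"
    sock.close()
    return "open"


# Example call (performs a real connection attempt, so it is left commented out):
# print(probe_smtp("nonesuch.freefriends.org"))
```

A "timeout" result means each attempt against that MX costs the full connect timeout, whereas "refused" fails immediately.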

I'm using the sendmail 8.14.7 binary that is distributed with CentOS 7, on x86_64. I've customized sendmail.cf in various ways, but nothing that seems remotely relevant except possibly the timeout values, which I'll append below.

I'm sending my test mail to, e.g., karltest@mytest.freefriends.org. The /var/log/maillog entry just shows nonesuch being tried repeatedly until the 5 days are up and it bounces:

Mar 15 18:26:45 tug sendmail[26132]: 22FHPiET026128: to=<karl@mytest.freefriends.org>, delay=00:01:00, xdelay=00:01:00, mailer=esmtp, pri=293911, relay=nonesuch.freefriends.org. [10.10.10.10], dsn=4.0.0, stat=Deferred: Connection timed out with nonesuch.freefriends.org.
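To see at a glance which relay each attempt used, maillog lines like the one above can be split into their key=value fields. This is only a sketch, assuming the delivery-line layout shown here ("... sendmail[pid]: queueid: key=value, key=value, ..."); the helper name `parse_maillog_fields` is made up for illustration.

```python
import re
from collections import Counter


def parse_maillog_fields(line):
    """Split one sendmail delivery log line into a dict of its fields."""
    # Drop the syslog prefix ("Mar 15 ... sendmail[pid]: ").
    _, _, rest = line.partition(": ")
    # Separate the queue id from the field list.
    queue_id, _, rest = rest.partition(": ")
    fields = {"qid": queue_id}
    # Split only on ", " that is followed by another "key=".
    for item in re.split(r", (?=\w+=)", rest.strip()):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


line = (
    "Mar 15 18:26:45 tug sendmail[26132]: 22FHPiET026128: "
    "to=<karl@mytest.freefriends.org>, delay=00:01:00, xdelay=00:01:00, "
    "mailer=esmtp, pri=293911, relay=nonesuch.freefriends.org. [10.10.10.10], "
    "dsn=4.0.0, stat=Deferred: Connection timed out with nonesuch.freefriends.org."
)
fields = parse_maillog_fields(line)
print(fields["relay"], "|", fields["xdelay"], "|", fields["stat"])

# Tally relays over many lines to spot when (if ever) the secondary is used.
relay_counts = Counter(parse_maillog_fields(l)["relay"] for l in [line])
print(relay_counts)
```

Running the tally over all lines for one queue id would show whether goodmx ever appears as the relay.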

I'm trying to discern what's really happening with:

rm /tmp/f; sendmail -D/tmp/f -d0-99.99 -qR@mytest.freefriends.org

but the voluminous /tmp/f debug output just shows the bad nonesuch MX being tried over and over, although goodmx is found. Here's a little excerpt showing the final attempt on a given queue run:

hostsignature(mytest.freefriends.org.) = nonesuch.freefriends.org.:goodmx.freefriends.org.
...
dropenvelope 0x55db2c276ba0: id=<null>, flags=4405046<INQUEUE,NO_BODY_RETN,DELETE_BCC,GLOBALERRS,METOO,IS_MIME,SPLIT>
sendq=0x55db2e364ab0=<karl@mytest.freefriends.org>:
        mailer 4 (esmtp), host `mytest.freefriends.org.'
        user `karl@mytest.freefriends.org', ruser `<null>'
        state=QUEUEUP, next=0x0, alias 0x0, uid 0, gid 0
        flags=80000182<QPRIMARY,QPINGONFAILURE,QPINGONDELAY,QRCPTOK>
        owner=(none), home="(none)", fullname="(none)"
        orcpt="(none)", statmta=nonesuch.freefriends.org., status=4.4.1
        finalrcpt="RFC822; karl@mytest.freefriends.org"
        rstatus="(none)"
        statdate=Tue Mar 15 18:28:59 2022


====finis: stat 75 e_id=NOQUEUE e_flags=4405046<INQUEUE,NO_BODY_RETN,DELETE_BCC,GLOBALERRS,METOO,IS_MIME,SPLIT>

I have not been able to capture a log of a successful delivery, when it falls back to the secondary. Is there any way to hook into that?

I suppose I could work around this with mailertable (or maybe bestmx) entries, but I don't know all the hosts that would need it. Besides, failing over to the secondary mx seems like a pretty fundamental operation (nowadays) not to be working.

I've searched around online, in the bat book, in the sendmail sources (e.g., domain.c), etc., but haven't yet found the handle. If anyone would like to email me about this instead of/as well as replying here, my address is karl (at) freefriends (dot) org.

Sorry for the long message. Thanks in advance for any clues.

# timeouts (many of these)
#O Timeout.initial=5m
O Timeout.connect=30s
O Timeout.aconnect=30s
O Timeout.iconnect=30s
O Timeout.helo=4m
O Timeout.mail=5m
O Timeout.rcpt=10m
O Timeout.datainit=2m
O Timeout.datablock=6m
O Timeout.datafinal=30m
O Timeout.rset=1m
O Timeout.quit=1m
O Timeout.misc=1m
O Timeout.command=5m
O Timeout.ident=0s
#O Timeout.fileopen=60s
#O Timeout.control=2m
O Timeout.queuereturn=5d
#O Timeout.queuereturn.normal=5d
#O Timeout.queuereturn.urgent=2d
#O Timeout.queuereturn.non-urgent=7d
#O Timeout.queuereturn.dsn=5d
O Timeout.queuewarn=2d
#O Timeout.queuewarn.normal=4h
#O Timeout.queuewarn.urgent=1h
#O Timeout.queuewarn.non-urgent=12h
#O Timeout.queuewarn.dsn=4h
#O Timeout.hoststatus=30m
#O Timeout.resolver.retrans=5s
#O Timeout.resolver.retrans.first=5s
#O Timeout.resolver.retrans.normal=5s
#O Timeout.resolver.retry=4
#O Timeout.resolver.retry.first=4
#O Timeout.resolver.retry.normal=4
O Timeout.lhlo=1m
#O Timeout.auth=10m
O Timeout.starttls=2m


sendmail

mx-record

centos7
