Score:1

What exactly can I conclude from "High scoring spam message has been dropped (in reply to end of DATA command)"?

I operate a web site that offers user registrations. Users get automatic registration success e-mails, sent through a professional e-mail hosting company (not directly from our own server). This has worked well for months.

Today for the first time a registration success e-mail has bounced with the error message:

host ghost.mxroute.com[49.12.120.198] said: 550 High scoring spam message has been dropped (in reply to end of DATA command)

I know in general that this means that the recipient's e-mail server classified our registration e-mail as spam, and also the general tips on how to avoid this, like setting up DMARC, avoiding "spammy sounding" wording and stretching the sending of e-mails over time. In fact, our registration confirmation e-mail scores 10/10 on https://www.mail-tester.com/.

Now I am trying to figure out why exactly my e-mail was classified as spam, and I am unsure what exactly the error message tells me. Specifically, I have these two questions:

  1. Does "high scoring spam message" specifically mean that the content of the e-mail was classified as spam, or could this just as well point to any of the other possible reasons (like the sending server's IP address being blacklisted etc.)?
  2. What does "in reply to end of DATA command" mean? Specifically, what is the "DATA command"?
Dom
You should ask the mail hosting company. I think the mail you are trying to send is considered spam because of multiple criteria: the content, the headers, everything is checked by the anti-spam filter. DATA is a command in the SMTP protocol. The rejection happens after the server has received the DATA from you and before it allows the mail to continue on its way.
anx
*Score* often also implies there will be a gold mine of diagnostics in the headers of a (successfully) received mail. If you can get your hands on headers of a mail received by that provider (disposition notification, forwarded as attachment, ..), you will find very descriptively named clues.
Score:2

When your machine wants to submit an email to the receiving MX server, this is a process of several steps.

  1. Connecting. Theoretically, a receiving MX server may even refuse a connection from your sending server, for example, if your sending server's IP is on a blacklist. If you pass this and are allowed a connection, the next step is

  2. Handshaking. Your machine is supposed to send a HELO / EHLO, receive a list of capabilities of the receiving server and act accordingly. It may be that the receiving MX server does not like something it sees and terminates the connection. After you have passed this, your machine will send the so-called envelope information, which consists mainly of two commands:

  • MAIL FROM: the sender's email address
  • RCPT TO: the receiver's email address(es), one command per recipient; CC and BCC recipients are sent the same way, since the envelope does not distinguish them

This is where many connections will get terminated by design, for example, if you try to submit an email to the MX server with a recipient the receiving server does not handle or relay for. This usually results in some kind of "relay not permitted" error.

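If your connection has not been terminated up to here, your machine will send the DATA command and submit the actual message: the headers followed by the body. The "end of DATA" is the line containing a single dot with which your machine finishes the transmission, and the receiving server only gives its verdict on the message in reply to that.

This is where it breaks in your example.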

So in other words: The receiving MX does not like something in the body content of the mail being sent.
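To see which stage a bounce refers to, the error line can be taken apart mechanically. This is only a rough sketch: it assumes the bounce follows the "host NAME[IP] said: CODE TEXT (in reply to STAGE command)" layout of the line quoted in the question (which looks like the format Postfix produces), and `parse_bounce` is a made-up helper name, not part of any mail library.

```python
import re

# Assumed layout, taken from the bounce quoted in the question:
#   host NAME[IP] said: CODE TEXT (in reply to STAGE command)
BOUNCE_RE = re.compile(
    r"host (?P<host>\S+)\[(?P<ip>[^\]]+)\] said: "
    r"(?P<code>\d{3}) (?P<text>.*) "
    r"\(in reply to (?P<stage>.+?) command\)"
)


def parse_bounce(line):
    """Hypothetical helper: split a bounce line into its parts, or return None."""
    match = BOUNCE_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["code"] = int(info["code"])
    # SMTP reply codes 5xx are permanent failures, 4xx are temporary ones.
    info["permanent"] = 500 <= info["code"] <= 599
    return info


line = (
    "host ghost.mxroute.com[49.12.120.198] said: 550 High scoring spam "
    "message has been dropped (in reply to end of DATA command)"
)
info = parse_bounce(line)
print(info["stage"], info["code"], info["permanent"])
```

For the bounce in the question the stage comes out as "end of DATA", so the rejection was issued only after the whole message had been transmitted, not at the connection, MAIL FROM or RCPT TO stage.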

anx
In my experience the last part describes only what a recipient *optimized to reject as early as possible* would do - in reality many will let you transmit the whole body before rejecting it for reasons they *could* have known before the body was sent. And why let the sender wait while querying some reputation database, if we can still defer or reject messages for any reason (or no reason at all) in reply to end of DATA (EOD)?
Score:1
anx

"High scoring" is the key phrase telling you that score-based spam filtering software is used. It does not necessarily tell you that any of the reasons used to calculate the score is to be found in your body, it just means that there are multiple reasons (or, less likely, a single one the recipient does not want to tell you about).

How could multiple reasons together lead to a message being rejected?

If your top level domain, your mail provider, and certain keywords in your mail all have never been associated with non-spam messages, the recipient might add a score of 2 points for each, and then decide a score of 6 is "high" enough for instant rejection. That is how widespread spam filtering usually works: adding up (possibly automatically) fine-tuned values for certain indicators of spam to a compound score, and then deciding which score is enough to justify unattended action.
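A minimal sketch of that arithmetic, using the made-up numbers from the example (2 points per indicator, rejection at 6). The indicator names, weights and threshold are purely illustrative and not taken from any real filter; real systems use many more rules with tuned weights.

```python
# Hypothetical indicators and weights, mirroring the example: 2 points each.
WEIGHTS = {
    "tld_never_seen_in_wanted_mail": 2.0,
    "provider_never_seen_in_wanted_mail": 2.0,
    "keywords_never_seen_in_wanted_mail": 2.0,
}

# Hypothetical cut-off above which the message is rejected without human review.
REJECT_THRESHOLD = 6.0


def spam_score(indicators):
    """Add up the weights of all indicators that fired for one message."""
    return sum(WEIGHTS[name] for name in indicators if name in WEIGHTS)


def verdict(indicators):
    """Return (score, SMTP reply) for the given set of fired indicators."""
    score = spam_score(indicators)
    if score >= REJECT_THRESHOLD:
        return score, "550 High scoring spam message has been dropped"
    return score, "250 OK"


all_three = list(WEIGHTS)
only_two = all_three[:2]
print(verdict(all_three))
print(verdict(only_two))
```

None of the three indicators is enough on its own, and any two of them still pass; only the combination crosses the threshold.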

Whether the score is calculated so that higher values mean less likely unwanted messages, or the other way around, is merely an implementation detail. AFAIK, the way the test service you mentioned calculates it, assigning a high score to the mail with the fewest indicators of unauthorized or junk mail, is the less common one.

How could the rejection come after the DATA command, if none of the mail content was even used?

That can just be the way the admin set up the system. If the score can include adjustments based on message content, then it may not make sense to judge the headers first, and then the complete mail again after DATA has been received. It is often simpler to scan the mail once, after all data is available, without optimizing for the special case where indicators in the headers could not possibly be made up for by body content.
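As a rough illustration of such a setup, here is a sketch of a receiver that merely takes notes during the earlier SMTP stages and answers only at end of DATA. The class name, the reasons and all point values are invented for this example and do not describe any particular product.

```python
class DeferredScoring:
    """Collects score adjustments per SMTP stage, decides only at end of DATA."""

    def __init__(self, threshold=6.0):
        self.threshold = threshold
        self.adjustments = []  # list of (stage, reason, points)

    def note(self, stage, reason, points):
        # Nothing is rejected here; the finding is only remembered.
        self.adjustments.append((stage, reason, points))

    def end_of_data(self):
        # The single place where a verdict is issued.
        total = sum(points for _, _, points in self.adjustments)
        if total >= self.threshold:
            return "550 High scoring spam message has been dropped"
        return "250 OK"


session = DeferredScoring()
session.note("connect", "sending IP listed on a blocklist", 4.0)
session.note("MAIL FROM", "sender domain has no history of wanted mail", 3.0)
reply_without_body_evidence = session.end_of_data()

# Body content may also lower the score, which is why scanning once at the end is simpler.
session.note("DATA", "body closely matches known wanted mail", -2.0)
reply_with_body_credit = session.end_of_data()

print(reply_without_body_evidence)
print(reply_with_body_credit)
```

In the first case the 550 arrives in reply to end of DATA although every contributing reason was known before the body was sent; in the second, the body pulls the total back under the threshold.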

How could a single reason be enough, but the message deliberately not saying so?

If your mail contains a header like X-Sending-Software: WordPress addon xyz v0.2 and I know that this means the mail was sent either from a server that is not maintained, or likely from a server that was abused to send mail because of a specific, known vulnerable version of a web application, then letting the spammer know how I detected him only serves to tell the spammer how to avoid my crude but effective detection. In that case, I will mimic the message of my scoring system, even though no multi-component score was even used.

(The other answer explains the meaning of the DATA command)
