Score:0

E-mail greylisting: why include the IP address?

ae flag

Everything I could find on the internet about greylisting says that the following triplet is used to uniquely identify an incoming e-mail:

  1. Source IP
  2. Source e-mail address
  3. Destination e-mail address
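For reference, the usual triplet check can be sketched roughly as follows. This is a minimal in-memory illustration, not how any particular greylisting product implements it; the class name `Greylist`, the 300-second delay and the injectable `clock` are assumptions made for the example.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (source IP, sender, recipient)."""

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay_seconds = delay_seconds  # assumed minimum retry delay
        self.clock = clock                  # injectable so it can be tested
        self.first_seen = {}                # triplet -> time of first attempt

    def check(self, source_ip, sender, recipient):
        """Return 'defer' (temporary 4xx reply) or 'accept'."""
        key = (source_ip, sender, recipient)
        now = self.clock()
        # Remember the first time this exact triplet was seen.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay_seconds:
            return "accept"
        return "defer"
```

A retry from the same IP after the delay is accepted, while a retry from a different IP starts a new entry and is deferred again, which is exactly the problem described next.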

Now the source IP can cause problems, because large mail services use multiple IP addresses (possibly from completely different IP ranges) to re-send temporarily rejected e-mails.

Question

Why is the source IP address relevant at all? Why not just use the source and destination e-mail addresses as the key to identify a given e-mail (sender-receiver link)?

Why not use the subject instead, to identify a specific e-mail more precisely?

Reasoning

Even after thinking quite a bit about what problems could arise from simply ignoring the source IP, I couldn't find a case where the source IP would be relevant.

  • When the same IP sends twice from the same source to the same destination e-mail address (and waits the required time), the e-mail is delivered
  • When two different IPs send from the same source to the same destination e-mail address, the e-mail should also be delivered (e.g. large mail services)
  • Some greylisting solutions allow a subnet mask for the source IP. But this is very imprecise and does not cover all situations - especially not ultra-large mail services whose MTAs sit in completely different subnets.
  • What about a legitimate sender who, for the first time, sends 2 different e-mails to the same destination e-mail address within the "try later" period?
  • Using the triplet of source e-mail address, destination e-mail address and subject should, in theory, treat each individual e-mail more accurately - even when several come from the same sender to the same recipient.
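The subnet-mask variant mentioned in the list can be illustrated with a small sketch. The function name `subnet_key` and the default prefix lengths (/24 for IPv4, /64 for IPv6) are assumptions chosen for the example, not the defaults of any specific filter.

```python
import ipaddress


def subnet_key(source_ip, sender, recipient, v4_prefix=24, v6_prefix=64):
    """Build a greylist key that uses the sender's network instead of its exact IP."""
    addr = ipaddress.ip_address(source_ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False masks off the host bits, e.g. 192.0.2.17/24 -> 192.0.2.0/24
    network = ipaddress.ip_network(f"{source_ip}/{prefix}", strict=False)
    return (str(network), sender, recipient)
```

Two retries from 192.0.2.17 and 192.0.2.200 end up with the same key, but a retry from 198.51.100.5 does not, so the mask only helps when the sending MTAs share a subnet.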

But my main question is: why include the source IP in the triplet at all? (The chance that 2 different external entities will send with the same source e-mail address to the same destination e-mail address seems extremely unlikely to me.)

fr flag
anx
Why do you think a "large service" provider has any better excuse to deliberately sneak around source-IP-based spam fighting techniques? If they (more than incidentally) dance through their IP space, I would argue they are trying to mitigate, in the entirely wrong place, the effect of having a bad sending history on some of their addresses.
ae flag
I fully agree. But real life shows that they are doing it. Otherwise greylisting filters would just work with the single source IP; instead, many offer an adjustable subnet mask, which allows a certain adjustment for such cases.
Score:1
vn flag

One of the things that is done with most greylisting techniques is checking if the specified source address is authorized to send from that IP, which can be done with SPF records. If the source IP is listed in the source domain's SPF record as being a valid sender for that domain, many (I'd like to say most, but my experience is limited) greylist filters will auto-whitelist the email. The source IP thus is not necessarily part of the controlling triplet, but is important to know why something was greylisted, or not, and so should be retained.
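As a rough illustration of that auto-whitelist check, here is a heavily simplified sketch. It only evaluates literal `ip4:` and `ip6:` mechanisms of an SPF record that is passed in as a string; real SPF evaluation also involves DNS lookups and mechanisms such as `include`, `a`, `mx` and `redirect`, so a real filter would use a proper SPF implementation. The function name `ip_in_spf_record` is made up for the example.

```python
import ipaddress


def ip_in_spf_record(source_ip, spf_record):
    """Return True if source_ip matches a literal ip4:/ip6: term of the record.

    Simplification: qualifiers and all other SPF mechanisms are ignored.
    """
    addr = ipaddress.ip_address(source_ip)
    for term in spf_record.split():
        if term.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            # Membership is False when the address families differ.
            if addr in network:
                return True
    return False


record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.7 -all"
skip_greylisting = ip_in_spf_record("192.0.2.55", record)
```

If the check succeeds, the filter can skip greylisting for that message; otherwise it falls back to the normal triplet check.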

ae flag
Hmm - thanks for answering at all - but this does not answer my question about the use of the source IP within the triplet (the triplet then gets hashed). It answers a different question: how does SPF work?
vn flag
Okay, I'll add a second answer that is more direct.
Score:1
vn flag

So to more directly address "why it's in the triplet" we have to look at the most common ways that systems with multiple outbound SMTP addresses use their mail servers. A hosting group like GoDaddy will usually have a number of hosts feeding a single server, and that server will have a queue of outgoing messages. While there will be multiple servers, on multiple IP addresses, each server will have its own hosts and its own message queue. If a message is refused for greylisting, it is still queued in the same mailserver at the same IP, and so will be tied to the same IP the next time it's tried.

GoDaddy may not be the best example there, because in fact they seem to have a round-robin queue that selects among three or more Internet-facing servers for any message coming out of an intermediate server. However, although I can't be certain, what I've seen of their output emails suggests that a temporary error will not result in that message being pushed back to the intermediate server; it's the Internet-facing server that has received the temporary error, and is managing the holdback timing. So the message is still tied to the same IP address, because it's been left in charge of a specific server.

The specific case that greylisting was initially designed to trap, the infected home machine, will of course simply not try again after the message send has been attempted, and may in fact be already disconnected and off to the next server when the error message comes back - if the spammer doesn't see that the message was refused, he can probably still charge for having sent it, and those $0.000001 per message can add up.

And the subject is not really a valid key for greylisting, partly because there can be a large number of different, valid, messages with the same subject, and partly because Joe Spammer will be sending out a crap-ton of messages with the same sender and subject, from different IP addresses, in hopes that some get through.
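To make the last point concrete, here is a small simulation of a spam run that uses one forged sender and one recipient but a different IP for each attempt. It is only a sketch under made-up assumptions: the function `simulate`, the 300-second delay, the addresses and the timestamps are all invented for illustration.

```python
def simulate(attempts, use_ip, delay=300):
    """Replay (time, ip, sender, recipient) attempts; return the accepted ones."""
    first_seen = {}
    accepted = []
    for t, ip, sender, recipient in attempts:
        key = (ip, sender, recipient) if use_ip else (sender, recipient)
        first = first_seen.setdefault(key, t)  # time this key was first seen
        if t - first >= delay:
            accepted.append((t, ip))
    return accepted


# Three infected machines, each trying exactly once and never retrying.
botnet = [
    (0, "203.0.113.10", "promo@spam.example", "victim@example.net"),
    (400, "203.0.113.77", "promo@spam.example", "victim@example.net"),
    (800, "198.51.100.23", "promo@spam.example", "victim@example.net"),
]

with_ip = simulate(botnet, use_ip=True)      # nothing gets through
without_ip = simulate(botnet, use_ip=False)  # 2nd and 3rd attempts get through
```

With the IP in the key, every attempt looks like a first attempt and is deferred. Without it, the second machine's single attempt counts as a "retry" of the first one and is accepted, even though no sending host ever retried.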
