Score:14

Block 1.4 million IP addresses on VPS

be flag

How can I block a list of about 1.4 million IP addresses? I've already tried to do it with iptables PREROUTING, like:

-A PREROUTING -d IP_HERE/32 -j DROP

But with this many records, my bandwidth goes down like crazy when I do a speedtest.

Without blocked IPs in iptables:

1 Gb/s

With blocked IPs in iptables:

3 Mb/s at peak.

I want to use XDP_DROP like here (last step): https://blog.cloudflare.com/how-to-drop-10-million-packets/

But I have no idea how to use this. :/ (I'm really bad at programming)

Are there alternatives to this approach?

pk flag
Can we ask why you want to block 1.4 million IPs? That's a lot of IPs. Might be easier to make sure your server is secure instead.
peterh avatar
vn flag
There is a new thing named **ipset**. I do not know it, but it might be worth a try. It is the new firewall framework in Linux; actually, iptables today is only a compat layer over ipset.
aq flag
If you are trying to block IPs based on location/country, please say so, there are solutions to this that don't involve millions of iptable entries.
pk flag
also please don't block IPs based on location/country without a very good reason. Not just "oh there are hackers in that country"
ilkkachu avatar
co flag
@peterh, do you mean nftables? I think ipset has existed for a while, and AFAIK it's only about rules that involve, well, a set of addresses
TooTea avatar
cn flag
@peterh ipsets are included in the kernel since [2.6.39](https://kernelnewbies.org/Linux_2_6_39#IPset), released ten years ago. They already existed before that as an external patch.
user26742873 avatar
cn flag
@user253751 Maybe the op is blocking the whole EU for the cookies law? :P
peterh avatar
vn flag
@TooTea Ok, thanks. What is interesting to me is how ipset matches an IP against a set (which is likely a set of IPs with masks). Does it use a tree or hash internally? If yes, it will be very fast. With iptables, the only way to match an IP to a set of rules is linear search because Turing.
jcaron avatar
me flag
Are the IP addresses really individual IP addresses, or are they part of a limited number of ranges?
be flag
IP's are individual and most are proxies
mckenzm avatar
in flag
Of course it should be a hashed index (otherwise it will be pretty unbalanced). "Lightning speed" according to the reference in @Cyrbil's answer.
Score:32
in flag

You should have a look into ipset.

From the official website:

Ipset may be the proper tool for you [...] to store multiple IP addresses or port numbers and match against the collection by iptables.

[...] (Ipset) may store IP addresses, networks, (TCP/UDP) port numbers, MAC addresses, interface names or combinations of them in a way, which ensures lightning speed when matching an entry against a set.

To use it, you need to create an ipset, add the IPs and create an iptables rule to match with the ipset:

# maxelem defaults to 65536, so raise it to fit the whole list
ipset create blacklist hash:ip hashsize 1400000 maxelem 2000000
ipset add blacklist <IP-ADDRESS>
iptables -I INPUT -m set --match-set blacklist src -j DROP

A real-life example of usage can be found here. Notice that it uses ipset restore instead of going through each IP in a loop because it's much faster.
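The `ipset restore` approach can be sketched roughly like this. It is a minimal sketch, not the linked example: it assumes the list is a plain text file with one IPv4 address per line, and the helper name `gen_restore`, the file names, and the `maxelem` value are illustrative assumptions.

```shell
#!/bin/sh
# gen_restore SETNAME: read one IPv4 address per line on stdin and print
# commands in the format that `ipset restore` reads.
# Hypothetical helper; the name and the sizing value are assumptions.
gen_restore() {
    set_name=$1
    # maxelem defaults to 65536, so it is raised here to fit ~1.4 million entries.
    printf 'create %s hash:ip family inet maxelem 2000000\n' "$set_name"
    # Skip blank lines and '#' comments; emit one "add" line per address.
    awk -v name="$set_name" 'NF && $1 !~ /^#/ { print "add " name " " $1 }'
}
```

Typical use would be `gen_restore blacklist < blocklist.txt > blacklist.restore`, followed by `ipset restore < blacklist.restore` as root. If the set already exists, drop the `create` line or destroy the set first.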

If your list of IPs has overlaps, you may want to preprocess it to convert to IP ranges where possible. Here is an example of a tool to do it. It won't get you better performance with ipset, but it will reduce the size of your list.
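As a rough way to see whether range conversion would pay off, the sketch below reports every /24 network that is completely covered by the list. It is a simplified stand-in for a real aggregation tool, not the tool linked above: it only detects full /24 blocks, and the helper name `full_24s` is hypothetical.

```shell
#!/bin/sh
# full_24s: read IPv4 addresses (one per line) on stdin and print every /24
# network for which all 256 addresses are present in the list.
# Hypothetical helper name; only complete /24 blocks are detected.
full_24s() {
    # Deduplicate first so repeated addresses are not counted twice.
    sort -u |
    awk -F. 'NF == 4 { count[$1 "." $2 "." $3 ".0/24"]++ }
             END { for (net in count) if (count[net] == 256) print net }' |
    sort
}
```

If `full_24s < blocklist.txt` prints little or nothing, the addresses are mostly scattered and aggregation is unlikely to shrink the list much.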


On a side note, in terms of performance, ipset is very fast and scales without penalty. As Cloudflare's blog mentions, there are faster low-level approaches, but they are much more complex and only add a few bytes per second, which, unless you have the scale and ambition of a cloud provider, is not worth the effort.

iBug avatar
tr flag
Processing single IPs into ranges is definitely a must. Then you can use `hash:net` for the set and get even better performance.
Score:20
za flag

Frame challenge - what's the shorter list, authorised or blocked addresses?

Rather than denying 1.4 million, simply allow the perhaps ~dozen IPs you want to permit, and default-deny everything.

fr flag
This sounds more like he wants to block a predefined set of "bad IPs". For most applications, a whitelist system will probably not be useful.
Score:15
in flag

If the IP addresses operate in a well-defined range, then you can use ufw like this to block traffic:

sudo ufw deny from 192.0.0.0/8 to any

The example above blocks all traffic from 192.0.0.1 to 192.255.255.254, which works out to 16,777,214 addresses and this has zero (noticeable) effect on network throughput.

As long as your IP list can be condensed into IP ranges, this may work for you.

mx flag
Let us all assume that OP wants to block addresses that are not in a range.
iBug avatar
tr flag
UFW is, as it describes itself, a frontend for iptables. This makes its performance even worse than manually maintaining iptables chains.
in flag
Why _worse_? It's not like the iptables rules are calling out to ufw, it's just a frontend for configuring them in the first place. Obviously it won't be _better_ either, though.
iBug avatar
tr flag
@Useless UFW creates more chains for every packet to traverse, whereas manually maintained rules can be *much* simpler and thus more performant.
in flag
So say "ufw generates overly-simple rules and you can hand-craft better ones" or something. UFW's own performance isn't an issue, and the fact that it's a frontend doesn't automatically make its rules bad. It's no worse than hand-written rules that _don't_ make clever use of chains.
Nate T avatar
it flag
@useless since it is a frontend, every time a request shows up, it is processed by `ufw`, which then calls on `iptables` behind the scenes. Once `iptables` has matched the IP against the listed rules, it has to pass this info back to `ufw`, which would allow or deny. This is the typical fe / be flow. Cutting out `ufw` eliminates half the steps. That said, performance inc / dec would depend not on how many IPs are being blocked, but on how many requests are actually coming in.
in flag
@NateT I already pointed out that that's absolutely not true. `ufw` just provides a simple frontend for `iptables`, which itself just configures rules in the `ip_tables` netfilter module. Packet filtering activity never flows from the `ip_tables` kernel module out to the `iptables` userspace component, much less the `ufw` frontend for that.
Nate T avatar
it flag
Then it is NOT A FRONTEND. App 1 updating the app 2 data store does not make app 1 a "front end" for app 2. If both apps are not being called on in the order I described, you can call it whatever else you want (call it "Nancy in her red dress" for all I care), but calling it a front end in a discussion about speed-of-algorithm is never a good idea. Users of this network are always catching flak for being too picky about terms, but ^this^ is what a slightly misused term can do. @useless
Nate T avatar
it flag
If I am misunderstanding, please describe the flow of events. If both apps are being used, they are both taking up memory and temporal resources, even if they are only printing hello world to the console. You know what? I'll look it up so you don't have to type it. I'm curious now anyway. The only exception I can think of is the case where iptables is not called at all and only its data is used / updated. In that case, ^^^
Score:13
jp flag

You can minimize look-ups to gain more speed by tree-structuring your rules. You can for example do it based on the first part of the IP i.e. /8 like so:

iptables -N rule8_192_0_0_0
iptables -N rule8_172_0_0_0
iptables -N rule8_10_0_0_0

iptables -A INPUT -s 192.0.0.0/8 -j rule8_192_0_0_0
iptables -A INPUT -s 172.0.0.0/8 -j rule8_172_0_0_0
iptables -A INPUT -s 10.0.0.0/8 -j rule8_10_0_0_0

iptables -A rule8_192_0_0_0 -s 192.168.2.3 -j DROP
iptables -A rule8_172_0_0_0 -s 172.16.2.3 -j DROP
iptables -A rule8_10_0_0_0 -s 10.10.2.3 -j DROP
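For a long list, writing these chains by hand is impractical, so the commands could be generated. The following is a minimal sketch that assumes one IPv4 address per line on stdin; the helper name `gen_tree_rules` is hypothetical, and the commands are only printed so they can be reviewed before being run.

```shell
#!/bin/sh
# gen_tree_rules: read IPv4 addresses (one per line) on stdin and print
# iptables commands that group them into one chain per /8.
# The commands are only printed, not executed. Hypothetical helper name.
gen_tree_rules() {
    awk -F. 'NF == 4 {
        chain = "rule8_" $1 "_0_0_0"
        if (!(chain in seen)) {
            seen[chain] = 1
            # Create the per-/8 chain and jump to it from INPUT.
            print "iptables -N " chain
            print "iptables -A INPUT -s " $1 ".0.0.0/8 -j " chain
        }
        print "iptables -A " chain " -s " $0 " -j DROP"
    }'
}
```

Each per-/8 chain is still searched linearly, so with over a million addresses a single level of splitting probably leaves thousands of rules per chain; a second level (for example per /16) would narrow it further.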
Score:4
tr flag

There's another improvement that directly solves your 3 Mb/s problem:

iptables -I INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

This allows established connections to traverse as few iptables rules as possible, although using ipset to improve the IP address lookup speed is still necessary for new connections to establish faster.

No matter how many other rules you have, this is a good one to deploy as the first rule.

Score:1
fr flag

XDP_DROP is probably overkill unless you plan on running these blocklists at extremely high packet rates (think more than 1 Mpps). As such, I would recommend Cyrbil's answer if you aren't that experienced with code.

If you nevertheless want to try XDP, you are looking for something called a Bloom filter, which can quickly check whether an IP is "possibly in set" or "definitely not in set".

An example of a Bloom filter in C: This blog post

Score:1
us flag

This doesn't use iptables but the kernel IP routing table; it may be worth trying it and checking its performance:

ip route add blackhole IPv4/32

IIRC it's supposed to be faster than filtering with iptables, but I've never done a benchmark with 1.4 million IPs :)
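Adding over a million routes one command at a time would be slow, so the routes could be loaded with `ip -batch` instead. This is a minimal sketch assuming one IPv4 address per line on stdin; the helper name `gen_blackhole_batch` and the file names are hypothetical.

```shell
#!/bin/sh
# gen_blackhole_batch: read IPv4 addresses (one per line) on stdin and print
# one "route add blackhole" line per address, in the format `ip -batch` reads
# (the same arguments as the ip command, without the leading "ip").
# Hypothetical helper name.
gen_blackhole_batch() {
    # Skip blank lines and '#' comments.
    awk 'NF && $1 !~ /^#/ { print "route add blackhole " $1 "/32" }'
}
```

Typical use would be `gen_blackhole_batch < blocklist.txt > routes.batch`, then `ip -batch routes.batch` as root. Note that a blackhole route matches on destination, so it discards packets the server would send to those addresses rather than filtering the incoming packets themselves.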
