We lease IPv4 address space (subnets) from a couple of ISPs. The subnets are routed to our server farm through GRE tunnels, and the server farm services the inbound TCP connections. We do not control the GRE termination setup at the ISP end, and we are not running a load balancer for the proprietary services.
The problem is that there is too much traffic trying to come over the GRE tunnels. Packet loss rates are often 1%, 2%, or much higher, depending on the tunnel. We have tried the iptables hashlimit module, putting new connection requests (SYN packets) into buckets by source IP address and setting arbitrary per-time-interval limits on each tunnel's buckets [*]. This has reduced packet loss a little on some tunnels, but a loss rate of 1% is still poor by the standards of modern data center networks.
My guess is that the ideal solution is to do the rate limiting at the far end of the GRE tunnel, because by the time the packet overload reaches our end of the tunnel, it's already clogging the network. However, we don't control the setup at the far end of the tunnel.
Because of the nature of the services, it is better to handle fewer accepted TCP connections efficiently than to handle more connections inefficiently. We'd rather drop connections that can't be handled quickly and make the requester come back.
What is the best approach I can take to decrease the packet loss rate and improve functional performance on accepted connections? Is there some way I can send a congestion control signal back through the GRE tunnel?
[*] Our iptables rate limiting setup looks like this:
# Create empty rate limit rule chain
iptables -t filter -N GRE2_RATE_LIMIT
# Insert rule to send new-connection packets (conntrack state NEW; for TCP, normally the initial SYN) to this rule chain
iptables -t filter -I FORWARD -i gre2 -m conntrack --ctstate NEW -j GRE2_RATE_LIMIT
# Actually apply the limit
# Accept new-connection packets up to a certain rate per source IP (rate/time and rate are placeholders)
iptables -A GRE2_RATE_LIMIT -m hashlimit --hashlimit-mode srcip --hashlimit-upto rate/time --hashlimit-burst rate --hashlimit-name gre2_rate_limit -j ACCEPT
# Reject all other new TCP connections with a TCP reset
iptables -A GRE2_RATE_LIMIT -p tcp -j REJECT --reject-with tcp-reset
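Since the limits are set per tunnel, the same four rules get repeated for each interface. Here is a minimal sketch of a helper that prints those rules for a given tunnel so they can be reviewed before being applied. It only echoes the commands and does not run iptables. The function name print_gre_rate_limit, the gre1 interface, and the 20/second and 40 values are made up for illustration and are not our real settings.

```shell
#!/bin/sh
# Print (do not execute) the rate-limit rules for one GRE tunnel.
# Usage: print_gre_rate_limit IFACE RATE BURST
#   IFACE  tunnel interface name, e.g. gre2
#   RATE   value for --hashlimit-upto, e.g. 20/second (illustrative only)
#   BURST  value for --hashlimit-burst, e.g. 40 (illustrative only)
print_gre_rate_limit() {
    iface=$1
    rate=$2
    burst=$3
    # Derive chain and bucket names from the interface, following the gre2 naming above.
    chain="$(printf '%s' "$iface" | tr '[:lower:]' '[:upper:]')_RATE_LIMIT"
    name="${iface}_rate_limit"

    # Create the per-tunnel chain.
    echo "iptables -t filter -N $chain"
    # Send new-connection packets arriving on the tunnel to that chain.
    echo "iptables -t filter -I FORWARD -i $iface -m conntrack --ctstate NEW -j $chain"
    # Accept up to the given rate per source IP.
    echo "iptables -t filter -A $chain -m hashlimit --hashlimit-mode srcip --hashlimit-upto $rate --hashlimit-burst $burst --hashlimit-name $name -j ACCEPT"
    # Reset any other new TCP connection.
    echo "iptables -t filter -A $chain -p tcp -j REJECT --reject-with tcp-reset"
}

# Example: print the rules for two tunnels (gre1 is a hypothetical second tunnel).
for tunnel in gre1 gre2; do
    print_gre_rate_limit "$tunnel" 20/second 40
done
```

Piping the output to a file gives a script that can be diffed against what is currently loaded before anything is changed on the router.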