How to improve network performance of a congested GRE tunnel?

We lease IPv4 address space (subnets) from a couple of ISPs. The subnets are routed to our server farm through GRE tunnels, and our server farm services the inbound TCP connections. We do not control the GRE termination setup at the ISP end, and we are not running a load balancer for the proprietary services.

The problem is that there is too much traffic trying to come over the GRE tunnels. Packet loss rates are often 1%, 2% or much higher, depending on the GRE tunnel. We have tried using the iptables hashlimit module, putting SYN packets (new connection requests) into buckets by source IP address and setting arbitrary per-time-interval limits on each tunnel's buckets (the setup is shown at the end of this question). This has reduced packet loss rates a little on some tunnels, but a packet loss rate of 1% is still pretty bad for a modern data center network.

My guess is that the ideal solution is to do the rate limiting at the far end of the GRE tunnel, because I think that by the time the packet overload reaches our end of the tunnel, it's already congesting the network. However, we don't control the setup at the far end of the tunnel.

Because of the nature of the services, it is better to handle accepted TCP connections efficiently than to handle more connections inefficiently. We'd rather drop connections that can't be handled speedily and make the requester come back.

What is the best approach I can take to decrease packet loss rate and improve functional performance on accepted connections? Is there some way I can send a congestion control signal back through the GRE tunnel?

Our iptables rate limiting setup looks like this:

# Create an empty chain to hold the rate limit rules
iptables -t filter -N GRE2_RATE_LIMIT
# Send new connections (SYN packets) arriving on gre2 to this chain
iptables -t filter -I FORWARD -i gre2 -m conntrack --ctstate NEW -j GRE2_RATE_LIMIT
# Apply the limit: accept new connections from each source IP up to a certain rate
# ("rate/time" and "rate" are placeholders for the actual values)
iptables -t filter -A GRE2_RATE_LIMIT -m hashlimit --hashlimit-mode srcip --hashlimit-upto rate/time --hashlimit-burst rate --hashlimit-name gre2_rate_limit -j ACCEPT
# Reject every TCP connection over the limit with a TCP reset
iptables -t filter -A GRE2_RATE_LIMIT -p tcp -j REJECT --reject-with tcp-reset
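
For readers who want to see the placeholders filled in, here is a minimal sketch of the same chain written as a helper that only prints the commands, so it can be reviewed before anything is applied. The function name print_rate_limit_rules and the values 10/second and 20 are hypothetical illustrations, not the limits used in the setup above.

```shell
#!/bin/sh
# Print (do not execute) the per-tunnel rate limit rules shown above.
# Usage: print_rate_limit_rules IFACE RATE BURST
# The helper name and the example values below are hypothetical.
print_rate_limit_rules() {
    iface=$1   # tunnel interface, e.g. gre2
    rate=$2    # hashlimit rate, e.g. 10/second (illustrative value)
    burst=$3   # hashlimit burst, e.g. 20 (illustrative value)
    # Derive the chain name from the interface: gre2 -> GRE2_RATE_LIMIT
    chain="$(printf '%s' "$iface" | tr '[:lower:]' '[:upper:]')_RATE_LIMIT"

    echo "iptables -t filter -N $chain"
    echo "iptables -t filter -I FORWARD -i $iface -m conntrack --ctstate NEW -j $chain"
    echo "iptables -t filter -A $chain -m hashlimit --hashlimit-mode srcip --hashlimit-upto $rate --hashlimit-burst $burst --hashlimit-name ${iface}_rate_limit -j ACCEPT"
    echo "iptables -t filter -A $chain -p tcp -j REJECT --reject-with tcp-reset"
}

# Example: rules for gre2 with a hypothetical limit of 10 new connections
# per second per source IP and a burst of 20.
print_rate_limit_rules gre2 10/second 20
```

Printing instead of executing keeps the sketch safe to run without root; the output can be piped to sh once the values have been checked.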
Comment by Zac67:
You can fine-tune MTU/MSS for tunneling, shape traffic, or consider alternatives to GRE, but that will only gain you a few percent. There's no alternative to bandwidth.
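
As a rough illustration of the MTU/MSS tuning mentioned in the comment, the sketch below computes the largest TCP MSS that fits through a GRE-over-IPv4 tunnel without fragmentation and prints a matching clamp rule. It assumes a 1500-byte underlying path MTU, a basic 4-byte GRE header (no key, checksum or sequence fields) and IPv4 headers without options; none of these are confirmed by the question. The helper name gre_mss is hypothetical, and gre2 is taken from the question's rules.

```shell
#!/bin/sh
# Largest TCP MSS that fits in one unfragmented GRE-over-IPv4 packet.
# Usage: gre_mss PATH_MTU
gre_mss() {
    path_mtu=$1
    outer_ip=20   # outer IPv4 header, no options (assumption)
    gre_hdr=4     # basic GRE header, no optional fields (assumption)
    inner_ip=20   # inner IPv4 header, no options
    tcp_hdr=20    # TCP header, no options
    echo $((path_mtu - outer_ip - gre_hdr - inner_ip - tcp_hdr))
}

mss=$(gre_mss 1500)
echo "tunnel MSS: $mss"   # 1500 - 20 - 4 - 20 - 20 = 1436

# Print (do not execute) a rule that clamps the MSS option on SYN and
# SYN-ACK packets forwarded out through the tunnel interface.
echo "iptables -t mangle -A FORWARD -o gre2 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $mss"
```

The -o gre2 match only helps if reply traffic actually leaves through the tunnel; if it returns by another interface, the match would need adjusting. As the comment says, this removes fragmentation overhead but does not add capacity.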