Score:4

Bonding dual 1 Gbit/s NICs to boost throughput to a single 2.5 Gbit/s port

cn flag

Linux is capable of bonding NICs together. The interesting policy for this is round-robin (balance-rr), which alternates outgoing packets across the NICs.
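
For reference, such a round-robin bond would be set up roughly like this (a sketch using iproute2; bond0, eth0 and eth1 are placeholder names):

    # create the bond with per-packet round-robin transmission
    ip link add bond0 type bond mode balance-rr miimon 100
    # enslave both 1G NICs (they must be down while being enslaved)
    ip link set eth0 down && ip link set eth0 master bond0
    ip link set eth1 down && ip link set eth1 master bond0
    ip link set bond0 up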

However, the performance benefits are usually limited to multiple clients. A single 1000BASE-T client, despite being fed from dual 1000BASE-T, is of course still limited to 1 Gbit/s.

What about 2.5GBASE-T clients? Assume the following:

[Server|2x1G] <===> [Switch|2.5G] <---> [Client|1x2.5G]

The dual NICs on the server "stripe" to double the bandwidth; the switch receives the full bandwidth on its two ports and switches it to a single 2.5G port.

Will the client be able to download with ~2 Gbit/s from the server? And if so, would this scenario require something "smarter" than an unmanaged switch?

Thanks!

1N4001 avatar
cn flag
@A.B Both single and multiple flows should be interesting, but single flow is the main use case. I'm not planning on using either SCTP or MPTCP.
Zac67 avatar
ru flag
The reasonable thing would be to upgrade the server NIC. Using RR or any other kind of load balancing is likely to cause out-of-order delivery, really messing up the throughput.
mx flag
@Zac67 RR and broadcast are the only ones you’re likely to see such issues with. The other modes will only route any given connection over a single physical link, so they still preserve ordering guarantees.
Criggie avatar
in flag
Your client has a 2.5G NIC, and the switch is 2.5G - can you upgrade the server to a 2.5G NIC? Added simplicity.
Score:9
fr flag

For a single flow, any bandwidth gain in the direction from the switch to the client is highly unlikely.

Managed switches usually support port bonding according to the 802.3ad protocol or static bonding; dumb (unmanaged) switches don't support it at all. The channel provides both redundancy and higher aggregate bandwidth, but there is a twist: the specific port is usually selected using a so-called hashing policy, which computes a hash over a variety of L2 and L3 (and sometimes also L4) fields. In most cases the hash will stay the same for a specific flow - meaning its bandwidth is limited to the bandwidth of the selected port (BTW, for 802.3ad all ports should have the same speed). For cheaper switches (TP-Link) I've seen hashing based on L2 or L3 data; more expensive ones (like Dell PowerConnect) also support hashing over L4 data.

Additionally, such channels are formed dynamically by a data exchange (LACP) between the peers, meaning such a channel is INCOMPATIBLE with the balance-rr mode you mentioned.

So any bandwidth gain for a single flow is HIGHLY unlikely. You may see some gain using multiple flows, depending on the hashing policy the switch uses and the one the server is configured with (see xmit_hash_policy for what's available; you will need a policy which includes L4 information to gain anything between two specific hosts).
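
For example, on the Linux side the transmit hash policy of an existing bond can be inspected and changed through sysfs (a sketch; bond0 is a placeholder name for a bond running in 802.3ad or balance-xor mode):

    # show the currently active policy
    cat /sys/class/net/bond0/bonding/xmit_hash_policy
    # hash on L4 ports as well, so separate TCP/UDP flows between the
    # same two hosts can be sent over different ports of the bond
    echo layer3+4 > /sys/class/net/bond0/bonding/xmit_hash_policy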

ph flag
Do you mean "in the direction from the switch to the _server_"? Otherwise this answer doesn't fully make sense - any hashing would be done by the switch, but in the switch->client direction there is only one port, so no need for hashing.
fr flag
The hashing is done by the entity sending over the bond / port channel. And it doesn't matter what is on the other side of the cables, be it client, server or a switch. The important part is that the link between two network endpoints is only as fast as its slowest "hop". In this case it is the bond itself, and for a specific flow it degrades further to the single selected port in the bond.
ph flag
Yes, but in the example in the question, in the ->client direction, the server is selecting the link to send to the switch (and is using round robin, so hashing is irrelevant), and then the switch is selecting the link to send to the client (and there is only one link, so hashing is irrelevant). So no flow is being restricted to a single port.
fr flag
Assuming you have a switch which handles static channels. Please also note that the switch -> server direction MAY affect the output from the server if it gets assigned to a single port. This of course depends on the traffic characteristics and switch capabilities.
fr flag
Thinking a bit more about it, it may even work on a dumb switch and it can give some gain, assuming there is more data flowing in the server->client direction than from the client to the server...
ph flag
Agreed on the static channels - some switches do allow this statically (or I guess you could hack up a LACP implementation on the Linux side that convinces the switch to bundle the ports). And yes, bandwidth in the other direction can have effects, though if it's just a big download (ie much traffic in one direction, just ACKs in the other) it's not likely to be a big deal; if it's more symmetrical then it might have more of an effect.
ph flag
The reason for needing to set up some sort of bonding on the switch is that, with bonding configured on the Linux side, the switch will see lots of packets with the same source MAC address coming from two different ports. With bonding configured that's fine (the switch will associate that MAC with the aggregated link), but without, it will keep updating its MAC table to associate that address with one port or the other. Some switches may handle that fine; others will go slower (because the MAC updating is done in software); others will bring the link down, seeing this as a fault/DoS.
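
On the Linux side you can at least see whether the switch agreed to aggregate the ports by looking at the bond's status file (a sketch; bond0 is a placeholder name for an 802.3ad bond):

    # the "802.3ad info" section shows the negotiated aggregator; an all-zero
    # "Partner Mac Address" means the switch never joined the channel
    cat /proc/net/bonding/bond0
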
Score:1
mx flag

Not without using multiple flows.

With the specific exceptions of the round-robin and broadcast modes for Linux's bonding driver, neither of which will actually work correctly for your stated scenario (more on that below), all the bonding modes assign each connection/flow to a specific bound device.

This means that for single-flow use cases, they cannot provide load balancing, only fail-over.

This is done for a few reasons:

  • It significantly reduces the overhead associated with running over a bonded interface. If a decision had to be made for each individual packet, then that would eat into the peak performance of the bonded interface. By only making the decision per-flow, that overhead is only present for the first packet of a flow.
  • A lot of higher-level network protocols are reliant on in-order delivery of packets to function correctly. Some layer 4 protocols can guarantee this even if lower layers do not, but there are still a depressing number of applications that do not use such protocols and do not do their own sequencing.
  • A lot of things at layer 4 or lower actually need in-order delivery of packets to function efficiently. Consistently out-of-order delivery causes all kinds of problems with TCP for example, and severely limits effective bandwidth.
  • Most smart switches and routers operate in terms of flows as well, which means that they may run into issues when dealing with packet scheduling that does not operate in terms of flows.

Because they operate in terms of flows though, you end up in a situation where any one flow is not guaranteed any more bandwidth than the lowest bandwidth bound device can provide. You can work around this by using multiple flows.
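
As a rough illustration (assuming iperf3 on both ends; the host name is a placeholder), a multi-flow test can reach aggregate bandwidth that a single flow never will:

    # one TCP flow - pinned to a single bound device (~1 Gbit/s in the question's setup)
    iperf3 -c server.example.com
    # four parallel TCP flows - can be hashed onto different bound devices
    iperf3 -c server.example.com -P 4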

Assuming your switch plays nice with it, you probably want balance-alb mode, as it will give you the best overall link utilization spread across the links. However, some network hardware does not like how that mode handles receive load balancing, in which case you almost certainly instead want 802.3ad mode (if your switch supports it, and all the bound interfaces are connected to the same switch) or balance-xor (does the same thing, but the switch has to infer what’s going on, so does not work as well in all cases).
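
As a sketch (iproute2 syntax; the names are placeholders), the mode is chosen when the bond is created:

    # balance-alb: no switch cooperation needed, receive balancing handled via ARP negotiation
    ip link add bond0 type bond mode balance-alb miimon 100
    # or, with an LACP-capable switch, 802.3ad with an L4-aware hash policy
    ip link add bond0 type bond mode 802.3ad miimon 100 xmit_hash_policy layer3+4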

Why won’t balance-rr or broadcast work?

These two bonding modes are special.

broadcast mode largely exists just to provide a bonding mode that can handle the loss of a bound interface without any disruption whatsoever (active-backup mode, which provides similar fault-tolerance, will show a small latency spike if the active bound interface goes down because it has to reroute traffic and force updates of external ARP caches). It’s realistically only usable on layer 2 point-to-point links between systems that are both using the bonding driver (possibly even the same mode), and gives you no performance benefits.

balance-rr mode is instead designed to have minimal overhead, irrespective of whatever other constraints exist, and it actually does translate to evenly balancing the load across all bound interfaces. The problem is that if there is more than one hop below layer 3, this mode cannot provide packet ordering guarantees, which in turn causes all kinds of issues with congestion control algorithms, functionally capping effective bandwidth. It is also, in practice, only usable on layer 2 point-to-point links between systems that are both using the bonding driver.
