Score:3

link aggregation: how?

Link aggregation is not working for me, to the point that the server becomes unreachable. What could be the problem, and what are the best practices for this (seemingly fairly common) type of setup?

The server is a Dell R730 with dual 10Gb NICs, running Ubuntu 22.04.3 LTS (GNU/Linux 5.15.0-82-generic x86_64) and serving an iSCSI target to a VMware cluster.

The NICs are connected to 10Gb link-aggregated ports on two different but interconnected Meraki MS225 switches (is "stacked" the right word?).

In Ubuntu, the NICs are "bonded":

  renderer: networkd
  ethernets:
    enp130s0f0:
      dhcp4: no
    enp130s0f1:
      dhcp4: no
  bonds:
    bond-00:
      interfaces: [enp130s0f0,enp130s0f1]
      addresses: [<IPv4>/24]
      dhcp4: no
      routes:
        - to: default
          via: <gateway_IP>
          metric: 100
      nameservers:
        addresses: [<ns01_IP>,<ns02_IP>]
        search: [localdom.local]
      parameters:
        mode: balance-xor
        mii-monitor-interval: 1

If the ports on the Meraki switches are not link-aggregated, everything works, except that throughput is about 40% lower than with a single 10Gb NIC. (I was hoping that "bonding" the NICs and then configuring link aggregation in Meraki would give us speeds a bit higher than those of a single NIC.)

If the ports on the Meraki switches are link-aggregated, however, packet loss exceeds 50% and the server becomes (almost) unresponsive.

(No special configuration in VMware. ESXi 7.0u3, the 10Gb links are active-active, and otherwise it's all default. Can't configure iSCSI network port binding in VMware because the 10Gb NICs are used for general traffic, not just iSCSI.)

What am I doing wrong?

Configurations I've tried:

  • Meraki: no special configuration, no link aggregation
    • Ubuntu: no link bonding, just two individually configured NICs each with its own IP: no issues, bandwidth ~10Gbps, only one link is used even with iSCSI multi-pathing.
    • Ubuntu: bonded links in "balance-rr", "balance-xor", "802.3ad", "balance-alb" modes: ~6Gbps (40% slower), both links are used, see no errors in Meraki.
  • Meraki: link aggregation enabled
    • >50% packet loss and generally unusable, regardless of bonding mode in Ubuntu (tried "balance-rr", "balance-xor", "802.3ad"). (Did not try without bonding, as that defeats the purpose.)
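
For reference, a minimal sketch of what an 802.3ad (LACP) variant of the bond could look like in netplan. The `lacp-rate`, `mii-monitor-interval`, and `transmit-hash-policy` values here are illustrative assumptions, not the exact settings used in the tests above; addresses, routes, and nameservers are omitted and would stay as in the configuration shown earlier.

```yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    enp130s0f0:
      dhcp4: no
    enp130s0f1:
      dhcp4: no
  bonds:
    bond-00:
      interfaces: [enp130s0f0, enp130s0f1]
      dhcp4: no
      # addresses, routes, nameservers: unchanged from the original config
      parameters:
        mode: 802.3ad                   # LACP; the switch ports must form an LACP aggregate
        lacp-rate: fast                 # assumption: request LACPDUs every second
        mii-monitor-interval: 100       # assumption: link check interval, in milliseconds
        transmit-hash-policy: layer3+4  # assumption: hash on IP addresses and ports
```

With a hash-based policy like this, a single flow still uses only one member link, so an aggregate would at best raise total throughput across several flows rather than the speed of one iSCSI session.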

(I am thinking my next step is to disable LACP in Meraki and go back to individual NICs in Ubuntu with no bonding.)

Thanks!
