MetalLB is correct. Playing layer 2 addressing games means that only one host can receive unicast traffic at a time, per service address. Say 2001:db8:c0ba:4816::a is the service address and it currently points to a NIC with Ethernet address 6E:17:C2:2E:F4:A4. A failure of that host triggers a failover: some neighbor discovery happens, and now the address points to a different host at 6E:17:C2:2E:E7:B8. There is no opportunity to multipath; the HA protocol and the unicast workload are too simple for that. You could certainly have more service addresses, so add 2001:db8:c0ba:4816::b, which could go to another, possibly unused, host.
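To make the one-host-per-address point concrete, here is a minimal sketch of an upstream router's neighbor cache as a plain dictionary. It is a toy model, not MetalLB code: the function names `advertise` and `deliver` are made up for illustration, and real neighbor discovery has states and timers that are ignored here.

```python
# Toy model of a router's IPv6 neighbor cache. All names are illustrative.
# Key point: each IP maps to exactly one link-layer (MAC) address.
neighbor_cache: dict[str, str] = {}


def advertise(ip: str, mac: str) -> None:
    """Record a neighbor advertisement; the newest claim replaces the old one."""
    neighbor_cache[ip] = mac


def deliver(ip: str) -> str:
    """Return the single MAC address that unicast traffic for ip is sent to."""
    return neighbor_cache[ip]


HOST_1 = "6E:17:C2:2E:F4:A4"
HOST_2 = "6E:17:C2:2E:E7:B8"

# Normal operation: ::a lives on host 1, a second service address ::b on host 2.
advertise("2001:db8:c0ba:4816::a", HOST_1)
advertise("2001:db8:c0ba:4816::b", HOST_2)
before = deliver("2001:db8:c0ba:4816::a")

# Host 1 fails; host 2 takes over ::a by advertising its own MAC for it.
advertise("2001:db8:c0ba:4816::a", HOST_2)
after = deliver("2001:db8:c0ba:4816::a")

print(before, "->", after)
```

At no point can the cache hold two MAC addresses for the same service address, which is why spreading load in this mode means adding more addresses rather than more hosts per address.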
An active/passive setup like this will be familiar to users of VRRP or PowerHA clusters, except that MetalLB implemented its own protocol for some reason.
MetalLB's BGP mode is different: it is layer 3 routing, which makes ECMP possible if multiple next hops are installed for the service address route. Compare this to designs for large multi-tier load balancers using ECMP.
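As a rough illustration of what ECMP does with several next hops, the sketch below hashes a flow's 5-tuple and uses the result to pick one of the installed next hops. This is an assumption-laden toy: real routers use their own hash functions and field selections, and the node names and client address here are hypothetical.

```python
import hashlib

# Hypothetical next hops that all announce the same service address over BGP.
NEXT_HOPS = ["node-1", "node-2", "node-3"]


def pick_next_hop(flow: tuple, next_hops: list[str]) -> str:
    """Pick a next hop from a hash of the flow, so one flow always takes one path."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Flows are (source IP, source port, destination IP, destination port, protocol).
# The client address is from the documentation prefix and is illustrative only.
flows = [
    ("2001:db8:1::10", 40000 + i, "2001:db8:c0ba:4816::a", 443, "tcp")
    for i in range(300)
]

chosen = [pick_next_hop(flow, NEXT_HOPS) for flow in flows]
for hop in NEXT_HOPS:
    print(hop, chosen.count(hop))
```

Packets of a single connection keep hashing to the same node, while different connections to the same service address spread across all of them. That is the multipath behavior that the layer 2 mode cannot provide.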
One active host per service IP may not be a problem, depending on the design. Hosts can scale up quite large, perhaps with 25 Gb links. If necessary, the real work could be moved to other hosts, leaving just a proxy to terminate front-end connections.