Score:2

Highly available storage over InfiniBand: what other than mdraid?

aep

Is mdadm over InfiniBand a bad idea? What is the real trick to getting reasonably performing storage that survives a single machine failure?

We have been running Ceph for a few years now, and it's great for easy(ish) redundancy, but its performance is eye-watering. NVMe drives easily reach 3 GB/s, while our Ceph cluster does 100 MB/s over a 50 Gb/s network while consuming 64-core CPUs. I just don't think I made the right choice here for the performance expectations.
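To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 GB = 1000 MB) and ignores protocol overhead; the function and variable names are made up for illustration.

```python
# Back-of-the-envelope comparison using the figures quoted above.
# Assumptions: decimal units, no protocol overhead, a single NVMe drive.

def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a link rate in Gb/s to MB/s (8 bits per byte)."""
    return gbit_per_s * 1000 / 8

nvme_mb_s = 3000.0                    # one NVMe drive, about 3 GB/s
ceph_mb_s = 100.0                     # observed Ceph throughput
link_mb_s = gbit_to_mbyte_per_s(50)   # 50 Gb/s network

link_utilisation = ceph_mb_s / link_mb_s   # share of the network actually used
nvme_fraction = ceph_mb_s / nvme_mb_s      # share of one drive's throughput

print(f"link capacity: {link_mb_s:.0f} MB/s")
print(f"Ceph uses {link_utilisation:.1%} of the link")
print(f"Ceph delivers {nvme_fraction:.1%} of one NVMe drive")
```

Under these assumptions, neither the network nor the drives come anywhere near saturation, which suggests the bottleneck sits elsewhere in the stack.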

InfiniBand seems extremely cost-effective in comparison, with used previous-generation 100 Gb cards costing less than a 50 Gb Ethernet card. It seems very easy, and well performing, to just expose a local disk to another host over InfiniBand using iSER.

Now, the naive way to make this survive host failure would be mdraid over multiple remote targets. But I haven't found many people actually doing that, and this answer indicates it might even be a bad idea, since mdraid has no understanding that an underlying device is remote. Also, this comment makes it clear that this setup will likely run into edge-case bugs.

But how else would you build an InfiniBand storage network that recovers from node failure unattended?

When I see InfiniBand and storage in the same question, I get nervous about data at rest. We usually use RDMA for high-performance storage, but for scratch space, not permanent storage; that's just our case, though. There are a ton of solutions, but Ceph is not RDMA-friendly; BeeGFS, on the other hand, is. Ceph is slow by RDMA-world standards. What exactly do you want or need to achieve? When you say host failure, does that include the disks? You can use shared SAS backplanes to achieve HA. Parallel filesystems like BeeGFS and Lustre do that.
aep
This is the backend for VM disks (QEMU). The setup needs to survive both disk failure (as mdraid would) and failure of an entire node. Automatically rescheduling VMs is easy, but storage not so much. Ceph can do this, but at an unacceptable performance loss. BeeGFS looks great but appears to cost around 300 EUR per month, so that's significantly more expensive than just buying a hardware solution. But which one do I want? Is InfiniBand the wrong thing to look at because, as you said, it's not usually used for storage?