Score:2

DRBD config to automatically resolve any split brain


I'm running a two-node cluster with DRBD (active/passive), managed by the drbd systemd service and a small script that mounts the volumes whenever a resource becomes primary.

I want to configure DRBD so that it always resolves a split brain automatically and there is always at least one primary node able to serve, as long as both machines are not down at the same time.

I tried the following configuration (with the pri-lost-after-sb handler set to "reboot"):

after-sb-0pri discard-younger-primary;
after-sb-1pri discard-secondary;
after-sb-2pri call-pri-lost-after-sb;

I also tried on-suspended-primary-outdated force-secondary and some other combinations.
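For context, the three after-sb policies and the handler sit in the resource definition roughly as sketched below. The resource name r0 is a placeholder, and everything else in the resource (volumes, hosts, addresses) is omitted, so this is a fragment rather than a complete file:

```
resource r0 {
    net {
        # automatic split-brain recovery policies, as listed above
        after-sb-0pri discard-younger-primary;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
    }
    handlers {
        # run on the node chosen as the split-brain victim while it is primary
        pri-lost-after-sb "reboot";
    }
    # volumes, "on <host>" sections, etc. omitted
}
```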

But I always find a scenario in which the cluster ends up in a bad state and does not recover from the split brain. Usually the nodes end up StandAlone, with force-io-failures on the secondary (so if the primary fails again, this secondary will not work even if it is connected).

Is there anything else I can do to improve the robustness of this setup, given that I prioritize service uptime over avoiding data loss?
