Score:2

How to MySQL replicate or cluster to setup a failover scenario?

de flag

after a massive data problem from my provider in GER, I'm now forced to deal with failover scenarios. But there are a few questions that I can't find any real answers to. So I hope someone can help me here.

I currently have server1 running two MySQL databases in separated docker containers. These should now be replicated on a second server. In case that server1 fails, I can switch to server2 relatively quickly via a ClusterIP.

In case it is important to know: the software that uses the database is a sports competition management system which does a lot of write operations to the database (not tested but in total mot write that read operations).

The questions for me are now:

  • Which replication method is the most suitable?
  • As I understand it, MASTER <-> MASTER would be most appropriate. But I've also read here again and again that problems can arise.
  • With MASTER <-> SLAVE, the question arises that the slave can only read. What happens if the master fails? Does the slave then automatically become the master and can also write?
  • Or is a cluster the best solution? At the moment I only have one active node. Another DB node in the USA could be added in the future. But at the moment it doesn't exist.

I really appreciate any help, because I need a fast working solution and this general topic seems to be very huge and not that easy.

ua flag
Discussions about MySQL configuration are better addressed at dba.stackexchange.com .
Score:2
ua flag

You raise two issues.

MySQL Topology In order (from OK to Best)

  • Primary -> Replica -- A "failover" can be achieved, but it takes manual effort, hence time.
  • Primary <=> Primary -- This is only slightly more complex to set up, while providing "instant" use of the other server.
  • Cluster of at least 3 servers. This further automates failover. See "InnoDB CLuster" (MySQL 8) or "Galera" (included with MariaDB).

Geography -- Be aware that even a data center can fail. For example, how much of, say Florida, can be taken offline by a single hurricane?

Be aware of the "split-brain" scenario. This is where you have only two servers and both are alive and well, but the network is down. They can't tell and you can't tell what the situation is. If each thinks that it is the only server alive and continues to take writes, you will end up with a mess. So, instead, you have to assume the entire system is down.

Bottom line -- You need at least 3 servers, physically separated.

Proxy

There is still the problem of clients knowing what part of the database system is alive (for reading and/or writing). When only "reading" is important, many topologies with any number of Replicas suffices -- and provide 'unlimited' scaling. "Writes" are where the real challenges are.

There are several 3rd party products that are good at noticing that one server is down and "doing the right thing" of rerouting to some other server. Research them.

Coding

When a failure occurs, your code is likely to get some kind of error. You must check for errors, some are not self-repairing. And most network errors take some time to notice.

I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.