
Docker swarm HA - restarting manager node while original node is unavailable


I've got a network of five Docker managers, 1 through 5 (IP addresses x.x.x.101 through x.x.x.105). The managers all joined with `docker swarm join --token xxxx x.x.x.101:2377`. As such, I expect to be able to bring two random nodes down, bring one back up, and have everything keep working. But...
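(My expectation comes from Raft's quorum rule: a swarm of N managers tolerates floor((N-1)/2) manager failures, so five managers should survive losing two. A quick sanity check of that arithmetic; N=5 is just my cluster size, nothing Docker-specific:)

```shell
# Raft quorum arithmetic for a swarm of N manager nodes.
N=5
majority=$(( N / 2 + 1 ))        # votes needed to keep/elect a leader
tolerated=$(( (N - 1) / 2 ))     # manager failures the swarm survives
echo "managers=$N majority=$majority tolerated=$tolerated"
# → managers=5 majority=3 tolerated=2
```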

If I bring manager 1 down, then bring manager 2 down and up again, manager 2 won't rejoin the swarm. The swarm keeps running on managers 3 through 5, but manager 2 complains like this:

warning msg="grpc: addrConn.createTransport failed to connect to {x.x.x.101:2377  <nil> 0 <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp 192.168.8.199:2377: connect: connection refused\". Reconnecting..." module=grpc

Why doesn't node 2 attempt to connect to nodes 3 through 5? How can I make it do so?

Also, how should I recover from a situation like this? The documentation seems to lack any information on this subject. What would I even call the "original main master"?
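(For what it's worth, the only related procedure I've turned up is the disaster-recovery path for total quorum loss: `docker swarm init --force-new-cluster` on a surviving manager. A sketch, untested on my setup; x.x.x.103 is manager 3 in my numbering and the node id is a placeholder:)

```shell
# On a still-healthy manager (e.g. manager 3), rebuild a single-manager
# cluster that keeps the existing services and swarm data:
docker swarm init --force-new-cluster --advertise-addr x.x.x.103:2377

# Drop the stale entries for the dead managers, then re-add managers
# using a freshly printed join token:
docker node rm --force <old-manager-id>   # placeholder node id
docker swarm join-token manager           # prints the join command to run on the others
```

But my actual question is whether this sledgehammer is really required here, given that managers 3 through 5 still hold quorum.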


