Score:1

Need a simple High Availability file share

nc flag

I am looking for the simplest way to share a single file in a high availability way between a pair of linux servers. (Version and distribution are unimportant, I'm looking for a generic solution.)

I have two servers, each with their own local disks and NFS shares and other services between them. I have a file that both servers need to access, but nothing other than those servers needs to access it.

If either server crashes, I want the most recent possible contents of that file to be available to the remaining server. (Obviously the other server should pick up changes on recovery.)

The file is a state file, and likely only one server at a time will be writing to it. The state file size is unknown, but small. Probably between 1 block and 2M. Possibly the size of the state file would grow depending on the length of downtime.

Without adding external hardware, what options are there for a high availability file share like this?

djdomi avatar
za flag
Is there a restriction on which protocol should be used?
user10489 avatar
nc flag
The restriction is that I don't want to add hardware (like an iSCSI NAS), and that existing services can't be disrupted. For instance, it can use NFS, but I don't want to mangle NFS to the point where other clients in the cluster have trouble using NFS. I'm hoping to get a range of options I can choose from, as I know there are HA filesystems, but most of the ones I've looked at have pretty complex setups that are total overkill for this application.
djdomi avatar
za flag
Could you please add the size of the file, and whether this is the only thing to be synced?
djdomi avatar
za flag
I'm also interested in this question, as I have a similar task coming up in the future. I found an interesting article on [StackExchange](https://unix.stackexchange.com/questions/307046/real-time-file-synchronization).
user10489 avatar
nc flag
@djdomi: that's interesting. It might work, but the application is sufficiently vague about how it uses the state file that I don't know if this would work. If it is doing locking or other IPC on the state file, this might break. If no better solution comes up, I'll have to experiment with that.
djdomi avatar
za flag
Normally on Linux no file is ever locked, unlike Windows, as far as I know - what kind of file is it?
user10489 avatar
nc flag
Linux fully supports advisory file locking, just like Windows; just fewer applications use it.
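As a quick illustration (the path here is just a placeholder), flock(1) will take and hold an advisory lock from the shell:

```sh
# Hold an advisory lock on the (placeholder) state file while a command runs.
flock /var/app/state.dat -c 'echo "holding lock"; sleep 10' &
sleep 1
# A non-blocking attempt from elsewhere fails while the lock is held.
flock -n /var/app/state.dat -c true || echo "state file is locked"
```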
djdomi avatar
za flag
I didn't say it doesn't support it, just that it's unlikely to be doing it ;) However, I found a second tool, [bsync](https://github.com/dooblem/bsync).
user10489 avatar
nc flag
bsync looks gross and overkill. And I agree it is unlikely to be using file locking, but I'd have to research that (or experiment) to find out.
djdomi avatar
za flag
As long as you don't come out with the truth about your file, I can only give tips and try to find something in my crystal ball ;)
user10489 avatar
nc flag
Taking a second look, lsyncd and bsync are very similar; I can't exclude these yet.
user10489 avatar
nc flag
@djdomi: I've looked at several rsync-based solutions, and they actually solve a similar problem I have, but not this one. If you put that in as an answer with links to the various rsync variations, I'd at least give it a vote, but I think I really want a filesystem under this rather than a file synchronization service.
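For reference, the rsync-based variations generally boil down to something like the loop below (the path, peer hostname, and use of inotify-tools are assumptions, and it has exactly the delayed-sync and file-locking caveats discussed above):

```sh
#!/bin/sh
# Sketch of an rsync-based real-time sync (requires inotify-tools on the writer).
STATE=/var/app/state.dat   # placeholder path
PEER=otherserver           # placeholder hostname
inotifywait -m -e close_write "$STATE" | while read -r _watched _events _name; do
    # Push the latest copy to the peer each time the file is rewritten.
    rsync -a "$STATE" "$PEER:$STATE"
done
```
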
Score:1
cn flag

If the pair of servers are running hot-cold (i.e., only one of them accesses the file at a time), DRBD is a quick and stable way to accomplish your goal. DRBD is designed with split-brain protections in place, so it should be "good enough".

A brief blurb from the DRBD site:

The Distributed Replicated Block Device (DRBD) is a software-based, shared-nothing, replicated storage solution mirroring the content of block devices (hard disks, partitions, logical volumes etc.) between hosts.

DRBD mirrors data

  • in real time. Replication occurs continuously while applications modify the data on the device.
  • transparently. Applications need not be aware that the data is stored on multiple hosts.
  • synchronously or asynchronously. With synchronous mirroring, applications are notified of write completions after the writes have been carried out on all hosts. With asynchronous mirroring, applications are notified of write completions when the writes have completed locally, which usually is before they have propagated to the other hosts.

As this is block-level replication, it requires a bit of extra configuration. E.g., you'd have to create a filesystem on top of the replicated device, and you'd need to mount that filesystem. The default recommended configuration only allows one host to mount the filesystem (to avoid split-brain situations), so you can only access the data on one node at a time.
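As a rough sketch of that extra configuration (hostnames, addresses, and the backing partition below are made up; the DRBD user's guide has the authoritative steps), a minimal single-primary setup looks something like:

```sh
# /etc/drbd.d/state.res on both nodes (hypothetical names and addresses):
#   resource state {
#     device    /dev/drbd0;
#     disk      /dev/sdb1;      # spare partition or LV on each node
#     meta-disk internal;
#     on nodeA { address 10.0.0.1:7789; }
#     on nodeB { address 10.0.0.2:7789; }
#   }

drbdadm create-md state          # on both nodes: initialize metadata
drbdadm up state                 # on both nodes: bring the device up
drbdadm primary --force state    # on one node only, first time: kick off the initial sync
mkfs.ext4 /dev/drbd0             # on the primary: filesystem on top of the DRBD device
mount /dev/drbd0 /mnt/state      # mount locally on whichever node is currently primary
```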

The whole process is well documented and there are also some easy guides available.

If you are more into automation, Pacemaker + DRBD is a very common combination; it is even documented in the Pacemaker guides, which are also a good intro to DRBD itself.
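If you do go that route, the general shape with the pcs CLI is roughly as follows (resource names are made up and exact option syntax varies between pcs/agent versions; see the guide linked above):

```sh
# Promotable DRBD resource plus a Filesystem resource for the mount.
pcs resource create state_drbd ocf:linbit:drbd drbd_resource=state promotable
pcs resource create state_fs ocf:heartbeat:Filesystem \
    device=/dev/drbd0 directory=/mnt/state fstype=ext4
# Add colocation and ordering constraints so state_fs only starts on the node
# where state_drbd is currently promoted (primary).
```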

P.S. Funny how the pacemaker guide to DRBD I linked above almost perfectly describes your question.

Even if you’re serving up static websites, having to manually synchronize the contents of that website to all the machines in the cluster is not ideal. For dynamic websites, such as a wiki, it’s not even an option. Not everyone can afford network-attached storage, but somehow the data needs to be kept in sync.

Enter DRBD, which can be thought of as network-based RAID-1.

user10489 avatar
nc flag
DRBD was actually the first solution to this I looked at and rejected several months ago. On the surface, it looks really good, but when I started working through the complexities of actually setting it up, with the hot failover swapping NFS exports and gross mangling of NFS to support the hot failover, I found it too disruptive to other NFS clients that didn't care about the hot failover status, plus other excessive complexities.
user10489 avatar
nc flag
PS: Voting for this anyway as it generally looks like a good solution.
cn flag
I agree that doing NFS on top of DRBD can be exhausting (it's hard to stop access to the mountpoint so that you can bring down the DRBD device). But for your use case, you can use local mounts without any NFS: a filesystem on top of DRBD which you mount when needed.
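In that model, a manual failover is only a couple of commands on the surviving node (resource and mountpoint names as in the hypothetical sketch above):

```sh
# On the surviving node: take over the DRBD device and mount it locally.
drbdadm primary state
mount /dev/drbd0 /mnt/state
```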
user10489 avatar
nc flag
The filesystem is always needed, because the application will always have the file open, even when it isn't primary and isn't writing.
cn flag
Agreed. DRBD is only good for a hot-cold setup, where "cold" implies that the application is not running. Pacemaker helps to automate some of this (e.g., mount the filesystem before bringing up the application). Not a solution for your use case, it seems.
Score:1
cn flag

There are a lot of solutions - for one file I would probably use GlusterFS - but I think that you should have 3 servers for quorum, or you will have to resolve split-brains on recovery. You should be able to install it easily on every popular distro, and configure it in no more than an hour.
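For a rough idea of the effort involved (hostnames and brick paths below are placeholders; a replica-3 volume, or replica 2 plus an arbiter, covers the quorum point above):

```sh
# After installing glusterfs-server on all three nodes, from any one node:
gluster peer probe node2
gluster peer probe node3
gluster volume create statevol replica 3 \
    node1:/data/brick/statevol node2:/data/brick/statevol node3:/data/brick/statevol
gluster volume start statevol

# On each server that needs the file, mount the volume with the FUSE client:
mount -t glusterfs node1:/statevol /mnt/state
```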

user10489 avatar
nc flag
I'm going to give the bonus to this without accepting it for now. I've looked at the other two suggested solutions and found them wanting, but I can't accept this one until I've tried it. If I try it and find it easy to set up and it works, I'll accept it.
Kszysiu avatar
cn flag
If you go for the drbd solution you will need to set up a dual-primary configuration (a third node in my opinion is still needed - otherwise you can have split-brain if a network problem occurs), and then some kind of clustered file system (if you want to mount it in 2 places at the same time) - you can't use classic ext/xfs or any other ordinary filesystem. If you want to use a classic filesystem, then you have to mount it in only one place at a time, and use NFS or something to mount it on the second server. You also need a free block device (or LVM) to use drbd. There are a lot of layers where something can fail.
user10489 avatar
nc flag
On testing, it turns out that my application actually doesn't want NFS at all for performance reasons (too much latency). GlusterFS gives a good balance between local performance and remote sync, with none of the delayed-sync issues that an rsync-type solution would give.