I want to share data among multiple AWS EC2 instances in a high-performance, low-latency manner. Giving all instances read-only access (except one instance that handles writes) is fine. Two points about this use case:
- Nodes attached to the volume might come and go at any time (start, stop, be terminated, etc.).
- The shared data includes thousands of mostly small files that need to be listed and have their metadata checked.
I initially tried EFS, but it turned out to be quite slow for operations that enumerate or modify hundreds or thousands of small files.
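The slowness shows up in exactly that metadata-heavy pattern. A crude timing script along these lines (the paths are placeholders for my own EFS mount and, for comparison, a directory on a plain EBS volume) makes the difference obvious:

```python
import os
import time

# Placeholder paths -- substitute your own EFS mount point and, for comparison,
# a directory on a plain EBS volume attached to the same instance.
PATHS = ["/mnt/efs/shared-data", "/mnt/ebs/shared-data"]

def walk_and_stat(root: str) -> tuple[int, float]:
    """List every file under root and read its metadata, returning (count, seconds)."""
    count = 0
    start = time.monotonic()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.stat(os.path.join(dirpath, name))  # one metadata lookup per file
            count += 1
    return count, time.monotonic() - start

if __name__ == "__main__":
    for path in PATHS:
        if not os.path.isdir(path):
            continue
        n, secs = walk_and_stat(path)
        print(f"{path}: {n} files in {secs:.2f}s")
```

As far as I can tell, on EFS each of those stat() calls ends up being a network round trip (it's NFS under the hood), which is why enumerating thousands of files drags even though throughput on large files is fine.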
So now I'm considering EBS Multi-Attach. However, to prevent data corruption, AWS recommends using only a clustered filesystem such as GFS2 or OCFS2 on a multi-attached volume. Both appear complex and finicky to configure, and fragile for a cluster whose nodes come and go at any time. For example, GFS2 requires the cluster software on all nodes to be restarted if the node count drops from more than two to exactly two; and adding a new node involves logging in to an existing node, running several commands, and possibly redistributing an updated config file to every other node. It just seems inflexible, with a lot of extra operational overhead.
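For concreteness, the node-add procedure I'm referring to looks roughly like this, run from an instance that is already a cluster member (the command names are my best understanding of the pcs/gfs2-utils tooling and the exact syntax varies by version, so treat this as a sketch rather than a recipe):

```python
import subprocess

NEW_NODE = "node3.example.internal"   # hypothetical hostname of the instance being added
GFS2_MOUNT = "/data"                  # hypothetical GFS2 mount point

def run(cmd: list[str]) -> None:
    """Run one step of the procedure on the existing cluster node I had to log in to."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Authenticate the new node to the cluster (pcs 0.10+ syntax; older releases use
# `pcs cluster auth` instead -- yet another thing that depends on the version).
run(["pcs", "host", "auth", NEW_NODE])

# Register the node with corosync/pacemaker; pcs is supposed to push the updated
# corosync.conf to the other members, but I'd still want to verify it landed everywhere.
run(["pcs", "cluster", "node", "add", NEW_NODE, "--start", "--enable"])

# GFS2 needs one journal per node that mounts the filesystem, so an extra journal
# has to be added before the new node can mount it.
run(["gfs2_jadd", "-j", "1", GFS2_MOUNT])
```

That's a lot of coordinated steps for something that, with my workload, should just be "launch another instance and attach the volume".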
But if I were sure that only one instance would be writing to the disk (or if each instance wrote only to its own subfolder, or even its own partition), could I use a regular filesystem like XFS on this volume and get away with it? Or would there be subtle data-corruption issues even when access is technically read-only, or when write access is restricted to instance-specific subfolders or partitions?
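Concretely, the arrangement I have in mind is the single writer mounting the volume read-write as usual, while every reader does something like the sketch below (the device name and mount point are placeholders for my setup). My worry is that, even read-only, a reader's cached view of the filesystem has no way of tracking the writer's changes between remounts, and I don't know whether that amounts to "stale data" or "corruption":

```python
import subprocess
import time

DEVICE = "/dev/nvme1n1"   # placeholder: the multi-attached EBS volume as seen by this reader
MOUNTPOINT = "/data"      # placeholder mount point

def sh(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Reader side: mount XFS read-only. `norecovery` skips log replay, which a reader
# must not perform while the writer instance has the filesystem mounted read-write.
sh(["mount", "-t", "xfs", "-o", "ro,norecovery", DEVICE, MOUNTPOINT])

# Illustrative only: the reader periodically unmounts and remounts, since that is the
# only way I can see for it to pick up blocks the writer has changed since the last mount.
while True:
    time.sleep(300)
    sh(["umount", MOUNTPOINT])
    sh(["mount", "-t", "xfs", "-o", "ro,norecovery", DEVICE, MOUNTPOINT])
```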
Or is there a completely different solution I'm missing?