I'm looking for a distributed file system solution/network file system which can be used in the following scenario:
- I have a Docker Swarm with many hosts, but each host is essentially self-contained and indistinguishable from the others; we use multiple hosts purely for scaling. Each host runs all the workers needed to make the system work, and ideally a task that enters the system is processed entirely on the host that first accepted it.
- Processing a task involves several steps, and each step generates a large file in the range of 1-10 GB. Workers on a host primarily work on files that are already stored locally.
- However, a host can become overloaded, and I then want workers on another host to take over the remaining processing steps. To achieve this, the files need to live in a shared volume that workers on other hosts can use to transparently access files stored on a different host.
In other words: each host has the same "network volume" mounted at some path. It contains files that are actually stored on the current host (these are the primary case) and files that are stored on other hosts. Workers will mostly (90-95% of the time) access files that are local to their own host.
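To make the intended access pattern concrete, here is a minimal sketch of the worker-side scheduling logic I have in mind. Everything here is illustrative and not tied to any particular file system: the mount point `/mnt/tasks`, the `local_index` set, and both function names are hypothetical; in a real deployment the DFS itself would have to answer the "is this file local?" question (e.g. via a per-host subdirectory or an extended attribute).

```python
import os

MOUNT = "/mnt/tasks"  # hypothetical shared mount point (illustrative only)

def is_local(path, local_index):
    """Return True if the file behind `path` is stored on this host.

    `local_index` is a set of filenames this host has written itself.
    This stands in for whatever locality information the chosen
    file system actually exposes.
    """
    return os.path.basename(path) in local_index

def pick_next(candidates, local_index):
    """Prefer files stored locally; fall back to a remote file only
    when no local work remains (the 5-10% takeover case)."""
    local = [p for p in candidates if is_local(p, local_index)]
    return (local or candidates or [None])[0]
```

The point is that workers never need to know *where* a file physically lives to open it; locality only influences which task they pick up first.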
I need no replication (files are only relevant for 30-60 minutes and are no longer needed after that), and I specifically don't want central storage.
Throughput in the system is measured in minutes per task rather than tasks per second. There are a few large files rather than many small ones, and files are written once and read only once or twice.
Is there a nice solution for this? I had a look at GlusterFS, EdgeFS, the InterPlanetary File System (IPFS), Ceph, and some others, but none seemed like the right choice.