Score:0

Linux file system(s) with ACID guarantees?


What can I do to get a Linux file system that offers the same ACID guarantees that databases do?

It seems file systems haven't advanced to this point, which I had assumed would be the industry standard for everything by now.
At least, I could not find any search results for it.

The best I could find is that file systems manage to keep their own structural data (metadata) intact, but not the actual file contents. (Though I do not know enough about ZFS, let alone Lustre.)

Am I approaching this wrong? Is the file system the wrong layer? (Maybe the block layer is the right one?)

I don’t exactly want to run ext4 on top of PostgreSQL via FUSE, or some other impossible Lovecraftian abomination like that. ;)

EDIT: As an administrator, I plan on setting up a high-reliability business server (for files, not databases) with it. Obviously a file system with ACID guarantees is what a business wants on its servers in that case. I can’t think of any StackExchange site more appropriate for this than serverfault.

John Mahowald
Which database has the properties you want for your application? And with what configuration? Examples exist for databases that relax any of the ACID properties, for various reasons.
datenheim
How would you translate ACID to a file system? What is your use case?
Score:2

I'm pretty sure you are approaching this wrong, unless you have very peculiar requirements. Some academic work has been done around ACID filesystems (see this paper), including the Amino FS (using Berkeley DB as a store), but:

Journaling filesystems are transactional and so provide at least a form of atomicity; isolation isn't really a problem at the FS level because file locks are (usually) enforced by the kernel (Windows) or left to application code (Unix-like, where locks are advisory by default; see the discussion here for some details); and durability and consistency are goals of any filesystem, though all filesystems make some tradeoffs (ZFS happens to strongly emphasize data integrity, even through hardware problems).
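To illustrate the point about locks on Unix-like systems being advisory, here is a minimal Python sketch using `fcntl.flock`. The helper names `try_exclusive_lock` and `demo` are made up for this example, and it assumes a local Linux file system (network file systems can behave differently). It shows that a cooperating opener is refused the lock, while a writer that never asks for the lock is not stopped.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Open path and try to take an exclusive advisory lock without blocking.

    Returns the open file object on success, or None if another open file
    description already holds the lock. (Hypothetical helper for this sketch.)
    """
    f = open(path, "a+")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        holder = try_exclusive_lock(path)   # first opener gets the lock
        second = try_exclusive_lock(path)   # cooperating code is refused
        # Code that never calls flock() is not stopped by the kernel.
        with open(path, "w") as rogue:
            rogue.write("unlocked write")
        with open(path) as f:
            content = f.read()
        holder.close()                      # closing the file releases the lock
        return holder is not None, second is None, content
    finally:
        os.unlink(path)
```

So isolation between processes only holds if every program touching the file agrees to take the lock, which is why it ends up being an application-level concern.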

I've never written an FS myself, so parts of this paragraph are conjecture. The bigger problem with your approach is that being fully ACID is usually handled at the DBMS or application level, and I suspect that's for a good reason: it would be awfully inefficient to provide, for example, complete isolation of sequential transactions on files at the FS level. In the paper in the first link, the benchmarks show that the prototype FS is significantly slower than ext3, with higher overhead for just about every operation.
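As an illustration of handling this at the application level, here is a minimal Python sketch of the common write-to-temp-file, `fsync`, then rename pattern, which gives commit-or-rollback behaviour for a single file. The name `atomic_write` is made up for this example, it assumes a POSIX file system where a rename within one directory is atomic, and it does not cover transactions spanning several files or directories.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_write(path):
    """Replace the file at path all-or-nothing (hypothetical helper).

    The new content goes to a temporary file in the same directory. On
    success it is flushed to disk and renamed over the target ("commit").
    On any exception the temporary file is deleted and the target is left
    untouched ("rollback").
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)  # created with mode 0600
    try:
        with os.fdopen(fd, "w") as tmp:
            yield tmp
            tmp.flush()
            os.fsync(tmp.fileno())      # push file data to stable storage
        os.replace(tmp_path, path)      # atomic rename: the commit point
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)            # make the rename itself durable
        finally:
            os.close(dir_fd)
    except BaseException:
        # Rollback: discard the temporary file if it still exists.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

Used as `with atomic_write("report.txt") as f: f.write(...)`, readers see either the old content or the complete new content, never a partial write.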

Evi1M4chine
Yes, I may be approaching this wrong. Your comment may not be a solution per se, but it is very useful for getting some perspective. … The thing is: I don't care about speed *at all* here. Speed has always led to cutting corners that shouldn't be cut. My goal is absolute, uncompromising reliability. But for my *files*. (Actually I'm planning to go way beyond ACID, targeting CAP and concepts implemented in the GHC compiler, which would solve speed problems too, but let's not go there until ACID itself is at least answered :)
Evi1M4chine
Journaling file systems only journal their metadata, AFAIK. Not the actual file data. (If I'm wrong, do you have an example of an FS that does? That would mostly solve it.) … But that still won't give me something like a `commit` and `rollback` to call after having written to files and changed directory structures or other metadata.
Zac Anger
@Evi1M4chine ACID + CA(P?) + ideas from Haskell sounds like an _extremely_ safe FS. I wouldn't use it, but I would be interested to see how it turns out. Journaling can be either metadata-only or block-level. Reiser and ext4 are examples of the latter (ext4, for instance, when mounted with `data=journal`), but they have their own issues. Lustre is the only currently maintained project I know of offhand that has both physical journaling and a change log, but the Wikipedia page comparing FS capabilities might be handy. Of course there's also COW (like in btrfs), which is a different solution to the journaling problem.
Zac Anger
@Evi1M4chine I don't know too much about btrfs besides what I saw at a few conference talks, but its snapshots and log tree design might be worth investigating. I don't think it meets all your requirements but if you wind up building something yourself, it could definitely be useful.
Evi1M4chine
I tried btrfs extensively, and had to conclude that it is an unsalvageable train wreck. It was fundamentally misdesigned. E.g. you couldn't even find out how much space a subvolume takes up without walking its entire tree! Because it was essentially just multiple directory trees in a single storage area. After that, I stopped using it. … Maybe they have fixed that now. But unless they rewrote it from scratch, the result would still be a Windows ME: crutches piled on crutches, built on top of the wrong skeleton. ;) … The best choice I know of right now is ZFS.