Score:1

Prefetched Small File Cache


First, my use case: on my Linux-based server I am getting unsatisfactory disk I/O performance for small files, and am limited to the roughly 100 IOPS that a 7200 rpm HDD will support. This is of course expected, and I am looking for a way to improve performance. It is especially problematic because I am working with code bases comprising tens of thousands of source files and object files. The total amount of data is too large to store economically on SSDs, and separating the large files (which take up the majority of the storage) from the small files is not possible.

The typical solution would be to use a cache system like lvmcache, but as I understand it, in the standard configuration it only provides a performance benefit for frequently used files (please correct me if I'm wrong!). That does not fit my use case: the files are accessed quite randomly and rarely.

Thus the question: Is it possible to configure a cache to prefetch small files and does this make sense? They only make up a small percentage of the total storage utilization and would fit completely on an SSD. I would like them to live there permanently for on-demand access. I see no inherent technical issue, but I was unable to find any such documented behavior, except for some supercomputer data storage systems ^^

Score:1

Quantify what acceptable performance would be. Perhaps downloading all of a small project should take no more than one or two seconds. Performance objectives defined in terms of user experience make for well defined goals.
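
One way to get such a baseline (a rough sketch; the path is just a placeholder) is to time a cold read of a representative project tree:

    # Drop the page cache first so the measurement reflects cold HDD reads (run as root).
    sync && echo 3 > /proc/sys/vm/drop_caches
    # Read every file in the tree once and count the bytes; this is the "download everything" case.
    time tar cf - /srv/code/project | wc -c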

Review how the files are being stored. Tens of thousands of small files approaches the worst case: lots of I/Os for both file data and metadata. Databases or archives would be better, packaging things up into larger bundles with fewer I/Os; in other words, version control systems and tar archives, especially when dealing with code over time.
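
As a sketch of what that can look like (repository and archive paths are placeholders): git already packs loose objects into a few large packfiles, and rarely-touched trees can be collapsed into single tarballs:

    # Repack loose git objects into packfiles: fewer, larger I/Os.
    git -C /srv/code/project gc
    # Turn a rarely-used source tree into one file instead of thousands.
    tar czf /srv/archive/project-2021.tar.gz -C /srv/code project-2021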

This being Linux, developers love to reinvent the wheel, so there are many block cache implementations; the best maintained are probably lvmcache and bcache. At least both of those are in the mainline kernel, which results in comparison tests like this. It looks like RHEL is not prepared to support bcache, though.

It is not possible to make a hybrid block device as fast or as easy to use as an all-flash setup. There will be cache misses. There will be failures of the cache device, at which point you had better know whether it is in writethrough or writeback mode, and whether recovery involves data loss. Those are the trade-offs for less expensive storage overall.
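
With lvmcache, for example, the mode is selected (and can be changed later) with --cachemode; writethrough keeps the HDD copy authoritative so a failed SSD loses no data, while writeback is faster but holds dirty blocks only on the SSD until they are flushed. The VG and LV names below are placeholders:

    # Safer: every write also hits the origin device before completing.
    lvchange --cachemode writethrough vg0/data
    # Faster for writes, but data can be lost if the cache SSD dies.
    lvchange --cachemode writeback vg0/data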

Being block devices, these caches sit one level below the file system and are unaware of small files. However, depending on how deep you want to get into tuning, they may be able to detect sequential block I/O, which may be an acceptable proxy depending on how fragmented the files are.
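
To get a feel for how fragmented the files actually are (paths are placeholders), filefrag from e2fsprogs reports the extent count per file:

    # Files stored in one or two extents will mostly produce sequential block I/O.
    filefrag /srv/code/project/src/*.c
    # -v lists the individual extents of a single file.
    filefrag -v /srv/code/project/src/main.c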

A distro with good storage documentation will cover lvmcache; here are the lvmcache examples from RHEL 9. You would want type cache; caching only writes via writecache will not be a sufficient boost.
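
A minimal sketch of attaching an SSD cache to an existing logical volume, assuming the slow LV is vg0/data and the SSD is /dev/nvme0n1 (all names are placeholders):

    # Add the SSD to the existing volume group.
    vgextend vg0 /dev/nvme0n1
    # Create the cache volume on the SSD.
    lvcreate --size 100G --name cachevol0 vg0 /dev/nvme0n1
    # Attach it to the origin LV as a read/write cache (type cache, not writecache).
    lvconvert --type cache --cachevol cachevol0 --cachemode writethrough vg0/data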

Beware: the underlying dm-cache tunables mention "sequential_threshold", but it no longer has an effect. Modern kernels replaced the original (mq) cache replacement policy with a faster one (smq) that has no such knobs.

Plain block caches do not have a prefetch mechanism, especially not for a targeted subset of files; again, the block layer does not know about files. Something has to issue I/O for the cache to learn that a block is hot. Digging through the Server Fault archives, some people have pre-warmed caches simply by reading the files.
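
A rough sketch of that pre-warming approach, assuming the small files live under /srv/code (placeholder): read them once, say from a timer after boot, so the cache layers below see them as recently used:

    # Read every file under 1 MiB once; the reads populate the block cache and page cache.
    find /srv/code -type f -size -1M -print0 | xargs -0 cat > /dev/null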

Note that RAM is still faster than solid state, and Linux always maintains a file cache in memory. More RAM would increase the working set this cache can hold, although it would still be slow at first, until the hit ratio improves. I recommend investing in all-flash storage before throwing excess RAM at this problem, however.
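
You can see how much RAM the kernel is already using for that file cache in the buff/cache column of free:

    # buff/cache is memory currently holding cached file data and metadata.
    free -h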

Comment by lte678 (the asker):
Thank you very much for the extensive answer! I will accept it as the solution, but for anyone else coming along I would like to note that, after further digging, another solution that matches this use case very well is ZFS metadata special devices (https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954), which allow "sorting" of files and their metadata onto different devices according to size. I will have to see which works better.
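
For reference, a hedged sketch of that ZFS approach (pool, dataset, and device names are placeholders): a special vdev holds metadata, and special_small_blocks additionally redirects small records to it:

    # Add a mirrored SSD special vdev to an existing pool (mirror it: losing it loses the pool).
    zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1
    # Store records of 64K and smaller for this dataset on the special vdev.
    zfs set special_small_blocks=64K tank/code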
Score:0

Try pinning them into the page cache with vmtouch (https://hoytech.com/vmtouch/). Given enough RAM, it will speed up access times to your files.
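
A sketch of that with vmtouch, using a placeholder path: -t touches the files into memory, and -dl runs a daemon that keeps them locked in the page cache:

    # Load the tree into the page cache once and report how much is resident.
    vmtouch -vt /srv/code
    # Keep it resident: lock the pages and stay running as a daemon.
    vmtouch -dl /srv/code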

Also, think about an SSD; prices have been dropping lately.
