Score:3

Why does my SLOG slow down sequential sync writes?

cn flag

I jury-rigged a TrueNAS server to run a few tests in preparation for a proper NAS build. I got hold of an older Xeon server with 256 GB RAM and a 10 GbE NIC, equipped it with six SATA disks configured as three mirror vdevs, and installed the latest TrueNAS Core version. I also added a single Samsung 850 Pro SSD as SLOG (I know, it's just for a rough performance test).

The dataset is mounted via NFS (with default settings, i.e. sync enabled) and bonnie++ was used to test read and write performance. Results were roughly as expected, but as soon as I removed the SLOG from the pool, the block sequential write performance increased by a factor of almost five (480 MB/s vs. 107 MB/s). Everything else remained equal.

I don't understand how ZFS arranges the ZIL across the pool disks when no SLOG is present, but even if it can use the throughput of all six spinning disks, I would expect it to be slower than the nominal 500 MB/s provided by the SSD.

I found an issue on GitHub that seems to explain such behaviour but it was resolved some time ago. Is there another explanation?

Edit (answering questions in comments and further tests):

sync is set to standard; I left all parameters at their default values (except atime=off, compression=on).
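For anyone who wants to verify the same properties, something like this should do (the dataset name `tank/test` is a placeholder for your own pool/dataset):

```shell
# Show the properties mentioned above for a given dataset
# ("tank/test" is a placeholder dataset name)
zfs get sync,atime,compression tank/test
```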

The dd and fio tests suggested by shodanshok gave consistent results: ~570 kB/s with SLOG vs. ~380 kB/s without, as one would expect.

I ran a few more fio workloads and will show the SLOG / no-SLOG speedup for each of them below. Each of these ran without an explicit sync flag, but due to the NFS sync mount there should still be syncs at the end of each file or loop cycle:

  • --rw=write --loop=1000 --size=8k: 1.67
  • --rw=write --loop=10 --size=32M: 0.44
  • --rw=write --loop=1 --size=1G: 1.07
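For reference, the full invocations looked roughly like this (the mount point `/mnt/nfs` is a placeholder; the common options are carried over from shodanshok's suggested command):

```shell
# fio sequential-write workloads over the sync NFS mount
# ("/mnt/nfs" is a placeholder for the actual mount point)
fio --name=test --filename=/mnt/nfs/test.img --rw=write --loop=1000 --size=8k
fio --name=test --filename=/mnt/nfs/test.img --rw=write --loop=10   --size=32M
fio --name=test --filename=/mnt/nfs/test.img --rw=write --loop=1    --size=1G
```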

I did not do any further bonnie++ testing since there's an additional complication I did not mention initially: bonnie++ allocates a file twice as large as system memory to reduce caching effects. Since my test machine has a huge amount of RAM, I ran the test in a memory-limited cgroup (memory.limit_in_bytes=4G). There seems to be some interaction between the memory limit, the large-file write, and the SLOG/ZIL that I don't understand.
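For completeness, the cgroup limit was applied along these lines (cgroup v1 paths, as implied by memory.limit_in_bytes; the group name `bench`, the target directory, and the explicit file size are placeholders/assumptions, since bonnie++ would otherwise size its file from the full 256 GB of system RAM):

```shell
# cgroup v1: create a memory-limited group and run bonnie++ inside it
# (group name "bench", target directory, and -s size are placeholders)
mkdir /sys/fs/cgroup/memory/bench
echo 4G > /sys/fs/cgroup/memory/bench/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/bench/cgroup.procs  # move the current shell into the group
bonnie++ -d /mnt/nfs -s 8g -u nobody                # bonnie++ then inherits the 4G limit
```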

shodanshok avatar
ca flag
Can you try with `dd` and `fio`? I.e., something like `dd if=/dev/urandom of=/your/path/test.img bs=4k count=8192 oflag=sync` and `fio --name=test --filename=/your/path/test.img --rw=randwrite --fsync=1 --size=32M`
ewwhite avatar
ng flag
Are you testing a realistic or representative workload? Also, what are your ZFS filesystem settings; specifically the `sync` parameter?
Score:1
ca flag

Large sequential sync writes can be served efficiently by your main pool disks (and you have six of them vs. a single SSD), while a SLOG is mainly useful for small random sync writes. Moreover, the SSD you are using as SLOG is not particularly fast at sync writes (it has no power-loss-protected DRAM cache).

Your bonnie++ results seem skewed by client- or server-side caching, so I would discard them. As a best practice, always evaluate your system's performance with real-world workloads.
