Score:1

How to explain that ZFS' snapshot reports 'WRITTEN' as 4.26GB, but the transfer size is actually 31.4GB?

When I run this query on a dataset

zfs list -d 1 -t all -o name,used,refer,written,compressratio sfg-backup/mx

I see the following stats:

zfs list -d 1 -t all -o name,used,refer,written,compressratio sfg-backup/mx
NAME                                           USED     REFER  WRITTEN  RATIO
sfg-backup/mx                                  300G      276G        0  1.80x
sfg-backup/mx@madcow_2023-04-15_23:15:00_UTC  4.04G      275G     275G  1.28x
...
sfg-backup/mx@madcow_2023-04-21_01:15:00_UTC     0B      276G        0  1.28x
sfg-backup/mx@madcow_2023-04-21_02:15:00_UTC     0B      276G    4.26G  1.28x
sfg-backup/mx@madcow_2023-04-21_03:15:00_UTC     0B      276G        0  1.28x

However, when I run a backup whose most recent common snapshot is madcow_2023-04-21_01:15:00_UTC, the transfer size is not 4.26 GB but 31.4 GB:

syncoid --no-sync-snap 10.0.1.2:sfg-backup/mx work/sfg/mx
NEWEST SNAPSHOT: madcow_2023-04-21_03:15:00_UTC
Sending incremental sfg-backup/mx@madcow_2023-04-21_01:15:00_UTC ... madcow_2023-04-21_03:15:00_UTC (~ 31.4 GB):
31.5GiB 0:03:16 [ 163MiB/s] [==================================================================================================>] 100%

Adding -c (compressed send) brings the size down to 4.3G (these are slightly different snapshots, but with more or less the same content):

zfs send -nv -c -I sfg-backup/mx@madcow_2023-04-24_00:15:00_UTC sfg-backup/mx@madcow_2023-04-24_03:15:00_UTC
send from @madcow_2023-04-24_00:15:00_UTC to sfg-backup/mx@madcow_2023-04-24_01:15:00_UTC estimated size is 215M
send from @madcow_2023-04-24_01:15:00_UTC to sfg-backup/mx@madcow_2023-04-24_02:15:00_UTC estimated size is 4.09G
send from @madcow_2023-04-24_02:15:00_UTC to sfg-backup/mx@madcow_2023-04-24_03:15:00_UTC estimated size is 624B
total estimated size is 4.30G

# without -c flag:
zfs send -nv -I sfg-backup/mx@madcow_2023-04-24_00:15:00_UTC sfg-backup/mx@madcow_2023-04-24_03:15:00_UTC
send from @madcow_2023-04-24_00:15:00_UTC to sfg-backup/mx@madcow_2023-04-24_01:15:00_UTC estimated size is 216M
send from @madcow_2023-04-24_01:15:00_UTC to sfg-backup/mx@madcow_2023-04-24_02:15:00_UTC estimated size is 31.3G
send from @madcow_2023-04-24_02:15:00_UTC to sfg-backup/mx@madcow_2023-04-24_03:15:00_UTC estimated size is 624B
total estimated size is 31.5G

Can you help me understand what causes this large discrepancy in sizes? Why is the compression ratio reported by ZFS 1.28x, while the ratio between the uncompressed and compressed transfers is 31.5/4.3 = 7.3x?

ewwhite:
`zfs get compressratio sfg-backup/mx`
dimus:
@ewwhite, good point, I added compress ratio to the question
shodanshok:
Please show `compressratio` for snapshots from `madcow_2023-04-21_01:15:00_UTC` to `madcow_2023-04-21_03:15:00_UTC`
dimus:
I added compressratio to the snapshots @shodanshok
shodanshok:
Can you show the output of `zfs send -nv -I sfg-backup/mx@madcow_2023-04-21_01:15:00_UTC sfg-backup/mx@madcow_2023-04-21_03:15:00_UTC`
djdomi:
don't post responses as comment, edit the question instead please
Score:2

WRITTEN shows the compressed (on-disk) size of the data written to the dataset/snapshot. Between sfg-backup/mx@madcow_2023-04-21_01:15:00_UTC and madcow_2023-04-21_03:15:00_UTC you wrote highly compressible data on top of previously incompressible data, without de-referencing the entire file.

I suppose you have some big file that is overwritten in place at random offsets (e.g. virtual disk image files, databases, etc.), and it just happened that you wrote 32G of raw data which became 4G of compressed data.

zfs send -c sends the compressed records as they are stored on disk, transferring only the compressed 4G delta. On the other hand, zfs send (without -c) decompresses the on-disk data, expanding it to the full 32G size.
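
To see where the expansion happens, the two `zfs send -nv` estimates from the question can be compared step by step. This is only a rough sketch: the sizes are copied by hand from the question's output, the helper name `ratio` is made up for illustration, and stream overhead is ignored.

```python
# Estimated stream sizes copied from the two `zfs send -nv` runs in the question.
# Each step is compared in its own unit, so only the ratios are meaningful.
steps = [
    # (step, size without -c, size with -c)
    ("00:15 -> 01:15", 216.0, 215.0),   # both in MB
    ("01:15 -> 02:15", 31.3, 4.09),     # both in GB
]


def ratio(uncompressed, compressed):
    """Expansion factor of a plain send relative to a compressed (-c) send."""
    return uncompressed / compressed


for step, raw, comp in steps:
    print(f"{step}: {ratio(raw, comp):.2f}x")

# Totals reported by zfs: 31.5G without -c, 4.30G with -c.
total = ratio(31.5, 4.30)
print(f"total: {total:.2f}x")
```

Almost all of the expansion comes from the single 01:15 -> 02:15 step (about 7.65x), while the first step barely compresses at all, which fits the idea of one burst of highly compressible writes.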

dimus:
Thank you for the reply @shodanshok. So how to explain the difference between the compression ratio reported by ZFS (1.28x) and the size ratio (7.3x) between the uncompressed `send` and the compressed `send`?
shodanshok:
1.28x is the `compressratio` of the *entire referenced data (276G)*. 1.28 x 276G = 353G of logical data, so there is ample margin for your 4.3G to expand into >30G
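
The arithmetic in the last comment can be spelled out as a small back-of-the-envelope check. It is only approximate: it assumes the snapshot's 1.28x ratio applies to the whole 276G REFER value, and it mixes the April 21 REFER figure with the April 24 send estimates, which the question describes as having more or less the same content. All variable names are illustrative.

```python
# Figures taken from the question (sizes in G as printed by zfs).
refer_physical = 276.0   # REFER: compressed, on-disk size of referenced data
compressratio = 1.28     # ratio reported for the snapshots
delta_physical = 4.30    # total estimate with `zfs send -c`
delta_logical = 31.5     # total estimate without -c

# Logical (uncompressed) size of everything the snapshot references.
refer_logical = refer_physical * compressratio

# Split the referenced data into the recent delta and everything else.
rest_physical = refer_physical - delta_physical
rest_logical = refer_logical - delta_logical

delta_ratio = delta_logical / delta_physical
rest_ratio = rest_logical / rest_physical

# Recombining both parts gives back the overall ratio.
blended = (rest_logical + delta_logical) / (rest_physical + delta_physical)

print(f"logical referenced: {refer_logical:.0f}G")
print(f"delta ratio:        {delta_ratio:.2f}x")
print(f"rest ratio:         {rest_ratio:.2f}x")
print(f"blended ratio:      {blended:.2f}x")
```

Under these assumptions the delta compresses at about 7.3x and the remaining ~272G at about 1.18x, and the two together average out to the reported 1.28x. A small, very compressible delta hardly moves the ratio of the whole dataset.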