Score:1

Why is my zfs backup so small?


While asking this question, I have figured it out and will be self-answering.

I have backed up my data (tank) to a file-based zfs pool on an NTFS drive (indoorpool). However, it now looks like the backup is taking up a lot less space than the original. Why is this? Was my backup done fully and successfully?

$ zfs get compression,compressratio,used tank indoorpool
NAME        PROPERTY       VALUE     SOURCE
indoorpool  compression    off       default
indoorpool  compressratio  1.53x     -
indoorpool  used           261G      -
tank        compression    lz4       local
tank        compressratio  1.32x     -
tank        used           457G      -

I find it odd that

  • Compression is disabled on the backup but enabled on the original
  • Despite that, there is a high compressratio
  • The backup is taking up 196G less space
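One way to take compression out of the comparison would be to look at logicalused, which reports how much space the data would occupy before compression (just an idea for a cross-check, not output I captured at the time):

# logical (pre-compression) size next to the physical size on each pool
zfs get used,logicalused,compressratio tank indoorpool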

Backup Creation

# create a sparse 1 TB file to hold the backup pool
sudo truncate -s 1T /media/generic/My\ Passport/indoorpool_20230206.zpool

# create zfs pool inside that file
sudo zpool create indoorpool /media/generic/My\ Passport/indoorpool_20230206.zpool 

# send data from my original pool to the backup
# -R sends a replication stream including all descendant datasets and snapshots
# -w sends the stream raw (the original is encrypted, so the blocks go across as-is)
sudo zfs send -Rw tank/ds1@backup2302062136 | sudo zfs receive -u -d indoorpool

I let this run overnight and it did not report any errors.
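As an extra sanity check one could also scrub the backup pool afterwards, which re-reads every block and verifies its checksum (a suggestion, not something I did as part of the backup run):

sudo zpool scrub indoorpool
# shows scrub progress/result and any checksum errors
sudo zpool status -v indoorpool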

No Snaps are Missing

Comparing the outputs of

zfs list -t snap -s creation -r tank/ds1
zfs list -t snap -s creation -r indoorpool/ds1

does not show a difference. Same number of lines (5996, because Docker likes to create ZFS datasets >.<).
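For anyone who wants to compare more than the line count, diffing the snapshot names with the pool prefix stripped is one option (a bash sketch; the sed patterns are only illustrative):

# empty output means both sides contain exactly the same snapshot names
diff \
  <(zfs list -H -o name -t snap -r tank/ds1       | sed 's|^tank/||') \
  <(zfs list -H -o name -t snap -r indoorpool/ds1 | sed 's|^indoorpool/||')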

Score:0

I missed that I had some datasets outside tank/ds1.

How to List all Datasets

zfs list -r tank

Use sed to get rid of the clutter that the Docker datasets add to the list command's output:

zfs list | sed -E '\#/u18/.{12}#,+1d'

This gives a good overview of all the datasets and mountpoints that I created manually, but deletes every occurrence of /u18/asdfasdfasdfasdf together with the following line, because that is where my six thousand auto-generated datasets reside.
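In hindsight, sorting by space used would have pointed straight at the forgotten datasets; something along these lines (illustrative, the tail length is arbitrary):

# biggest space consumers under tank, largest last
zfs list -r -o name,used,mountpoint -s used tank | tail -n 20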

My Solution

I had a dataset tank/nosnap of size 186G and a tank/var dataset of a few kilobytes that I completely forgot about.

Subtracting 186G from the original pool's allocated 457G in the zpool list output

zpool list
NAME         SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
indoorpool  1016G   261G   755G        -         -     0%    25%  1.00x    ONLINE  -
tank         572G   457G   115G        -         -    54%    79%  1.00x    ONLINE  -

leaves me with 271G, which roughly matches the USED value of the backup (indoorpool).
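The per-dataset numbers can also be double-checked directly, e.g.:

# the datasets that were never part of the send
zfs list -o name,used tank/nosnap tank/var
# versus the dataset that was actually sent and received
zfs list -o name,used tank/ds1 indoorpool/ds1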

The only thing I still don't get is why the backup shows a compressratio above 1 when compression is reported as off (or why it is off at all).
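A way to dig into that would be to look at the properties per dataset rather than only at the pool roots (just a suggestion for further poking):

# check whether the ratio comes from the received child datasets
# rather than from the pool's root dataset itself
zfs get -r compression,compressratio indoorpool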
