Score:1

zfs zpool dedup stat seems very wrong - how to interpret these values?


I run a fileserver that receives backups of user containers. Two of the containers are poorly run Docker systems with hundreds of near-identical directories that use neither overlayfs nor ZFS clones. (I cannot touch the users' containers to fix this, nor do I seem able to educate the users.)

As a test, I copied the backup target ZFS fs's on my fileserver to newly created fs's in the same pool, but with dedup=on. The fs's are ~540GB and ~630GB. Dedup is off for the pool and for the other fs's (compression is on).

The copy is done, but zpool list now shows a dedup ratio of 2.92x on 25TB of data for the whole backup pool, when I have only copied about 1.15TB for these two fs's.

Zpool list:

NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
d      30.0T  25.7T  4.27T        -         -    41%    85%  2.92x    ONLINE  -

The zdb -DD output from this whole pool is:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    3.75M    454G    249G    250G    3.75M    454G    249G    250G
     2     359K   41.0G   38.8G   39.2G     752K   84.9G   80.2G   81.1G
     4    50.1K   4.43G   3.79G   3.94G     242K   21.2G   18.2G   18.9G
     8    37.3K   2.98G   2.62G   2.77G     396K   31.3G   27.6G   29.2G
    16    21.6K   1.63G   1.31G   1.39G     475K   34.9G   27.5G   29.4G
    32    20.5K    918M    699M    854M    1.06M   46.8G   35.7G   44.0G
    64    20.3K   1.07G    851M    991M    1.71M   93.6G   71.4G   82.8G
   128    5.13K    341M    297M    327M     813K   55.3G   47.6G   51.8G
   256    14.7K    725M    493M    592M    6.56M    328G    224G    267G
   512    1.09K   7.44M   6.51M   21.3M     838K   4.80G   4.21G   15.4G
    1K      119    300K    232K   1.88M     188K    427M    347M   2.95G
    2K       22     65K     53K    352K    60.5K    172M    143M    968M
    4K        9     33K   32.5K    144K    49.1K    150M    148M    786M
    8K        3   2.50K   2.50K     48K    36.0K   29.3M   29.3M    576M
   16K        5   2.50K   2.50K     80K     131K   65.7M   65.7M   2.05G
   32K        1    512B    512B     16K    62.8K   31.4M   31.4M   1004M
   64K        1    512B    512B     16K    90.9K   45.4M   45.4M   1.42G
 Total    4.27M    507G    298G    301G    17.1M   1.13T    786G    880G

dedup = 2.93, compress = 1.47, copies = 1.12, dedup * compress / copies = 3.84
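
For readers who want to reproduce the summary line, here is a minimal sketch that recomputes the three ratios from the Total row of the histogram. It assumes the usual definitions (dedup = referenced DSIZE / allocated DSIZE, compress = referenced LSIZE / referenced PSIZE, copies = referenced DSIZE / referenced PSIZE); the variable and function names are made up for illustration, and the inputs are the rounded values printed above, not exact byte counts.

```python
# Totals row of the zdb -DD histogram, in GiB (rounded as printed).
# 1.13T is taken as 1.13 * 1024 GiB; zdb itself works on exact byte counts.
alloc_dsize = 301.0          # allocated DSIZE
ref_lsize = 1.13 * 1024      # referenced LSIZE
ref_psize = 786.0            # referenced PSIZE
ref_dsize = 880.0            # referenced DSIZE


def ddt_ratios(alloc_dsize, ref_lsize, ref_psize, ref_dsize):
    """Return (dedup, compress, copies) under the assumed definitions."""
    dedup = ref_dsize / alloc_dsize
    compress = ref_lsize / ref_psize
    copies = ref_dsize / ref_psize
    return dedup, compress, copies


dedup, compress, copies = ddt_ratios(alloc_dsize, ref_lsize, ref_psize, ref_dsize)
combined = dedup * compress / copies

print(f"dedup = {dedup:.2f}, compress = {compress:.2f}, "
      f"copies = {copies:.2f}, dedup * compress / copies = {combined:.2f}")
```

With the rounded inputs this gives 2.92 for dedup rather than the 2.93 that zdb prints, which is plausibly just rounding in the displayed sizes. Note that every input comes from the dedup table alone; nothing from the rest of the 25.7T pool enters the calculation.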

(One cannot run zdb -DD on a filesystem, just a whole pool.)

As for the used vs logicalused on the actual fs's, I see:

d/ntgor4  used                  398G                   -
d/ntgor4  logicalused           542G                   -
d/ntgor5  used                  528G                   -
d/ntgor5  logicalused           629G                   -

So there is not much saving at all, yet the dedup value is quite high, and the histogram suggests some large savings (especially in the referenced block counts, and most of all at refcnt 256).

Reading the histogram and checking PSIZE and DSIZE, I'm assuming it only refers to the fs's with dedup enabled, not the whole pool, and that zpool list's dedup factor therefore does as well. (Unless it's including snapshots in the dedup?)
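
A quick cross-check of the figures above, as a hedged sketch: it compares the two datasets' `used` totals with the allocated and referenced DSIZE from the histogram. The assumption, not confirmed in this thread, is that per-dataset `used` is charged at the full referenced size and that dedup savings only show up in pool-level accounting. The dictionary and variable names are hypothetical.

```python
# Per-dataset figures from `zfs get used,logicalused`, in GiB.
datasets = {
    "d/ntgor4": {"used": 398, "logicalused": 542},
    "d/ntgor5": {"used": 528, "logicalused": 629},
}

# Totals from the zdb -DD histogram, in GiB.
ddt_allocated_dsize = 301    # what the deduped blocks occupy on disk
ddt_referenced_dsize = 880   # what they would occupy without dedup

total_used = sum(d["used"] for d in datasets.values())

# logicalused / used per dataset: this would reflect compression (and
# copies), not dedup, if the assumption above holds.
ratios = {name: d["logicalused"] / d["used"] for name, d in datasets.items()}

print(f"sum of used: {total_used}G")
print(f"DDT referenced DSIZE: {ddt_referenced_dsize}G, "
      f"allocated DSIZE: {ddt_allocated_dsize}G")
for name, ratio in ratios.items():
    print(f"{name}: logicalused/used = {ratio:.2f}")
```

The summed `used` (926G) sits close to the referenced DSIZE (880G) and far from the allocated DSIZE (301G), which is consistent with the per-dataset numbers not reflecting the dedup savings at all.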

Can someone explain how to interpret this and why the dedup factor is so high?

shodanshok: Can you show the output of `zpool list`?
math: Added to the original post. I'm wondering whether turning dedup on for any filesystem suddenly makes dedup for snapshots count as well. There are numerous snapshots of this pool's fs's.
shodanshok: OK, I would also ask for `zfs list -o space`. Thanks.
Score:0

The dedup figure in zpool list only covers blocks tracked in the DDT (dedup table), i.e. data written to fs's with dedup enabled. All other fs's and disk usage in the pool are ignored and not counted.

This is IMHO misleading. My 2.93x dedup ratio applies only to the ntgor4 and ntgor5 fs's.
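
To put the 2.93x in pool-wide terms, here is a rough sketch that spreads the DDT savings over everything allocated in the pool. It assumes the space saved is simply referenced DSIZE minus allocated DSIZE from the zdb totals, and it treats ALLOC from zpool list as directly comparable, which may not hold exactly (for example if ALLOC includes parity on a raidz pool). The names are illustrative only.

```python
# Figures from the question, converted to GiB.
pool_alloc_gib = 25.7 * 1024     # ALLOC from `zpool list`
ddt_allocated_gib = 301.0        # allocated DSIZE, zdb -DD Total row
ddt_referenced_gib = 880.0       # referenced DSIZE, zdb -DD Total row

# Space the pool would additionally need if dedup were off.
saved_gib = ddt_referenced_gib - ddt_allocated_gib

# Ratio over deduped data only vs. ratio over the whole pool.
ddt_only_ratio = ddt_referenced_gib / ddt_allocated_gib
pool_wide_ratio = (pool_alloc_gib + saved_gib) / pool_alloc_gib

print(f"saved by dedup: {saved_gib:.0f}G")
print(f"ratio over deduped blocks only: {ddt_only_ratio:.2f}x")
print(f"rough ratio over the whole pool: {pool_wide_ratio:.2f}x")
```

Under these assumptions the saving is about 579G, which is large relative to the two deduped fs's but only around 2% of the whole pool.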
