Score:4

Missing ~900GB of hard drive space

ae flag

I have a 2TB (1.8T) WD M.2 drive that appears to be missing several hundred GB of storage space, but I cannot figure out where it has gone or what is taking it up. This was also happening on my Samsung SATA SSD, so I don't think it has anything to do with the drive itself. This is the only partition on either of these drives.

df says that I am using 1.3T of data

trever@server:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             32G     0   32G   0% /dev
tmpfs           6.3G  5.7M  6.3G   1% /run
/dev/nvme0n1p2  1.8T  1.3T  477G  73% /

Disk Usage Analyzer (baobab) says I am using ~850GB of space, which seems much more in line with what I expect. I ran it as root (sudo baobab) and had it scan the root drive /, and this is what it came back with:

[screenshot: Disk Usage Analyzer scan of / showing roughly 850GB in use]

System Monitor also says I am using 1.4T, which is fine; I understand there could be some rounding and/or differences in how disk space is calculated.

The 800-900GB of storage usage makes more sense to me. I have checked things like the reserved space:

trever@server:~$ sudo tune2fs -l /dev/sda1 | grep -i "block count"
[sudo] password for trever: 
Block count:              976754176
Reserved block count:     48837708
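
For reference, the reserved blocks alone account for roughly 186 GiB (the default 5%), assuming the usual 4 KiB block size (check the "Block size" line in the full tune2fs output):

# reserved blocks converted to GiB, assuming a 4 KiB block size
echo 'scale=1; 48837708 * 4096 / 2^30' | bc
# => 186.3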

I have also checked the size of /var/log (20GB) and /var/cache/apt/archives/ (380MB), and I still cannot figure out where hundreds of GB have gone.

Any more suggestions on what could be taking this space up?

Update: More and more space seems to go missing over time. Here is where I am now:

trever@server:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             32G     0   32G   0% /dev
tmpfs           6.3G  5.3M  6.3G   1% /run
/dev/nvme0n1p2  1.8T  1.6T  157G  92% /

And here is what I can account for:

root@server:~# du -cha --max-depth=1 --exclude=/Volumes/* / | grep -E "M|G"
42M /scripts
du: cannot access '/proc/44457': No such file or directory
du: cannot access '/proc/44535': No such file or directory
du: cannot access '/proc/44586/task/44586/fd/4': No such file or directory
du: cannot access '/proc/44586/task/44586/fdinfo/4': No such file or directory
du: cannot access '/proc/44586/fd/3': No such file or directory
du: cannot access '/proc/44586/fdinfo/3': No such file or directory
94G /var
du: cannot access '/run/user/1000/gvfs': Permission denied
5.3M    /run
2.1G    /swapfile
183M    /boot
14M /etc
538G    /docker
202G    /home
8.9G    /usr
6.0G    /snap
1.8G    /root
852G    /
852G    total

I have run fsck and the filesystem seems fine. I am not sure what is magically eating up my space, but it's quite concerning.

in flag
What file system are you using on your SSD? If it’s ZFS, you may have a bunch of space tied up in snapshots
trever avatar
ae flag
Sorry, I should have mentioned that; I'm using ext4.
in flag
I wonder how many “tiny files” you have, as this may be leaving a large number of blocks mostly empty. The difference between “file size” and “space a file uses” is generally not an issue people see until they start working with multi-TB storage devices
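A sketch of how to compare those two numbers for a directory, using du's --apparent-size flag (/home here is just an example path):

# allocated space vs. the sum of file lengths; a large gap either way
# points at sparse files or per-file block overhead
sudo du -sh /home
sudo du -sh --apparent-size /home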
za flag
Too long ago to remember exact details, I had similar problems with pre-20.04 releases, thrice. Once was a partition issue. Once was a weird file. Once was neither, and I could find no solution, nobody to help. I was left with no option but to perform a fresh, manual install. The fresh install needed about 20 minutes, and solved all issues. lol, expedience has value.
oldfred avatar
cn flag
Have you emptied the trash? Housecleaned? Some tools include things in their totals that others leave out. I prefer command-line tools like df, du (or ncdu) & lsblk, and then gparted if using a GUI. How did /var get to 85GB? My /var in Kubuntu 20.04 is 800MB.
cc flag
Check what's under any mount points too. See https://unix.stackexchange.com/questions/4426/access-to-original-contents-of-mount-point
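A sketch of that check using a bind mount (path names below are examples): bind-mounting / elsewhere exposes whatever is stored underneath the directories that other filesystems are normally mounted on top of.

# expose the raw contents of / without the filesystems mounted on top of it
sudo mkdir -p /mnt/rootonly
sudo mount --bind / /mnt/rootonly

# any sizeable directory here is data hidden under a mount point
sudo du -sh /mnt/rootonly/* 2>/dev/null | sort -h | tail

sudo umount /mnt/rootonly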
nobody avatar
gh flag
`findmnt` please.
heynnema avatar
ru flag
Does the drive have a MBR (dos) or GPT partition table? See `sudo fdisk -l`. Report back. Start comments to me with @heynnema or I'll miss them.
trever avatar
ae flag
@oldfred - yes, I have emptied the trash and everything else I could find. It appears `/var` is large because of Docker.
trever avatar
ae flag
@nobody - https://gist.github.com/Treverr/79cfc04baac54db0a3dd1e97b1b03fe7
trever avatar
ae flag
@heynnema - GPT (https://gist.github.com/Treverr/74b0bebf551e3b0cd187823fc08a995b)
Robert Riedl avatar
us flag
Do you use sparse files?
Satoshi Nakamoto avatar
lc flag
I don't think anything is missing at all: `df -h` says there is 477G available. Are you referring to unusual space usage on the SATA drive? Maybe it's an issue between the manufacturer and Ubuntu's handling of that drive. How was it before?
trever avatar
ae flag
@RobertRiedl I do not use them on this drive
trever avatar
ae flag
@Tyþë-Ø - What I'm seeing is that the drive's capacity (1.8T) minus the USED space does not come close to what is reported as available. It's off by nearly 500GB. I saw the same thing across 2 different drives when I moved from a SATA to an NVMe M.2 drive.
Satoshi Nakamoto avatar
lc flag
Sure, so you made a mistake. There are two kinds of units, MiB (mebibytes) and MB (megabytes); because of that conversion your 2TB drive will never show up as exactly 2T.
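
A rough sketch of that arithmetic, assuming the 1.8T shown by df is a binary (TiB) figure:

# 2 TB of decimal "marketing" capacity expressed in TiB
echo 'scale=2; 2*10^12 / 2^40' | bc
# => 1.81, i.e. roughly the 1.8T that df reports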
Satoshi Nakamoto avatar
lc flag
Besides, I see that udev and tmpfs occupy 38GB, and summing up: 477GB + 38GB = 515GB. I believe you have your answer.
ExploitFate avatar
zm flag
Could you run `fsck` and add output to the question?
trever avatar
ae flag
@ExploitFate I have run `fsck` recently and the filesystem is in good shape. I can run it again. But now `df` is showing that I only have 25GB available, and I still cannot figure out why.
trever avatar
ae flag
Anyone have any other ideas? Nothing suggested thus far has done anything, and I now have almost 1TB of missing space. Something is seriously wrong.
trever avatar
ae flag
@SatoshiNakamoto I'm not sure I understand what you mean. For example, right now `du` adds up to 852G while `df` says 1.6T has been used; that's nearly double, and I still can't find out where the other ~800GB is.
Robert Riedl avatar
us flag
I still think this is somehow related to how Docker handles files.
trever avatar
ae flag
Any idea how I could test this theory / confirm it? Stop all my docker services?
Score:2
in flag

Hypothesis: deleted but still opened files

Given the information provided so far, and the hint of a high volume of Docker-related files, I suspect this is caused by deleted-but-still-open files: files that are created by a program which then deletes the filesystem path while keeping the file descriptor open.

This may be the result of Docker activity on the filesystem.

Principle of solution

  • Reveal the deleted files that are still referenced by processes (see below for a way).
  • Ideally reveal process names and/or ID so that you have hints about what is happening.
  • Close those processes, observe that space is freed.
  • If all else fails, just reboot the machine. If space is freed, this is consistent with the hypothesis.

How to reveal the information

The commands below will test the hypothesis by finding and displaying what space is used by what file.

First glimpse

For a basic glimpse, you can issue this:

lsof -n | egrep -w "deleted|^COMMAND"

But this will also list a lot of in-memory-only pseudo-files that don't take up any actual storage space.

Example:

COMMAND       PID     TID TASKCMD              USER   FD      TYPE             DEVICE   SIZE/OFF       NODE NAME
Xorg         1183                              root   78u      REG                0,1          4       2058 /memfd:xshmfence (deleted)
Xorg         1183                              root   85u      REG                0,1          4       7182 /memfd:xshmfence (deleted)
Xorg         1183                              root   92u      REG                0,1          4       7137 /memfd:xshmfence (deleted)
Xorg         1183                              root   94u      REG                0,1          4       7870 /memfd:xshmfence (deleted)
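
If you only want a rough total rather than a listing, lsof's +L1 option (select open files with a link count below 1, i.e. deleted) can be summed with awk. A minimal sketch; SIZE/OFF is not always a real file size and column positions can vary between lsof versions, so treat the number as an estimate:

# sum the SIZE/OFF column of deleted-but-still-open files (run as root to see them all)
sudo lsof -nP +L1 2>/dev/null \
  | awk 'NR > 1 && $7 ~ /^[0-9]+$/ { sum += $7 }
         END { printf "%.1f GiB held by deleted-but-open files\n", sum / 2^30 }'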

Filtered simple list

This filters and mostly shows real files:

lsof -F "sn" -lnPX -M | sed -n 's|^n/|/|p' | grep deleted | egrep -v '^/(dev/shm|memfd:|proc)' | LC_ALL=C sort -n | uniq

Example:

/tmp/#someinodenumber (deleted)

Complete information, with size, process and task name

This is more interesting: it will list all files along with the space they occupy in bytes and more.

First, slow part, gather data

# You may want to run this part as root to make sure all is reported
lsof -F "ctsupMin" -lnPX -M >|/tmp/lfosoutput 

Then process and format for a nice display, complete and sorted by increasing size

# Can be run as regular user, no need for root
{ echo "SIZE^UID^PID^PROCESS NAME^TASK NAME^INODE^PATH"
</tmp/lfosoutput \
python3 -c $'import sys ; f={}
def g(c): return f.get(c,"(unknown)")
for line in sys.stdin:
 c=line[0] ; r=line[1:].rstrip() ; f[c]=r
 if c=="n" and f["t"]=="REG" \
    and "(deleted)" in f["n"] \
    and not f["n"].startswith("/memfd:") \
    and not f["n"].startswith("/dev/shm") :
  print(f'\''{g("s")}^{g("u")}^{g("p")}^\"{g("c")}\"^\"{g("M")}\"^{g("i")}^{g("n")}'\'')
  f={}' \
| LC_ALL=C sort -n | uniq
echo "SIZE^UID^PID^PROCESS NAME^TASK NAME^INODE^PATH"
} | column -t -s '^'

Sample output: a file of 36 megabytes used by Firefox

SIZE       UID        PID        PROCESS NAME       TASK NAME          INODE     PATH
36012032   1234       12345      "Isolated Web Co"  "StyleThread#2"    1234567   /tmp/mozilla-temp-12345 (deleted)
SIZE       UID        PID        PROCESS NAME       TASK NAME          INODE     PATH

(Actually there are many lines like these, this is only a sample line.)

Testing if the script indeed reveals such files by creating one

In another terminal, copy-paste this:

# Run python interactive interpreter
python3
# Now in Python
n="/tmp/whatever_file_name_you_want"
f=open(n,mode='a')
import os
os.unlink(n)
f.write("some sentence")
f.flush()
# Don't exit now or the file will really disappear

In the first terminal you can run both steps above (the slow lsof then the formatting part). And as long as the python process above is alive, this line is reported:

SIZE  UID   PID      PROCESS NAME  TASK NAME  INODE    PATH
13    1000  1387343  "python3"     "gdbus"    1308894  /tmp/whatever_file_name_you_want (deleted)
SIZE  UID   PID      PROCESS NAME  TASK NAME  INODE    PATH

You can then exit the python interpreter above (press Control-D or type exit(0)). If you run both parts (the slow lsof then the formatting part) you will observe that the test file no longer appears.

The script above can be modified to write huge amounts of data (like hundreds of gigabytes) and using your usual tools you'll see that the space is indeed freed only after the creating process has closed the file descriptor. Ending the process is enough to ensure the file descriptor is closed.
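
A shell-only sketch of that experiment (the path name is made up, and it assumes /tmp is on the root filesystem rather than tmpfs): hold a deleted file open on a file descriptor and watch df and du disagree until the descriptor is closed.

# open fd 3 on a scratch file, then delete its path
exec 3>/tmp/deleted_space_demo
rm /tmp/deleted_space_demo

# write ~1 GiB through the still-open descriptor
dd if=/dev/zero bs=1M count=1024 >&3

df -h /            # "Used" has grown by about 1G
sudo du -sh /tmp   # du no longer sees the file

exec 3>&-          # close the descriptor
df -h /            # the space is released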

Back to your case

Running this, you will most probably see process names, task names and files: either a few big files, like images that Docker fetched from the network, or a huge number of small files, again from Docker.

Or something else.
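
If Docker itself (rather than deleted-but-open files) turns out to be the main consumer, its own accounting is worth comparing with du. A sketch, assuming the default data root of /var/lib/docker (your du output shows a /docker directory, so adjust the path); note that docker system prune deletes unused data, so read its confirmation prompt first:

# Docker's view of the space used by images, containers, volumes and build cache
docker system df -v

# what the filesystem sees for Docker's data root
sudo du -sh /var/lib/docker

# optionally reclaim stopped containers, dangling images and the build cache
docker system prune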

Please tell if this helps you.

Robert Riedl avatar
us flag
Wouldn't a simple reboot also get rid of open files that are actually deleted? Or does this behave differently with Docker?
mike mcleod avatar
cn flag
I get **lsof: WARNING: can't stat() nsfs file system /run/docker/netns/f91a371e0e9d**, so this doesn't really give the answer. It does look like a Docker issue, though; try the Docker web site. Also, I have files in this state. There must be a way from the Ubuntu guys to clear these files?
mike mcleod avatar
cn flag
I found **/tmp/.org.chromium.Chromium.XXXXX** but when I closed chrome it disappeared! Beware!
trever avatar
ae flag
Thank you for all the information! I ran the `lsof` and, while I got a bunch of `can't stat() nsfs file system` warnings, I only got 2 actual files: `/home/trever/.local/share/gvfs-metadata/root (deleted)` and `/home/trever/.local/share/gvfs-metadata/root-5f2ee275.log (deleted)`, both of which are small. Would you expect there to be many more, or do you think the Docker issue is the culprit here?
in flag
`gvfs-metadata` is unrelated. The messages `can't stat() nsfs file system` indeed mean that you are missing information. Did you do the `lsof` as root?
trever avatar
ae flag
Ah I did not. I re-did it all and here is what I got: https://gist.github.com/Treverr/98137ab1cdc754dcef1b7d6dad6c937c
trever avatar
ae flag
@StéphaneGourichon Do you have any other ideas? I now have almost 1TB of missing space. Something is eating up "space" like crazy.
in flag
Based on the gist you published, the size of the deleted files is quite small, so I would say that hypothesis is not confirmed. Can you afford to stop and start the Docker services and see if that reclaims the space? To reboot the machine and see if that reclaims the space? What filesystem is this (e.g. `cat /proc/mounts`)? Cf. matigo's comment; it may be another cause.
trever avatar
ae flag
@StéphaneGourichon - Thanks for getting back to me! I did try that: I stopped and disabled the Docker service and rebooted, and it did not give me any space back. I even went as far as to uninstall Docker entirely and reboot; same thing. Here is what I got for `cat /proc/mounts`: https://gist.github.com/Treverr/a250ebe9041b2939c873b1c167749b4f Thanks again!