I want to find out how large the top-level folders are in a directory (on an NTFS volume) on a CentOS 7 server. This information is written to a Prometheus text file, which is then used to feed a Grafana dashboard.
The script, which a cron job runs every day, looks like this:
#!/usr/bin/env bash
# Generate Prometheus metrics about the disk usage of Jenkins projects on the system.
# Currently, only the size of each top-level folder is collected, in bytes.

# Truncate the previous text file so it can receive the newest data
: > /var/lib/node_exporter/textfile_collector/jenkins_projects_disk_size.prom

# Go into the NTFS directory containing all job data
cd /jenkins/jobs || exit 1

for f in *; do # Go through each top-level entry
    if [ -d "$f" ]; then # Skip anything that is not a directory
        # Run `du` for each top-level folder to calculate its size,
        # and convert the output to the Prometheus format
        prometheus_entry=$(du --block-size=1 --summarize "$@" -- "$f" |
            sed -ne 's/\\/\\\\/g;s/"/\\"/g;s/^\([0-9]\+\)\t\(.*\)$/jenkins_directory_size_bytes{directory="\2"} \1/p')
        echo "$prometheus_entry" >> /var/lib/node_exporter/textfile_collector/jenkins_projects_disk_size.prom
    fi
done
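One side effect of the script above: it truncates the .prom file first and then appends to it as each `du` finishes, so during a long run the file is empty or only partially filled. Below is a minimal sketch of a variant that writes to a temporary file and renames it into place only when every folder has been measured. It assumes GNU `du` and `mktemp`; the function names `escape_label`, `collect_sizes` and `write_metrics` are my own, and it does not make `du` itself any faster.

```bash
#!/usr/bin/env bash
# Sketch only: the function names below are illustrative, not part of any tool.

# Escape backslashes and double quotes for use inside a Prometheus label value.
escape_label() {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# Print one metric line per top-level directory found under $1.
collect_sizes() {
    local base=$1 dir size
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue   # the glob matched nothing
        dir=${dir%/}
        size=$(du --block-size=1 --summarize -- "$dir" | cut -f1)
        printf 'jenkins_directory_size_bytes{directory="%s"} %s\n' \
            "$(escape_label "${dir##*/}")" "$size"
    done
}

# Collect the sizes for $1 into a temporary file next to $2, then rename it
# over $2, so the old data stays readable until the new data is complete.
write_metrics() {
    local base=$1 out=$2 tmp
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    if collect_sizes "$base" > "$tmp"; then
        chmod 644 "$tmp"            # mktemp creates the file with mode 600
        mv -f -- "$tmp" "$out"
    else
        rm -f -- "$tmp"
        return 1
    fi
}
```

With these functions defined, the cron job would end with a call such as `write_metrics /jenkins/jobs /var/lib/node_exporter/textfile_collector/jenkins_projects_disk_size.prom`. The temporary file is created in the same directory as the target so that the final `mv` is a rename on one filesystem rather than a copy.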
This currently works, and runs relatively quickly, on a few servers whose directories (500 GB to 1.5 TB) are much smaller than the one on the server with issues.
The problem is that on the problematic server the folder is very large (50 TB). As expected at that size, the du/df commands are very slow (I estimate the script would need more than 15-20 hours to finish).
Is there a way to further optimize this process, or to use some sort of cache or another alternative (e.g. a different tool)? I already tried ncdu, but it is a GUI and I cannot extract the information in the form I need for Prometheus to work.
As previously mentioned, I only need top-level folder size information and nothing else. Any help or advice will be greatly appreciated! Thanks in advance.