Score:2

How can I completely defragment ext4 filesystem

us flag

I want all the files in an ext4 filesystem to be unfragmented, because of reasons. Sadly, `e4defrag` (advised here: How to defrag an ext4 filesystem) fails to defragment several files. What are my alternatives?

The filesystem already has all its files in it (they are not to be changed in any way) and it is almost full. There are some free blocks (according to `df -h`: 434M available of 85G, with 80G used) which can be used as a buffer. I do not need the filesystem mounted while defragmenting. Moreover, I have other filesystems available with enough space to use as a buffer.

One idea I have is to move the files to another filesystem and then copy them back, somehow telling the filesystem to store them contiguously.
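For a single file, I imagine something like the following could work (a rough sketch; the paths are placeholders, and as far as I know `fallocate` only asks ext4's multi-block allocator for the whole range in one request, so contiguity is still not guaranteed):

$ size=$(stat -c %s /mnt/buffer/file.file)    # size of the staged copy, in bytes
$ fallocate -l "$size" /mnt/data/file.file    # preallocate the whole range in one request
$ dd if=/mnt/buffer/file.file of=/mnt/data/file.file bs=4M conv=notrunc    # fill it without truncating the allocation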

[EDIT]

I have just found that I cannot rely on `e4defrag` output. It counts files with more than one extent as fragmented even when it knows that the extents are contiguous:

$ sudo filefrag file.file
file.file: 1 extent found
$ sudo e4defrag -vc file.file
e4defrag 1.45.5 (07-Jan-2020)
<File>
[ext 1]: start 22388736: logical 0: len 32768
[ext 2]: start 22421504: logical 32768: len 32768
[ext 3]: start 22454272: logical 65536: len 32768
[ext 4]: start 22487040: logical 98304: len 27962

 Total/best extents 4/1
 Average size per extent 126266 KB
 Fragmentation score 0
 [0-30 no problem: 31-55 a little bit fragmented: 56- needs defrag]
 This file (file.file) does not need defragmentation.
 Done.
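For reference, this is how I scan the whole tree for files that `filefrag` itself considers fragmented (a rough sketch; the mount point is a placeholder, and a path containing ": " would confuse the parsing):

$ sudo find /mnt/data -type f -exec filefrag {} + \
    | awk -F': ' '$NF + 0 > 1'    # keep only lines reporting more than one extent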
Mahler avatar
in flag
There is no need for defragmentation on ext4.
in flag
Is the purpose of this to shrink an `ext4` virtual disk image?
24601 avatar
in flag
Does this answer your question? [How to defrag an ext4 filesystem](https://askubuntu.com/questions/221079/how-to-defrag-an-ext4-filesystem)
abukaj avatar
us flag
@24601 No, I have found that some time ago. `e4defrag` fails and I cannot use `gparted` resize trick due to lack of space on the device.
abukaj avatar
us flag
@Mahler then what is your way to have **all** files contiguous?
Mahler avatar
in flag
I have an SSD, and defragmentation is bad for it. As for HDDs, defragmentation is needed only if there are problems.
in flag
@Mahler The problem is that ‘problems’ are very much contextual. Defragmenting files on ext4 can in fact give measurably better performance for sequential or large reads, even on flash storage (making thousands of small requests is still more expensive than making one big one, even if you have no seek time).
in flag
This sounds very much like an [XY problem](https://xyproblem.info/). If you can explain why you think you need exactly zero fragmentation, people can probably give you a better answer. Based on your comments about the filesystem reasonably being read-only though, ext4 is probably not the best choice here, and you should be looking at either [SquashFS](https://www.kernel.org/doc/html/latest/filesystems/squashfs.html) or [EROFS](https://www.kernel.org/doc/html/latest/filesystems/erofs.html) instead (or possibly CramFS, but that’s inferior to either SquashFS or EROFS in most ways).
Mahler avatar
in flag
A solid-state drive (SSD) does not have a read/write head, so defragmentation does not make sense. On an SSD, data is stored on memory chips, so it can be read much faster.
abukaj avatar
us flag
@AustinHemmelgarn Thanks for the suggestion - I will look into that if I manage to get rid of the ext4 requirement and not the no fragmentation requirement ("reasons"). But my question is purely technical here: "how to do Y". If I wanted to have X solved, I would have asked for it explicitly.
abukaj avatar
us flag
@Mahler http://www.hanselman.com/blog/the-real-and-complete-story-does-windows-defragment-your-ssd
in flag
@Mahler Each individual IO request still has a cost in the OS. It’s not ‘free’ to ask the drive for data: the OS has to set up the region of memory which the data will be transferred to, actually send the request to the storage device (which can be _very_ time consuming depending on how it’s connected), wait for completion (yes, there is a wait, even for an SSD), and then once it has the data clean everything up. That overhead is _per-request_, so by issuing larger requests you get lower overhead and faster bulk data transfers.
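A rough way to see this per-request overhead for yourself (a sketch: `bigfile` is any large placeholder file, `iflag=direct` bypasses the page cache so both runs actually hit the device, and direct I/O can fail on a final partial block, so treat this as illustrative only):

$ dd if=bigfile of=/dev/null bs=4k iflag=direct    # many small requests
$ dd if=bigfile of=/dev/null bs=4M iflag=direct    # few large requests; typically much faster

Both runs transfer the same data; dd's throughput summary shows the difference the request size makes.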
Mahler avatar
in flag
When working with SSD drives, modern versions of Windows disable defragmentation by default, using the TRIM function instead.
Mahler avatar
in flag
There is fstrim utility on Linux.
abukaj avatar
us flag
@Mahler How does TRIM fix the issue of maximum file fragmentation?
Mahler avatar
in flag
TRIM is used instead of defrag for SSDs.
abukaj avatar
us flag
@Mahler TRIM erases SSD pages. How does it keep file fragmentation below maximum? I mean: _If an SSD gets too fragmented you can hit maximum file fragmentation (when the metadata can’t represent any more file fragments) which will result in errors when you try to write/extend a file._ Not to mention that _more file fragments means more metadata to process while reading/writing a file, which can lead to slower performance._
Mahler avatar
in flag
I don't use defragmentation or TRIM. My SSD is half full; no problems have been noticed yet.
Mahler avatar
in flag
Possibly Ubuntu does it automatically.
Score:12
cn flag

I want all my files in an ext4 filesystem not fragmented because of reasons.

While there are legitimate reasons to defrag, none require every single file to be defragged and contiguous. The main reasons anyone might want every file to be defragged are OCPD-related, and chasing that is a complete waste of time because the file system will become "fragmented" again shortly after being mounted rw.

The filesystem... is almost full...

In that scenario, you probably will not be able to defrag every file: Linux defrag programs tend to work at the file level, and you do not necessarily have enough contiguous free space for each file.
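You can check how fragmented the remaining free space itself is with e2freefrag from e2fsprogs (a sketch; the device name is a placeholder, and the report is most reliable with the filesystem unmounted):

$ sudo e2freefrag /dev/sdX1    # prints a histogram of free-extent sizes

If most free extents are small, no file-level defragmenter can produce large contiguous files without relocating other data first.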

One idea I have is to move the files to other filesystem and then copy them back...

That is your most viable option. However, specific file allocation is determined by the file system driver.
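A sketch of that approach, assuming /mnt/buffer is the spare filesystem and /mnt/data is the nearly-full one (verify the staged copy before deleting anything):

$ rsync -a /mnt/data/ /mnt/buffer/staging/    # stage everything elsewhere
$ sudo find /mnt/data -mindepth 1 -delete     # empty the filesystem, keeping the mount point
$ rsync -a /mnt/buffer/staging/ /mnt/data/    # copy back; the driver still decides placement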


To reorder blocks it is enough to have one free block as a buffer.

Linux file system devs have not given defrag the same priority that Windows devs have. So the problem is not so much that it is technically impossible, but that no one has bothered to write any programs to do so.

The fs may be set to ro after defragmentation.

Then use a filesystem designed for ro use, like squashfs. All files will be defragged, contiguous, and even compressed.
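For example (a sketch; the paths are placeholders, and -comp zstd requires squashfs-tools 4.4 or newer):

$ mksquashfs /mnt/data data.squashfs -comp zstd    # pack the tree into one compressed, read-only image
$ sudo mount -o loop data.squashfs /mnt/ro         # mount it like any other filesystem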

abukaj avatar
us flag
"_the file system will become "fragmented" again shortly after being mounted rw_" As I mentioned, no file is going to be changed. The fs may be set to ro after defragmentation. "_will not be able to defrag every file_" May you elaborate why is it so? To reorder blocks it is enough to have one free block as a buffer. "_That is your most viable option._" How do I tell the ext4 not to leave empty blocks between files?
xiota avatar
cn flag
Edited to answer some of your questions.
Score:4
cn flag

If some of your files are big, it might be technically impossible to defragment them all without reformatting the filesystem.

Any ext4 filesystem is composed of a sequence of block groups. By default, each block group is 128 MiB long.

Each block group starts with a bunch of filesystem metadata (superblock, group descriptor, allocation bitmaps, and inode tables) followed by actual data blocks used by files belonging to that block group. This means that filesystem metadata are scattered mostly uniformly across the entire device.

However, thanks to the optional flex_bg feature, several block groups can be aggregated into a single bigger one. mke2fs has been creating filesystems with 16 block groups packed together by default since 2008 or so. Assuming you didn't change this with the -G option when making the filesystem, your filesystem is thus likely split into 2-GiB flex groups.
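You can check what your filesystem actually uses (a sketch; the device name is a placeholder):

$ sudo dumpe2fs -h /dev/sdX1 | grep -Ei 'blocks per group|flex'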

Unless all your files are significantly smaller than 2 GiB, you would thus inevitably run into a situation where the next file to store would have to be fragmented across two or more (flex) block groups. Of course this is guaranteed to happen if any of your files is bigger than the usable data blocks in a (flex) block group.

To achieve your goal, you will thus likely have to reformat the filesystem with a much higher setting of the -G option than the default 16 to make the filesystem use really big flex block groups.
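Something along these lines (a sketch, not tested on your setup; the device name is a placeholder, this destroys all existing data, and -G must be a power of two):

$ sudo mkfs.ext4 -G 512 /dev/sdX1    # 512 x 128 MiB block groups = 64 GiB flex groups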

abukaj avatar
us flag
Now I wonder why `e4defrag` reports "success" for files 900M+. ;)
TooTea avatar
cn flag
@abukaj Not sure what exactly you mean by "reporting success", e4defrag can print several different messages meaning different things. But in general, e4defrag is smart enough to calculate the "best possible number of fragments a given file could have with a given block group size" and uses that to find if a file is worth trying to defragment further. (See the source of [`get_best_count()`](https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/tree/misc/e4defrag.c#n1015).)
TooTea avatar
cn flag
@abukaj After further checking of the code, I must conclude that my original version of this answer was wrong. It's actually perfectly possible to make a filesystem with huge flex block groups using the standard tools (without any hacking). You might want to un-accept this answer if it doesn't apply anymore.
abukaj avatar
us flag
I meant output like:

$ e4defrag 986M.file
e4defrag 1.45.5 (07-Jan-2020)
ext4 defragmentation for 986M.file
[1/1]986M.file: 100% [ OK ]
Success: [1/1]

Meanwhile I ran `sudo e4defrag -v .` and had 986M.file listed as a fragmented file (now/best = 1/1)...
Score:0
ng flag

Back before DOS 6, the usual advice for defragmenting a FAT partition was:

  1. Copy the files from the partition to something else;
  2. Wipe the partition;
  3. Re-create the directory structure on the empty partition; and
  4. Copy the files back.

I never tried this because MS-DOS 6 came out (with its included defrag utility) before defragmenting came to be an issue for me.

abukaj avatar
us flag
The major problem with that approach is that "_Linux does not need defragmentation_", or rather that its filesystems are designed to avoid fragmentation. While a FAT filesystem packs files as close together as possible (which leads to fragmentation if you later append to a file), extX scatters files around the disk, leaving free blocks after them in case they grow. That means that as you approach the filesystem's capacity (which should be avoided in most circumstances), your copied files start to be fragmented.
ng flag
@abukaj I'm sure you're correct. And I haven't looked into it deeply enough to offer more than the paraphrased "Four Yorkshiremen" skit. ("You had a File Allocation Table? **Luxury!**") Seems to me that if conventional tools aren't giving you what you want, though, and you insist on ext4, the old-school approach is something you can try. Now that I think of it, because you don't intend to change these files, what if you created a UDF-formatted disk image, moved the files there, and mounted it when you needed to? Would that be too slow?
abukaj avatar
us flag
I think I will either drop ext4 or the "no fragmentation" requirement, whichever is easier (see the accepted answer). BTW: I have not known the original skit.
ng flag
@abukaj [The Four Yorkshiremen](https://youtu.be/VKHFZBUTA4k) is one of Monty Python's more famous sketches. Four well-off men try to one-up one another over their difficult childhoods.