
What are the differences between the ext4 features dir_nlink and large_dir?


From the ext4(5) man page:

dir_nlink
Normally, ext4 allows an inode to have no more than 65,000 hard links. This applies to regular files as well as directories, which means that there can be no more than 64,998 sub‐directories in a directory (because each of the '.' and '..' entries, as well as the directory entry for the directory in its parent directory counts as a hard link). This feature lifts this limit by causing ext4 to use a link count of 1 to indicate that the number of hard links to a directory is not known when the link count might exceed the maximum count limit.

large_dir
This feature increases the limit on the number of files per directory by raising the maximum size of directories and, for hashed b-tree directories (see dir_index), the maximum height of the hashed b-tree used to store the directory entries.

Okay, let's take a look at dir_index too.

dir_index
Use hashed b-trees to speed up name lookups in large directories. This feature is supported by ext3 and ext4 file systems, and is ignored by ext2 file systems.


My understanding of the man page is as follows:

  1. When there is no item in a directory, there are 2 hard links, namely . (the directory pointing to itself) and .. (this points to the parent directory, but the hard link refers to the reference from the parent directory to the directory), and the value of the st_nlink field in the stat structure is 2. When there is 1 item (say, 1 file) in a directory, there are 3 hard links, and st_nlink says 3;
  2. Therefore, a st_nlink value that represents the actual number of hard links must be at least 2. As a result, the value 1 is free to mean something else, since a literal count of only 1 hard link would not make sense for a directory;
  3. If the ext4 file system is formatted without dir_nlink, then st_nlink cannot be greater than 65000, and the system refuses to add more items once the limit is reached (I have read that modern kernels may turn on dir_nlink automatically, but let's ignore this to keep the discussion simple);
  4. If the ext4 file system is formatted with dir_nlink, then when there are more than 65000 items in a directory, the value 1 will be written to the st_nlink field to indicate "unknown number of hard links". Clients (the code using the ext4 file system) must traverse the list of files (the data block of the directory) to count the actual number of items inside it;
  5. large_dir increases the maximum size of directories;
  6. If dir_index is used, large_dir increases the maximum height of the hashed B-Tree.
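
Points 1 and 2 above can be checked empirically. The following is a minimal Python sketch, not an authoritative test: the function names observe_dir_nlink and describe are made up for illustration, and the numbers it reports depend on the file system that holds the temporary directory (some file systems do not maintain directory link counts in the traditional way). It records st_nlink for a scratch directory when it is empty, after adding one regular file, and after adding one subdirectory.

```python
import os
import tempfile


def observe_dir_nlink():
    """Return st_nlink of a scratch directory at three stages (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "d")
        os.mkdir(target)
        counts = {"empty": os.stat(target).st_nlink}

        # Stage 2: add one regular file to the directory.
        with open(os.path.join(target, "f"), "w"):
            pass
        counts["one_file"] = os.stat(target).st_nlink

        # Stage 3: add one subdirectory next to the file.
        os.mkdir(os.path.join(target, "sub"))
        counts["one_file_one_subdir"] = os.stat(target).st_nlink
        return counts


def describe(nlink):
    """Interpret a directory link count the way the dir_nlink excerpt describes it."""
    if nlink == 1:
        # Per the man page text, 1 means the real number of links is not known.
        return "link count not tracked (unknown)"
    return f"{nlink} links"


if __name__ == "__main__":
    for stage, nlink in observe_dir_nlink().items():
        print(f"{stage}: {describe(nlink)}")
```

Comparing the "one_file" and "one_file_one_subdir" values against the "empty" value shows which kinds of entries actually change the link count on the file system being tested.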

My Questions

  1. If dir_nlink alone is used, what is the maximum size of a directory? I used to think that it was unlimited as long as there are enough data blocks to store the list of files in the directory, but now it seems that it is not.
  2. I think using large_dir alone does not make sense, since the directory size is still capped by 65000. Am I correct?
  3. If large_dir is used, what is the maximum size of a directory?
  4. Is there any disadvantage to using large_dir? I ask because in Ubuntu 20.04 LTS, dir_nlink is enabled by default (see /etc/mke2fs.conf, where it appears in the list at [fs_types] > ext4 > features), but large_dir is not.
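
To see which of these features an existing file system has, one option is to read the "Filesystem features:" line printed by tune2fs -l (or dumpe2fs -h). Below is a small Python sketch that parses such output. The function name parse_features is hypothetical, and SAMPLE is hand-written illustrative text, not output captured from a real system.

```python
def parse_features(listing):
    """Return the set of feature names from tune2fs -l style text (hypothetical helper)."""
    for line in listing.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            # Feature names are separated by whitespace on this line.
            return set(value.split())
    return set()


# Illustrative input only; real output contains many more fields.
SAMPLE = """\
Filesystem volume name:   <none>
Filesystem features:      has_journal ext_attr dir_index filetype extent dir_nlink
"""

if __name__ == "__main__":
    features = parse_features(SAMPLE)
    for name in ("dir_index", "dir_nlink", "large_dir"):
        state = "enabled" if name in features else "not enabled"
        print(f"{name}: {state}")
```

On a real system the text would come from running tune2fs -l against the block device, which normally requires root privileges.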
