6

From some foggy memories I thought I would "improve" the default settings when creating a Linux partition and increased the inode size to 1024, and also turned on -O bigalloc ("This ext4 feature enables clustered block allocation").

Now, though, I can't find any concrete benefits to these settings cited on the net, and I see that with 20% disk usage I'm already using 15% of the inodes.

So should I simply reformat the partition, or is there an upside to these settings (or something I can use as justification), e.g. for directories with lots of files?

Anul
  • 175

4 Answers

6

Larger inodes are useful if you have many files with a large amount of metadata. The smallest inode size has room for classical metadata: permissions, timestamps, etc., as well as the address of a few blocks for regular files, or the target of short symbolic links. Larger inodes can store extended attributes such as access control lists and SELinux contexts. If there is not enough room for the extended attributes in the inode, they have to be stored in a separate block, which makes opening the file or reading its metadata slower.

Hence you should use a larger inode size if you're planning on having large amounts of extended attributes such as complex ACLs, or if you're using SELinux. SELinux is the primary motivation for larger inodes.
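To get a feel for how much room a given inode size leaves for extended attributes, here is a minimal sketch. It assumes the classic inode fields take 128 bytes and that the fixed extra fields (`extra_isize`) take 32 bytes. Both figures are assumptions for illustration and not something stated above, so check your own filesystem with `dumpe2fs -h` or `debugfs` before relying on them.

```python
# Rough estimate of the in-inode space left over for extended attributes.
# Assumptions (not from the answer): the classic inode fields occupy
# 128 bytes, and extra_isize (fixed extra fields after them) is 32 bytes.
BASE_INODE_BYTES = 128


def xattr_space(inode_size: int, extra_isize: int = 32) -> int:
    """Bytes left in the inode after the fixed fields (never negative)."""
    return max(inode_size - BASE_INODE_BYTES - extra_isize, 0)


for size in (128, 256, 1024):
    print(size, xattr_space(size))
```

Under these assumptions a 1024-byte inode leaves several times more room than a 256-byte one, which only matters if the attributes would otherwise spill into a separate block.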

Stephen Kitt
  • 434,908
  • to the point... – Anul Jun 04 '15 at 01:22
  • 1
    a larger inode size would also increase the largest file for which no indirect blocks are required... right? – Anul Jun 04 '15 at 01:23
  • 2
    I think that's possible with very recent kernel versions. Also you can have file data directly in the inode, which can provide a significant benefit for very small files (that fit entirely in the inode). I'm not sure, I haven't kept up recently. – Gilles 'SO- stop being evil' Jun 04 '15 at 11:33
  • 1
    inline_data. It's not yet enabled, at least in Ubuntu releases – Anul Jun 04 '15 at 13:12
2

Larger inode size can help performance for very large files/dirs at the expense of disk usage (and possibly performance for small files).

The bytes-per-inode ratio is what you want to take a closer look at if you feel your inode usage is too high. There are many related Q&As on several StackExchange sites.

  • Why would there be a performance penalty for small files? – Anul Jun 02 '15 at 14:42
  • More data to process for them (going through 1024 bytes takes longer than going through 256 bytes) - just a thought; I'm not 100% certain the innards of the FS are able to process an inode in exactly the same time regardless of the inode size. – Dan Cornilescu Jun 02 '15 at 14:49
  • ah, of course. But that would apply to all files, big and small. – Anul Jun 02 '15 at 15:07
  • Yes, but for the large files having to process fewer inodes would easily offset that - overall better performance. – Dan Cornilescu Jun 02 '15 at 15:16
  • Isn't there one single inode (but multiple blocks) per file...? As shown by ls -i...? – Anul Jun 02 '15 at 16:32
  • Inodes are 'chained' when a single one is insufficient to carry the references to all the file's blocks. That's why a 1024-byte inode will do better than 4x256 chained inodes (possibly coming from different IO blocks, thus requiring multiple disk accesses) for big files. – Dan Cornilescu Jun 02 '15 at 16:38
  • Per http://en.wikipedia.org/wiki/Inode_pointer_structure it would seem there are 12 direct block pointers, 1 indirect, 1 doubly indirect, 1 triply indirect... Perhaps a larger inode allows more direct pointers. Now with bigalloc, the unit is not blocks, but rather 'clusters' (by default =16 blocks). I'm not sure how all this sums up – Anul Jun 02 '15 at 16:52
  • ... but all those block pointers are contained in a single inode – Anul Jun 02 '15 at 17:30
  • Ah, you're right - the chaining I had in mind is to indirect blocks, not inodes. Yes - bigger inodes reduce the indirections. – Dan Cornilescu Jun 02 '15 at 17:41
  • ... but so do bigger 'clusters' (with bigalloc, the unit is not a 4k block, but user-sized 'cluster'). Maybe I should leave inode size alone and specify bigger clusters (with mkfs.ext4 -C) – Anul Jun 02 '15 at 17:46
  • The documentation suggests it's intended for filesystems with mostly huge files, from which I assume it may come with a penalty for small files: https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#Bigalloc. Wouldn't a cluster require multiple disk accesses instead of a single one for a block? I'll stop here as I'm already guessworking - I'm not familiar with bigalloc :) – Dan Cornilescu Jun 02 '15 at 18:12
  • I'm running some experiments... 1st conclusion: large cluster sizes waste LOTS of space. My 20G filesystem had some 300k files, which would occupy 150G even if I increased cluster size only fourfold. I haven't tested performance yet. – Anul Jun 02 '15 at 18:40
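The space cost of larger clusters discussed in these comments can be estimated with a small sketch. It assumes every non-empty file is rounded up to a whole number of allocation units and ignores directories, metadata and inline data. The file sizes are made up for illustration and are not the measurements from the experiment above.

```python
def allocated_bytes(file_sizes, unit):
    """Total space used when each file is rounded up to whole allocation units."""
    # -(-size // unit) is ceiling division using integers only.
    return sum(-(-size // unit) * unit for size in file_sizes)


# Hypothetical mix: many small files and a couple of large ones.
sizes = [2_000] * 1000 + [50_000_000] * 2
BLOCK = 4096  # 4k block, as mentioned in the comments

for blocks_per_cluster in (1, 4, 16):
    unit = BLOCK * blocks_per_cluster
    print(blocks_per_cluster, allocated_bytes(sizes, unit))
```

With mostly small files, the total grows almost linearly with the cluster size, because each small file still occupies one full cluster.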
1

20% disk usage vs. 15% inode usage is not too bad. 20% disk usage vs. 100% inode usage would be a problem. The question is whether you will reach 100% inode usage before 100% disk usage; that's when you need more inodes.

It very much depends on the way you use your filesystem. For example if it's a partition that only holds photos or videos or similar files of consistent size, you probably don't have anything to worry about.

If your usage is random and you're likely to extract a few kernel source tarballs in the future, your current ratio might not hold...
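If the current ratio did hold, a simple linear projection shows where inode usage would end up on a full disk. This is only a sketch: it assumes future files look like the existing ones, which is exactly the assumption that breaks when many small files arrive.

```python
def inode_use_at_full_disk(disk_used_pct: float, inode_used_pct: float) -> float:
    """Linear projection of inode usage (%) once disk usage reaches 100%."""
    return inode_used_pct / disk_used_pct * 100


# The figures from the question: 20% disk used, 15% of inodes used.
print(inode_use_at_full_disk(20, 15))  # 75.0
```

A result below 100 means the disk fills up first; a result above 100 means the inodes run out first.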

Performance-wise you probably won't notice a difference under normal circumstances, as long as you don't have an application that pushes limits, like a database that's hot 24/7 where even minor optimizations pay off.

frostschutz
  • 48,978
  • You mention kernel tarballs because... they would contain lots of small .h files? – Anul Jun 02 '15 at 14:44
  • Yes, well, it's about 50k inodes... and more if you actually compile it. It's usually the reason why I run out of inodes, but then not everyone compiles their own custom kernel. ;) – frostschutz Jun 02 '15 at 14:55
  • A kernel tarball won't contain many small files -- only one. Extracting the tarball is another story. Here's what you can do: run dumpe2fs to get the number of free inodes and blocks, extract the tarball, then run dumpe2fs again and compare the output: dumpe2fs -h /dev/disk-whatever | awk '/^Free /'. You're interested in block-size * (free-blocks-start - free-blocks-end) / (free-inodes-start - free-inodes-end). If that ratio is smaller than your bytes-per-inode allocation, you might run into trouble eventually. – Otheus Jun 03 '15 at 08:31
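The before/after comparison described in the last comment can be sketched as follows. The two snapshots are made-up strings in the shape of `dumpe2fs -h` output and are not real measurements. The function names are hypothetical helpers, and the sketch does not account for bigalloc clusters.

```python
import re


def parse_header(text: str) -> dict:
    """Extract 'Free blocks', 'Free inodes' and 'Block size' from dumpe2fs -h output."""
    fields = {}
    for key in ("Free blocks", "Free inodes", "Block size"):
        match = re.search(rf"^{key}:\s+(\d+)", text, re.MULTILINE)
        if match is None:
            raise ValueError(f"missing field: {key}")
        fields[key] = int(match.group(1))
    return fields


def bytes_per_inode_consumed(before: str, after: str) -> float:
    """block-size * (blocks consumed) / (inodes consumed) between two snapshots."""
    start, end = parse_header(before), parse_header(after)
    blocks = start["Free blocks"] - end["Free blocks"]
    inodes = start["Free inodes"] - end["Free inodes"]
    if inodes <= 0:
        raise ValueError("no inodes were consumed between the two snapshots")
    return start["Block size"] * blocks / inodes


# Made-up snapshots for illustration only.
before = "Free blocks:              4000000\nFree inodes:              1000000\nBlock size:               4096\n"
after = "Free blocks:              3800000\nFree inodes:              950000\nBlock size:               4096\n"
print(bytes_per_inode_consumed(before, after))
```

If the printed value is smaller than the bytes-per-inode ratio the filesystem was created with, that workload consumes inodes faster than disk space.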
0

Each file is guaranteed to consume at least 1 inode, and more if the files get sufficiently large. In theory, if your partition is going to be made up of lots of large files, you need fewer inodes. Fewer inodes means more disk space for data. Typical applications are partitions for databases in which the database holds most of the data in several large files (e.g. Oracle, MySQL with InnoDB).

When you say "bigalloc" I suppose you mean one of the presets in /etc/mke2fs.conf, such as "big", which in CentOS 7 sets the inode_ratio to 32768? I'm not sure, but I think that is essentially the value passed to mke2fs's -i (bytes-per-inode) option. If these assumptions are correct, then yes, a reformat and copy will be required:

The larger the bytes-per-inode ratio, the fewer inodes will be created. This value generally shouldn't be smaller than the blocksize of the filesystem, since in that case more inodes would be made than can ever be used. Be warned that it is not possible to change this ratio on a filesystem after it is created, so be careful deciding the correct value for this parameter. Note that resizing a filesystem changes the number of inodes to maintain this ratio.
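As a rough illustration of how the ratio and the inode size interact, the sketch below estimates the inode count and the space reserved for inode tables. It assumes the count is simply the filesystem size divided by bytes-per-inode. Real mke2fs rounds per block group, so treat the numbers as approximate. The 20 GiB size matches the filesystem mentioned elsewhere in the thread, and 16384 is only an illustrative second ratio next to the 32768 quoted above.

```python
def inode_table(fs_bytes: int, bytes_per_inode: int, inode_size: int):
    """Return (approximate inode count, bytes reserved for inode tables)."""
    count = fs_bytes // bytes_per_inode
    return count, count * inode_size


GIB = 1024 ** 3
fs = 20 * GIB

for ratio in (16384, 32768):
    for inode_size in (256, 1024):
        count, table = inode_table(fs, ratio, inode_size)
        # Columns: ratio, inode size, inode count, share of the disk used by inode tables
        print(ratio, inode_size, count, f"{table / fs:.2%}")
```

Quadrupling the inode size quadruples the space set aside for inode tables, while doubling the bytes-per-inode ratio halves it.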

Otheus
  • 6,138
  • No no, bigalloc as in mkfs.ext4 -O bigalloc, see man ext4 ("This ext4 feature enables clustered block allocation") – Anul Jun 02 '15 at 14:39
  • 1
    Are you sure there are multiple inodes per file? I see direct blocks, indirect blocks, doubly indirect blocks etc, but this is a tree structure rooted in a single inode: http://en.wikipedia.org/wiki/Inode_pointer_structure – Anul Jun 02 '15 at 16:59
  • Thank you. I was thinking that indirect blocks allocated additional inodes. – Otheus Jun 03 '15 at 08:21
  • None of my systems have a manpage for ext4, but they do for mkfs.ext4 and none of them mention bigalloc. – Otheus Jun 03 '15 at 08:32
  • I must have an unusual Linux! Oh wait, it's just the latest Ubuntu (15.04) :) – Anul Jun 03 '15 at 09:02
  • To me, anything published after 2007 is an unusual linux :P – Otheus Jun 03 '15 at 10:21
  • I'm drawing the line at systemd distros. – Anul Jun 03 '15 at 11:21