5

We have Red Hat Enterprise Linux 7.2, with an XFS file system.

From /etc/fstab:

/dev/mapper/vgCLU_HDP-root /                       xfs     defaults        0 0
UUID=7de1dc5c-b605-4a6f-bdf1-f1e869f6ffb9 /boot                   xfs     defaults        0 0
/dev/mapper/vgCLU_HDP-var /var                    xfs     defaults        0 0

The machines are used for Hadoop clusters.

I was wondering: what is the best file system for this purpose?

So which is better for machines used in a Hadoop cluster: ext4 or XFS?

yael
  • 13,106

2 Answers

7

This is addressed in this knowledge base article; the main consideration for you will be the support levels available: Ext4 is supported up to 50TB, XFS up to 500TB. For really big data, you’d probably end up looking at shared storage, which by default means GFS2 on RHEL 7, except that for Hadoop you’d use HDFS or GlusterFS.

For local storage on RHEL the default is XFS and you should generally use that unless you have specific reasons not to.
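If you want to confirm which filesystem a given mount actually uses, `stat` in filesystem mode reports it directly. A quick sketch, using `/` as an example path (on a default RHEL 7 install this prints `xfs`; substitute any mount point you care about):

```shell
# Print the filesystem type backing a path.
# "/" is only an example; pass any mount point instead.
stat -f -c %T /
```

`df -Th` shows the same information for every mounted filesystem at once.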

Stephen Kitt
  • 434,908
  • Let's say we have a Kafka machine with a 10TB disk. In that case, given that the disk is smaller than 50TB, and given that the machine runs Kafka, which filesystem is the best fit? – yael Nov 07 '18 at 07:45
  • “For local storage on RHEL the default is XFS and you should generally use that unless you have specific reasons not to.” – Stephen Kitt Nov 07 '18 at 07:49
  • As of 2022, ext4 can support volumes up to 1 exbibyte (EiB) and single files up to 16 tebibytes (TiB) with the standard 4 KiB block size. However, operating system support differs: Red Hat's maximum supported size for ext4 is 16TB in both Red Hat Enterprise Linux 5 and Red Hat Enterprise Linux 6, and 50TB in Red Hat Enterprise Linux 7, per Red Hat article 3129891 dated 9/4/2022. – ron Aug 04 '22 at 14:48
  • @ron yes, that’s the article linked in my answer. – Stephen Kitt Aug 04 '22 at 14:51
4

XFS is an amazing filesystem, especially for large files. If your load involves lots of small files, periodically cleaning up fragmentation may improve performance. I don't worry about it and use XFS for all loads. It is well supported, so there is no reason not to use it.

Set aside a machine and disk for your own testing of various filesystems if you want to find out what is best for your typical workload. Running the test load in steps across the entire disk can tell you something about how the filesystem under test behaves.
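A minimal sketch of such a test for a small-file workload. The TESTDIR variable is an assumption here: point it at a directory on the filesystem under test (it falls back to a temporary directory purely for illustration). A real benchmark would use a dedicated tool such as fio, with several runs and varied file sizes.

```shell
#!/bin/sh
# Time the creation of many small files on the target filesystem.
# TESTDIR is an assumed variable naming a directory on the filesystem
# under test; it defaults to a temp dir only so the sketch is runnable.
TESTDIR="${TESTDIR:-$(mktemp -d)}"
start=$(date +%s)
i=0
while [ "$i" -lt 1000 ]; do
    printf 'data\n' > "$TESTDIR/file_$i"
    i=$((i + 1))
done
sync   # flush so the timing includes actually hitting the disk
end=$(date +%s)
count=$(ls "$TESTDIR" | wc -l)
echo "created $count files in $((end - start))s"
rm -rf "$TESTDIR"
```

Repeating the run as the disk fills up gives a rough picture of how each filesystem degrades under fragmentation.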

Testing your load on your machine is the only way to be sure.