Which filesystem will give me the best performance for both writing and reading small files?

Question

My company developed an application that watches a directory for XML files of <10 KB in size, reads each one in, sends its body as an API call to an external service, and then moves the file into a processed directory.

Due to the volume of files (roughly 2000/min), we were getting dreadful performance out of NTFS. We were nowhere near able to keep up with the processing.

I'm a Linux guy through and through, and in my experience Linux would handle this situation a lot better, especially with things like inotify, which are leaps and bounds ahead of the NTFS API. That's why I've ported the code to .NET Core to give it a shot.

At home I use XFS on my workstations and ZFS on my servers, so apart from those two and ext4, I have no real experience with any other filesystem.

So my question is: which filesystem (preferably in-tree) would be the most performant for this kind of workload?
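
To make candidates comparable, the workload described above (write a small file, read it back, move it to a processed directory) could be timed with a rough sketch like the following. The function name, directory names, file count, and payload are all illustrative assumptions, not part of the real application. It also does not call fsync and reads will mostly be served from the page cache, so it measures metadata and cached I/O cost rather than raw disk speed.

```python
import os
import tempfile
import time


def bench_small_files(base_dir, count=2000, size=8 * 1024):
    """Time the write, read, and move phases for `count` files of `size` bytes.

    `base_dir` should live on the filesystem under test (hypothetical helper).
    """
    incoming = os.path.join(base_dir, "incoming")
    processed = os.path.join(base_dir, "processed")
    os.makedirs(incoming, exist_ok=True)
    os.makedirs(processed, exist_ok=True)
    payload = b"x" * size  # stand-in for a small XML document
    timings = {}

    # Phase 1: create the small files.
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(incoming, f"{i}.xml"), "wb") as f:
            f.write(payload)
    timings["write"] = time.perf_counter() - start

    # Phase 2: read each file back in full.
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(incoming, f"{i}.xml"), "rb") as f:
            f.read()
    timings["read"] = time.perf_counter() - start

    # Phase 3: move each file into the processed directory (same filesystem).
    start = time.perf_counter()
    for i in range(count):
        os.rename(
            os.path.join(incoming, f"{i}.xml"),
            os.path.join(processed, f"{i}.xml"),
        )
    timings["move"] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Replace the temporary directory with a path on the filesystem to test.
    with tempfile.TemporaryDirectory() as d:
        for phase, seconds in bench_small_files(d).items():
            print(f"{phase}: {seconds:.3f}s")
```

Running the same script against a directory on each candidate filesystem gives a like-for-like comparison, though the results will also depend on the hypervisor and the underlying storage.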

What about the hardware (the drive, where your file system will reside)? — sudodus, Nov 04 '18 at 20:24
Currently it's on Hyper-V with a shared VHDX; it's going to move over to ESXi. I don't know the exact specs of the hypervisor. — user3861788, Nov 04 '18 at 20:33
If the data is small in relation to the amount of memory then I would look at tmpfs; basically a RAMdisk. It's non-persistent, but is great for temporary files. So code your primary loop to work from tmpfs, and then have a batch (once an hour?) move processed files to persistent storage (if it's needed). — Stephen Harris, Nov 04 '18 at 21:49
@ctrl-alt-delor It is probably nearly impossible for the OP to change his root filesystem. — peterh, Nov 04 '18 at 22:05
See https://unix.stackexchange.com/questions/28756/what-is-the-most-high-performance-linux-filesystem-for-storing-a-lot-of-small-fi — Panther, Nov 04 '18 at 22:28
@peterh if the OP could not change it, then they would not ask. — ctrl-alt-delor, Nov 05 '18 at 10:35
@StephenHarris what you are describing is caching. You will probably not be better at it than the kernel is: once you have read the data, it will all be in RAM (cached). It will only fall out of cache if the data set is large. — ctrl-alt-delor, Nov 05 '18 at 10:36
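
The tmpfs suggestion in the comments above (work from a RAM-backed directory, then periodically move processed files to persistent storage) might look roughly like this. It is a minimal sketch under stated assumptions: the function name and both directory paths are hypothetical, and anything still sitting in tmpfs is lost on a reboot or crash.

```python
import os
import shutil


def archive_processed(staging_dir, archive_dir):
    """Move every regular file from staging_dir to archive_dir.

    staging_dir is assumed to be on tmpfs and archive_dir on persistent
    storage; shutil.move falls back to copy-then-delete across filesystems.
    Returns the number of files moved.
    """
    os.makedirs(archive_dir, exist_ok=True)
    # Snapshot the names first so the directory is not modified mid-scan.
    with os.scandir(staging_dir) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    for name in names:
        shutil.move(
            os.path.join(staging_dir, name),
            os.path.join(archive_dir, name),
        )
    return len(names)
```

A scheduler (cron or a systemd timer, for example) could call something like `archive_processed("/dev/shm/processed", "/srv/archive")` once an hour; both paths here are placeholders.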


0 Answers