What is the least expensive way to find the oldest file in a directory, including all the directories underneath it? Assume the directory is backed by a SAN and is under heavy load.

There is a concern that "ls" could be taking locks and causing system degradation under heavy load.

Edit: find performs very well in a simple test case: finding the oldest file amongst 400 GB of files on an SSD drive took 1/20 of a second. But this is a MacBook Pro laptop under no load... so it's a bit of an apples-to-oranges test case.

And, as an aside, what is the best way to find out the implementations (underlying algorithms) of such commands?

2 Answers

With zsh:

oldest=(**/*(.DOm[1]))

This gives the oldest regular file by modification time (zsh's time resolution here is one second).
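
If the glob qualifier syntax is unfamiliar, here is the same command annotated; the final print line is added here only to show the result:

# **/*        match everything in the current directory and all subdirectories
# (.DOm[1])   glob qualifiers:
#   .           plain (regular) files only
#   D           include dot files and descend into dot directories
#   Om          order by modification time, oldest first
#   [1]         keep only the first (oldest) match
oldest=(**/*(.DOm[1]))
print -r -- $oldest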

With GNU tools:

(export LC_ALL=C
 find . -type f -printf '%T@\t%p\0' |
   sort -zg | tr '\0\n' '\n\0' | head -n 1 |
   cut -f2- | tr '\0' '\n')
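
The tr juggling is there so that head and cut can work on NUL-delimited records, which keeps the pipeline safe for filenames containing newlines. If you can assume your filenames are newline-free, a simpler sketch along the same lines (still assuming GNU find and sort) is:

# Assumes no newlines in filenames. %T@ prints the modification time in
# seconds since the epoch, so a numeric sort puts the oldest file first;
# cut then strips the timestamp column.
find . -type f -printf '%T@\t%p\n' | sort -g | head -n 1 | cut -f2-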

To minimize the number of external processes, you may be able to optimize by running a custom script instead of a proper find. The directory traversal and stat() of each file cannot be optimized away, but you only need to keep the oldest file so far in memory.

Here is an attempt in Perl:

find2perl -eval 'BEGIN { our ($filename, $oldest); }
    my @s=stat(_); if (! defined $::oldest || $s[9] < $::oldest) {
        $::oldest=$s[9]; $::filename = $File::Find::name }
    END { print "$::filename\n" }' | perl

In my tests, on a moderately large directory (129019 nodes), this is actually about 50% slower than @StephaneChazelas' "GNU tools" version, but you may find that it works better in some scenarios, especially for really large directories.
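
If you want to run a similar comparison on your own tree, something like the following should work. The commands are the ones from the answers above; /san/data is a hypothetical starting directory, and bash's time keyword reports the elapsed time of each whole pipeline:

# Time the "GNU tools" pipeline (replace /san/data with your directory).
time (export LC_ALL=C
 find /san/data -type f -printf '%T@\t%p\0' |
   sort -zg | tr '\0\n' '\n\0' | head -n 1 |
   cut -f2- | tr '\0' '\n')

# Time the find2perl variant on the same directory.
time find2perl /san/data -eval 'BEGIN { our ($filename, $oldest); }
    my @s=stat(_); if (! defined $::oldest || $s[9] < $::oldest) {
        $::oldest=$s[9]; $::filename = $File::Find::name }
    END { print "$::filename\n" }' | perl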

tripleee
  • If you prefer Python, http://stackoverflow.com/questions/7541863/python-equivalent-of-find2perl has some hints. – tripleee Jul 18 '13 at 07:19