0

I want to find a list of directories, then search for a given file inside them, then choose the most recent one.

This is what I tried:

find /Temp -type d -name Ast 2>/dev/null | while read Dir; do find $Dir -type f -name Pagination.json 2>/dev/null -exec ls -lt {} +; done

This does find the intended files, but they come out in ascending order.

This is the result of that command:

-rw-r--r-- 1 root root 46667 Sep 12 18:10 /Temp/ProjectOne/Site/Ast/BaseComponents/Pagination.json
-rw-r--r-- 1 root root 46667 Sep 13 09:31 /Temp/ProjectTwo/Site/Ast/BaseComponents/Pagination.json

In this case, I need the second item. What should I do?

  • Why the double find? You could do find /Temp -path '*/Ast/*' -type f -name 'Pagination.json' -exec ... – muru Sep 13 '23 at 09:33
  • Once you do that simplification, you could use the command in https://unix.stackexchange.com/a/282904/70524 (omitting the xargs grep) to get the most recently modified file – muru Sep 13 '23 at 09:43
  • @muru, your find simplification really helped. But that link is harder than a simple ls -lt. – Saeed Neamati Sep 13 '23 at 09:50
  • Maybe, but parsing ls -l output is still not that simple even with new versions of GNU ls. https://unix.stackexchange.com/q/754193/70524 ... And that reminds me that I had posted basically the same find | sort | sed thing years ago: https://unix.stackexchange.com/a/198050/70524 – muru Sep 13 '23 at 09:56
  • @muru, you're right. But in my case I have the guarantee that no whitespace in path or filename exists. In fact, every character in path and file name is ASCII. – Saeed Neamati Sep 13 '23 at 10:01
  • It's not just that, the date format can also change unless you use --time-style like in that post, and even then you have to be sure that ls won't be invoked multiple times (because there were too many arguments, say), in which case ls would sort one set of files, then sort the next set and so on, and so a simple tail -n 1 wouldn't be sufficient. Also space is ASCII. – muru Sep 13 '23 at 10:10
  • @muru, yes you're right. I'm speechless. – Saeed Neamati Sep 13 '23 at 10:12
  • Ignoring the other problems (and inefficiencies) in your command, it sounds like you could have solved your problem just by adding r to the options to ls. – Henrik supports the community Sep 13 '23 at 13:00
  • A problem with your command that I don't see mentioned yet is that it won't work if there are too many files matching, as you will then end up with multiple invocations of ls that will each sort its own output, but the output from the different invocations won't be merged, so you'll effectively just get the files in an arbitrary order. – Henrik supports the community Sep 13 '23 at 13:03

2 Answers

2

The way I'd approach this is using the shell's own file-finding abilities to find all candidates, then keep the newest one:

#!/bin/bash

# enable ** as recursive glob, don't fail when null matches are found, and
# also look into things starting with .
shopt -s globstar nullglob dotglob

newestmod=0
for candidate in **/Ast/**/Pagination.json ; do
  # check file type:
  [[ -f ${candidate} ]] || continue
  [[ -L ${candidate} ]] && continue
  # Get modification time in seconds since epoch without fractional
  # part. Assumes GNU stat or compatible.
  thisdate=$(stat -c '%Y' -- "${candidate}")

  # if older than the newest found, skip
  [[ ${thisdate} -lt ${newestmod} ]] && continue

  newestmod=${thisdate}
  newestfile="${candidate}"
done

if (( newestmod )); then
  printf 'Newest file: "%s"\n' "${newestfile}"
fi

or such.
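As a variation on the loop above, bash's -nt ("newer than") test can compare two files directly, which removes the dependency on GNU stat. This is only a sketch under assumptions: the function name newest_by_nt and the throwaway demo tree are made up for illustration, and bash 4 or later is assumed for globstar.

```shell
#!/bin/bash
# Sketch: pick the newest match with the shell's -nt test instead of stat.
# Assumes bash 4+ (globstar); newest_by_nt is a hypothetical helper name.
shopt -s globstar nullglob dotglob

newest_by_nt() {
  local candidate newestfile=
  for candidate in "$1"/**/Ast/**/Pagination.json; do
    # keep regular files only, skip symlinks
    [[ -f ${candidate} && ! -L ${candidate} ]] || continue
    # first candidate, or strictly newer than the best one so far
    if [[ -z ${newestfile} || ${candidate} -nt ${newestfile} ]]; then
      newestfile=${candidate}
    fi
  done
  if [[ -n ${newestfile} ]]; then
    printf '%s\n' "${newestfile}"
  fi
}

# Demo on a temporary tree that mirrors the layout in the question.
demo_nt=$(mktemp -d)
mkdir -p "$demo_nt/ProjectOne/Site/Ast/BaseComponents" \
         "$demo_nt/ProjectTwo/Site/Ast/BaseComponents"
touch -t 202309121810 "$demo_nt/ProjectOne/Site/Ast/BaseComponents/Pagination.json"
touch -t 202309130931 "$demo_nt/ProjectTwo/Site/Ast/BaseComponents/Pagination.json"

newest_nt=$(newest_by_nt "$demo_nt")
printf 'Newest file: "%s"\n' "${newest_nt}"
rm -rf -- "$demo_nt"
```

Unlike the stat-based loop, this keeps the first file seen when two candidates have the same modification time.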

In zsh, that whole thing becomes a bit less complicated and supports subsecond precision in timestamps:

#!/usr/bin/zsh

# get the list of regular (.) files, ordered by modification date
allcandidates=(**/Ast/**/Pagination.json(ND.om))
if (( $#allcandidates )) print -r Newest file: $allcandidates[1]

Or just:

print -r Newest file: **/Ast/**/Pagination.json(D.om[1])

Beware that while **/ in zsh and bash 5.0+ doesn't follow symlinks when recursively traversing the directory tree, the Ast/ part would traverse symlinks. If that were a concern, in zsh, you could work around it with:

set -o extendedglob
print -r Newest file: ./**/Pagination.json~^*/Ast/*(D.om[1])

Where ./**/Pagination.json finds all the files without traversing symlinks but ~pattern removes the paths that match the pattern, here the ones that don't (^) contain /Ast/.

2

It changes a little depending upon what you precisely mean by "most recent", but wherever the GNU implementation of find is available, and if we know there are no file paths containing newline characters, I would use something like:

find /tmp -type f -printf "%T@ %p\n" | sort -rn

(Adjust /tmp -type f to find the files you actually care about; it will probably be something like find /Temp -path '*/Ast/*' -type f -name 'Pagination.json'.) The interesting part is %T@, which prints the file's last modification time in seconds since the epoch, and that is quite easy to sort by.
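To go from the sorted listing to just the newest path, the pipeline can be extended with head and cut. A minimal sketch, assuming GNU find and paths without newline characters; the function name newest_pagination and the temporary demo tree are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the most recently modified Pagination.json
# found under any Ast directory below the given root.
# Assumes GNU find (-printf) and paths without newline characters.
newest_pagination() {
  find "$1" -path '*/Ast/*' -type f -name 'Pagination.json' \
       -printf '%T@ %p\n' |
    sort -rn |        # largest epoch time (newest) first
    head -n 1 |       # keep only the newest entry
    cut -d ' ' -f 2-  # drop the timestamp, keep the path
}

# Demo on a temporary tree that mirrors the layout in the question.
demo=$(mktemp -d)
mkdir -p "$demo/ProjectOne/Site/Ast/BaseComponents" \
         "$demo/ProjectTwo/Site/Ast/BaseComponents"
touch -t 202309121810 "$demo/ProjectOne/Site/Ast/BaseComponents/Pagination.json"
touch -t 202309130931 "$demo/ProjectTwo/Site/Ast/BaseComponents/Pagination.json"

newest=$(newest_pagination "$demo")
printf 'Newest file: %s\n' "${newest}"
rm -rf -- "$demo"
```

Because find emits every match into a single sort, the result does not depend on how many files match, which was the weakness of the ls-based approach.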

  • Stéphane correctly edited the answer to say that you need GNU find, and that my command would not work if there were file paths with newline characters in them. That's correct and I mostly don't care about those, but if that is a problem, you can substitute \0 for \n in the printf format, and add a z option to sort – Henrik supports the community Sep 14 '23 at 08:27
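The NUL-delimited variant described in that comment could look roughly like this. It is a sketch that assumes GNU find, sort, head and cut (the -z options are GNU extensions); the function name newest_pagination_z and the demo tree, which deliberately contains a newline in a directory name, are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: newline-safe variant using NUL-delimited records.
# Assumes GNU find, sort, head and cut (for -printf and the -z options).
newest_pagination_z() {
  find "$1" -path '*/Ast/*' -type f -name 'Pagination.json' \
       -printf '%T@ %p\0' |
    sort -zrn |          # NUL-delimited records, newest first
    head -zn 1 |         # keep only the first record
    cut -zd ' ' -f 2- |  # drop the timestamp, keep the path
    tr -d '\0'           # strip the trailing NUL before capturing
}

# Demo: one directory name contains a newline character.
nl='
'
demo_z=$(mktemp -d)
mkdir -p "$demo_z/Old/Ast" "$demo_z/New${nl}Line/Ast"
touch -t 202309121810 "$demo_z/Old/Ast/Pagination.json"
touch -t 202309130931 "$demo_z/New${nl}Line/Ast/Pagination.json"

newest_z=$(newest_pagination_z "$demo_z")
printf 'Newest file: %s\n' "${newest_z}"
rm -rf -- "$demo_z"
```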