I have a list of files in a bash array, and I want to sort them by modification date, with the oldest file first in the array. I don't want to modify the original array; the sorted result should go into a different array. Following another thread, I tried the command below, adapted because my list is in an array variable rather than in a file:

new_array=( $(ls -t $(printf '%s\n' "${array_list[@]}")) )

However, the array is so large that ls reports that the argument list is "too long".

Is there another way I can sort the main array by the modified date, starting with the oldest file at the beginning, and save the results to a different array?

  • Advice to newcomers: If an answer solves your problem, please accept it by clicking the large check mark (✓) next to it and optionally also up-vote it (up-voting requires at least 15 reputation points). If you found other answers helpful, please up-vote them. Accepting and up-voting helps future readers. – Gilles Quénot Mar 05 '23 at 19:41

2 Answers

3

With newer versions of GNU ls (coreutils 9.0 or later, for --zero) and bash (4.4 or later, for readarray -d), you can do:

readarray -td '' new_array < <(
  ls --zero -dt -- "${array_list[@]}")

That doesn't bypass the execve() limit on the combined length of arguments and environment, though, since ls is still executed with the whole list as arguments. Instead, you can pass the list to a function or to the builtin printf, neither of which is run via execve(), and pipe the output into a command that reads the list from stdin rather than from its arguments.
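To see the difference between an executed command and a builtin, here is a minimal sketch. The array big and its paths are made up for illustration, and the assumption is that its total size (several megabytes) exceeds the limit reported by getconf ARG_MAX, which is the case on typical Linux setups but not guaranteed:

```shell
# Hypothetical demo array (not from the question): 200000 fairly long paths.
big=()
for ((i = 0; i < 200000; i++)); do
  big+=("/some/fairly/long/directory/name/file-$i.txt")
done

# Limit (in bytes) that execve() places on arguments plus environment.
getconf ARG_MAX

# printf is a bash builtin: it runs inside the shell, no execve() is involved,
# so the whole list can be written to the pipe regardless of that limit.
# Counting the NUL delimiters confirms that every element went through.
count=$(printf '%s\0' "${big[@]}" | tr -cd '\0' | wc -c)
echo "$count"
```

Only tr and wc are executed here, and both receive short argument lists; the long list travels through the pipe.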

With recent versions of GNU find (and GNU sort and cut):

print0() { [ "$#" -eq 0 ] || printf '%s\0' "$@"; }
readarray -td '' new_array < <(
  print0 "${array_list[@]}" |
    find -files0-from - -prune -printf '%T@\t%p\0' |
      sort -rzn |
      cut -zf2-)

Or with GNU stat and assuming none of the array elements are -:

print0() { [ "$#" -eq 0 ] || printf '%s\0' "$@"; }
readarray -td '' new_array < <(
  print0 "${array_list[@]}" |
    xargs -r0 stat --printf='%.Y\t%n\0' -- |
      sort -rzn |
      cut -zf2-)
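Note that sort -rzn puts the newest file first, like ls -t. For the oldest-first order the question asks for, dropping -r should be enough. A minimal sketch with GNU stat, sort and cut, using made-up demo files (the directory and file names are assumptions for illustration); LC_ALL=C is added so that sort parses the decimal point the same way in every locale, and cut uses -f2- so that a tab inside a file name is kept:

```shell
print0() { [ "$#" -eq 0 ] || printf '%s\0' "$@"; }

# Hypothetical demo data: three files with known modification times.
dir=$(mktemp -d)
touch -d '2020-01-01' "$dir/old"
touch -d '2021-01-01' "$dir/mid file"
touch -d '2022-01-01' "$dir/new"
array_list=("$dir/new" "$dir/old" "$dir/mid file")

# Emit "mtime<TAB>name" NUL-delimited records, sort them numerically
# in ascending order (oldest first), then strip the mtime field.
readarray -td '' new_array < <(
  print0 "${array_list[@]}" |
    xargs -r0 stat --printf='%.Y\t%n\0' -- |
    LC_ALL=C sort -zn |
    cut -zf2-)

printf '%s\n' "${new_array[@]}"
rm -rf -- "$dir"
```

array_list is left untouched; only new_array receives the sorted names.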

zsh has a stat builtin (which predates the GNU one). So you can do it directly there with something like:

zmodload zsh/stat
typeset -A mtime
stat -nLA mtime -F %s%9. +mtime -- $array_list
new_array=( /(Ne['reply=($array_list)']nOe['REPLY=$mtime[$REPLY]']) )

This builds an associative array, $mtime, that maps each file to its mtime, then sorts the list using the Oe glob qualifier together with n (for numerical sorting).

  • @GillesQuénot, the -- is not needed in printf -- '%s\0' "$@" as the format doesn't start with - and no printf implementation is dumb enough to accept options after non-options (like the format here). It would be needed with some printf implementations in printf -- '- %s' *.txt or in printf -- "$format" "$@" where we don't know whether $format may start with - or not. It's needed in zsh's print equivalent: print -rN -- "$@" (where omitting it would even introduce a command injection vulnerability) – Stéphane Chazelas Mar 13 '23 at 06:41
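As a small illustration of the case described in that comment, here is a sketch in bash where the format is held in a variable and happens to start with -. The variable name format and the sample arguments are made up for the example:

```shell
# Hypothetical format that starts with "-"; without "--", a printf
# implementation may try to parse it as an option.
format='- %s\n'

# "--" marks the end of options, so the format is taken literally.
out=$(printf -- "$format" one two)
printf '%s\n' "$out"
```

With a fixed format such as '%s\0', the -- makes no difference, which is the point of the comment.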
1

Using a Perl one-liner and readarray:

readarray -td '' new_array < <( 
    perl -l0e '
        print join "\0",
        sort { -M $a <=> -M $b }
        grep -f, @ARGV
    '  -- "${array_list[@]}"
)

Credits to Stéphane Chazelas for readarray

  • Only zsh can do IFS=$'\0'. In bash, that's the same as IFS=, so no splitting. Also, in bash, unquoted $(...) does splitting – Stéphane Chazelas Feb 20 '23 at 19:57
  • In bash, use readarray -td '' x < <(perl -l0 -e 'print for sort...' -- "${y[@]}") (also note the -- to avoid command injection vulnerabilities). – Stéphane Chazelas Feb 20 '23 at 19:59
  • Post edited accordingly, thanks – Gilles Quénot Feb 20 '23 at 20:07
  • Related: https://stackoverflow.com/a/75513104/465183 – Gilles Quénot Feb 20 '23 at 21:33
  • This is subject to ARG_MAX - it would be better to pipe into perl with printf '%s\0' (as in Stéphane's answer). Also there's a reason why an earlier answer in the SO question you linked to uses map to build a hash to get the mtime for each filename - it's far cheaper to run stat (even perl's built-in stat function or -M) once per filename rather than twice per comparison in sort(). – cas Feb 22 '23 at 06:23
  • e.g. something like: readarray -d '' -t new_array < <(printf -- '%s\0' "${array_list[@]}" | perl -0e 'my %files = map { chomp; $_ => (stat($_))[9] } (<>); print join "\0", sort { $files{$b} <=> $files{$a} } keys %files; print "\0"'). Or use Stéphane's print0 shell function in case array_list is empty. – cas Feb 22 '23 at 06:31
  • or, without chomp-ing and join (which I was only using while testing to get newline separated output): readarray -d '' -t new_array < <(printf -- '%s\0' "${array_list[@]}" | perl -0e 'my %hash = map { $_ => (stat($_))[9] } (<>); print sort { $hash{$a} <=> $hash{$b} } keys %hash') – cas Feb 22 '23 at 06:39