3

I'm writing a bash script and I need to create an array with the 10 most recent image files (from new to old) in the current dir.

I consider "image files" to be files with certain extensions, like .jpg or .png. I only need a few specific image types to be supported, and I can also express them in a single regex such as "\.(jpg|png)$".

My problem is, if I try to do this with e.g. $list=(ls -1t *.jpg *.png | head -10) the resulting list of files somehow becomes one element, instead of each filename being a separate element in my array.

If I try to use $list=(find -E . -iregex ".*(jpg|png)" -maxdepth 1 -type f | head -10), I'm not sure how to sort the list on date/time and keep only the filenames. Also find seems to put ./ in front of every file but I can get rid of that with sed. And also with find I still have the problem of my entire list becoming one entry in the $list array.

4 Answers

4

The correct syntax is:

list=($(ls -t *.jpg *.png | head -10))
echo "First element: ${list[0]}"
echo "Last element: ${list[9]}"

However, this solution breaks on file names containing spaces (or any other whitespace), because the unquoted command substitution is split into words on whitespace.
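To make the whitespace problem concrete, here is a minimal sketch that runs in a throwaway directory. The file names are made up for illustration, and it assumes bash plus an ls that prints one name per line when its output is a pipe.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two files on disk; one name (hypothetical) contains a space.
touch "holiday photo.jpg" plain.png

# Unquoted command substitution: the output is split on spaces, tabs and newlines.
list=($(ls -t *.jpg *.png | head -10))

echo "Files on disk:  2"
echo "Array elements: ${#list[@]}"   # "holiday" and "photo.jpg" end up as separate elements

# Clean up.
cd "$OLDPWD" && rm -rf "$tmpdir"
```

Two files produce three array elements, which is exactly the failure mode described above.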

FedKad
  • 1
    Thanks, I assume I can overcome the whitespace issue by doing something like IFS=$(echo -en "\n\b") first? – RocketNuts Apr 17 '19 at 13:05
  • But, still some files may contain new line and backspace characters. For example, you can use the following command to create such a file: touch "$(echo -en "xxx\nyyy\bzzz")" – FedKad Apr 18 '19 at 14:39
  • 1
    Why do people do such horrible things ;) But thanks, that's good enough for me. In fact I can also do with just \n as split character. Not sure why the backspace was in there, must have copy/pasted that from somewhere. – RocketNuts Apr 18 '19 at 20:43
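The newline-only splitting discussed in these comments could be sketched as follows. This is an illustration rather than a robust solution: it still fails for names that contain a newline, the file names and timestamps are invented, and the set -f step is an addition (not mentioned above) that stops the split words from being glob-expanded.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical files with explicit modification times (touch -t takes [[CC]YY]MMDDhhmm).
touch -t 201904171300 plain.png
touch -t 201904181400 "holiday photo.jpg"

# Capture the listing first, while globbing is still enabled.
output=$(ls -t -- *.jpg *.png | head -10)

old_ifs=$IFS
IFS=$'\n'     # split on newlines only, so spaces survive
set -f        # do not glob-expand the resulting words
list=($output)
set +f
IFS=$old_ifs

printf 'element: %s\n' "${list[@]}"

cd "$OLDPWD" && rm -rf "$tmpdir"
```

The file with the space in its name now stays in one piece, and the newest file comes first.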
3

For bash ≥ 4:

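To read the output of a command into an array line by line, use readarray with -t (which strips the trailing newline from each element):

readarray -t files < <(ls -1t *.jpg *.png | head -10)

... or mapfile, which is the same builtin under another name:

mapfile -t files < <(ls -1t *.jpg *.png | head -10)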

Otherwise, for older bash versions:

files=()
while IFS= read -r f; do
    files+=( "$f" )
done < <(ls -1t *.jpg *.png | head -10)
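The -t option matters more than it looks. As a small sketch (printf stands in for the ls pipeline, and the two names are placeholders), compare the elements with and without it:

```shell
#!/usr/bin/env bash
# Without -t, every element keeps its trailing newline.
mapfile    with_newlines < <(printf '%s\n' a.jpg b.png)

# With -t, the newline is stripped and each element is a bare file name.
mapfile -t stripped      < <(printf '%s\n' a.jpg b.png)

# Angle brackets make the stray newlines visible.
printf '<%s>' "${with_newlines[@]}"; echo
printf '<%s>' "${stripped[@]}"; echo
```

An element that still carries a newline will not match the real file name when passed to another command, so -t is usually what you want here.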



However, filenames are allowed to contain line breaks, so for reading filenames you should rather use find with a \0 (NUL) delimiter instead of ls -1, which separates names with \n:

files=()
while IFS= read -r -d $'\0' f; do
    files+=("$f")
done < <(
    find . -maxdepth 1 -type f \
      -regextype posix-extended -iregex ".*\.(jpg|png)$" \
      -printf '%T@\t%P\0' \
    | sort -nrz \
    | head -z -n 10 \
    | cut -z -f2-
)
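As a sketch of the same pipeline in action, the following runs it against a scratch directory that includes a name with an embedded newline. It assumes GNU find, sort, head and cut (for -printf and the -z options) and bash 4.4 or later, where mapfile accepts -d '' to read NUL-delimited input in place of the while loop. File names and timestamps are invented for the demonstration.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical files with known modification times.
touch -t 201904170900 old.png
touch -t 201904181000 $'line\nbreak.jpg'   # name contains a newline
touch -t 201904191100 newest.JPG           # upper-case extension
touch -t 201904201200 notes.txt            # not an image, must be skipped

# Same find/sort/head/cut chain, read NUL-delimited into an array.
mapfile -d '' -t files < <(
    find . -maxdepth 1 -type f \
      -regextype posix-extended -iregex '.*\.(jpg|png)$' \
      -printf '%T@\t%P\0' \
    | sort -nrz \
    | head -z -n 10 \
    | cut -z -f2-
)

declare -p files

cd "$OLDPWD" && rm -rf "$tmpdir"
```

The name with the line break arrives as a single array element, and the entries are ordered from newest to oldest.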
pLumo
1

If zsh is an option, then it's rather simpler:

set -o nocaseglob
array=( *.(png|jpg)(Om[-10,-1]) )

The set -o nocaseglob allows the simpler png|jpg to match variations in case, such as PNG or JpG.

The next statement assigns an array the results of a very specific filename generation (glob). From left to right:

  • *.(png|jpg) -- expands to the list of filenames that end with .jpg or .png, subject to the case-sensitivity option we enabled
  • (Om ...) -- a zsh "glob qualifier" that says to sort (Order) the files by modification time (oldest to newest)
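  • [-10,-1] -- a zsh subscript (slice) that takes the ten elements at the end (the ten most recent files); the array therefore runs from older to newer, and (om[1,10]) would give the newest-first order instead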

Once you can parse through the syntax, zsh makes handling these sorts of situations easier because the globbing / filename generation takes care of the filenames for you -- no worries about parsing ls. For example, with the "fun" filenames that I generated in my other answer, the results are:

$ print -l $array
4521.png
a?b.jpg
$( echo boom ).jpg
a*b.jpg
[x].jpg
X▒Y.jpg
single'quote.jpg
backslash.jpg
②.jpg
*.jpg

(the results varied slightly in sequencing because some files had the same timestamp).

Jeff Schaller
0

Instead of parsing ls, and if you can rely on the external stat utility and bash v4+ (for associative arrays), you could gather the list of files by inode, then gather a list of the most recent inodes, then build an array of filenames:

shopt -s nocaseglob extglob
declare -A filesbyinode=()
for f in *.@(jpg|png); do filesbyinode[$(stat -c %i "$f")]=$f; done
[ ${#filesbyinode[@]} -gt 0 ] || return
declare -a wantedfiles=()
for inodes in $(stat -c '%Y %i' *.@(jpg|png) | sort -k1,1rn | awk '{print $2}' | head -10)
do 
  wantedfiles+=("${filesbyinode[$inodes]}")
done
declare -p wantedfiles

The first step is to set two shell options:

  • nocaseglob -- this enables the wildcard jpg to also match JPG (and JpG and ...)
  • extglob -- this enables the use of @(jpg|png) which means: matching filenames can end in either jpg or png (subject to nocaseglob, above)
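A quick way to see what the two options do together, as a sketch in a scratch directory (the file names are made up):

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical files: mixed-case extensions plus two that should not match.
touch a.jpg B.JPG c.PnG d.txt e.jpeg

# extglob must be enabled on an earlier line than the pattern that uses @(...).
shopt -s nocaseglob extglob

matches=( *.@(jpg|png) )
printf 'matched: %s\n' "${matches[@]}"

cd "$OLDPWD" && rm -rf "$tmpdir"
```

Only the three names ending in some case variant of jpg or png are matched; d.txt and e.jpeg are left out.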

We then set up an empty associative array that indexes filenames by their inodes.

The subsequent for loop builds up the filesbyinode array with inode indexes (the result of the stat command) and filenames as values.

If there are no files, we bail out with a return -- adjust this as needed for your situation (perhaps an if/else).
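Since return only works inside a function or a sourced script, an if/else variant might look like the sketch below. The status variable and its messages are illustrative only, and the array is left empty to simulate a directory with no matching files.

```shell
#!/usr/bin/env bash
# Empty associative array, as it would be if no image files had matched.
declare -A filesbyinode=()

if (( ${#filesbyinode[@]} == 0 )); then
    status="no image files found"
else
    status="found ${#filesbyinode[@]} image file(s)"
    # ... continue with building the list of wanted files here ...
fi

echo "$status"
```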

We then declare a (regular) array to hold the files that we're interested in. The next for loop iterates over the 10 most recent inodes and adds the corresponding filenames to the array. The 10 most recent inodes are determined by expanding the same wildcard as before, but asking only for the modification time (in seconds since the epoch) and the inodes; after sorting by the modification time in field #1 (largest/most recent first), we peel out the inodes in field #2 with awk and grab the top 10 of those with head.
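The sort / awk / head step can be tried in isolation. In this sketch the "mtime inode" pairs are invented sample values standing in for the output of stat -c '%Y %i', and only the two most recent are kept to keep the example short:

```shell
#!/usr/bin/env bash
# Made-up "modification-time inode" pairs, one per line.
sample='1555491600 101
1555578000 102
1555405200 103'

# Sort by field 1 numerically, largest (most recent) first,
# keep only the inode column, then take the top 2.
mapfile -t recent < <(
    printf '%s\n' "$sample" | sort -k1,1rn | awk '{print $2}' | head -2
)

printf 'inode: %s\n' "${recent[@]}"
```

Inode 102 has the largest timestamp and comes out first, followed by 101; inode 103 is the oldest and is dropped.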

As a demonstration that the code is safe for various filenames:

for i in $(seq 1 10); do touch $RANDOM.jpg $RANDOM.png $RANDOM.txt; sleep 1.1; done
touch x.jpg '[x].jpg' 'a?b.jpg' 'a*b.jpg' '$( echo boom ).jpg' 
touch single\'quote.jpg double\"quote back\\slash.jpg '*.jpg' ②.jpg

... the output is:

declare -a wantedfiles=([0]="②.jpg" [1]="*.jpg" [2]="single'quote.jpg" [3]="back\\slash.jpg" [4]=$'X\240Y.jpg' [5]="[x].jpg" [6]="a?b.jpg" [7]="a*b.jpg" [8]="\$( echo boom ).jpg" [9]="25396.jpg")

(adjust the last filename for whatever $RANDOM came up with).

Jeff Schaller