
For all files in a specified directory that match a given filter (png|gif|jpe?g), I want to check whether the "optimized" webp file exists and, if not, create it. (Note that I use the \0 separator to avoid problems with filenames containing spaces.)

I know what follows can be optimized, etc. but I just want to make it work.

If you have a better solution (with explanation!), I'd be interested in that too, of course.

IFS= readarray -t -d '' tab < <(find . -type f -print0  | grep -zZE "(png|gif|jpe?g)$") && for f in "${tab[@]}"; do if [ ! -f "$f.webp" ]; then cwebp -q 80 "$f" -o "$f.webp"; fi done

Here's what I did: I use readarray to build an array of all found files that match my pattern. Then I loop over them and test whether the corresponding .webp file exists. If not, I call cwebp -q 80 "$f" -o "$f.webp". This doesn't work and produces the following errors. Why?

Error! Could not process file ./08/10700_header.jpg
Error! Cannot read input picture file './08/10700_header.jpg'
Error! Could not process file ./08/205790_header.jpg
Error! Cannot read input picture file './08/205790_header.jpg'
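One way to narrow this down is to look at exactly which pathnames end up in the array before cwebp ever sees them. The following is only a debugging sketch: the function names collect_images and show_images are made up for illustration, and it assumes bash 4.4 or later (for readarray -d) and GNU grep (for -z). The pattern is also slightly stricter than the one above, since it requires a literal dot before the extension.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fill the global array "tab" with the image paths
# found under directory $1 (NUL-delimited, so spaces in names are safe).
collect_images() {
    readarray -t -d '' tab < <(
        find "$1" -type f -print0 | grep -zE '\.(png|gif|jpe?g)$'
    )
}

# Hypothetical helper: print each collected path in shell-quoted form.
show_images() {
    collect_images "$1"
    echo "found ${#tab[@]} candidate(s)"
    for f in "${tab[@]}"; do
        # %q makes spaces and control characters visible and unambiguous
        printf '%q\n' "$f"
    done
}
```

Running show_images . in the directory in question lists every path that the loop would hand to cwebp, which helps tell a pathname problem apart from a problem with the image files themselves.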

2 Answers


A better solution in bash: you don't really need find:

shopt -s extglob  # extended pattern match, you likely already have it set
shopt -s globstar # extended directory level search ('**' matches any directory level)

for f in **/*.@(jpg|jpeg|png|gif)
do 
    [[ -f "$f.webp" ]] || cwebp -q 80 "$f" -o "$f.webp"
done
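A slightly more defensive variant of the same loop might look like the sketch below. It is not a replacement for the loop above, just an illustration under a few assumptions: the function name convert_tree is made up, nullglob is added so that a pattern without matches expands to nothing instead of being passed literally to cwebp, and nocaseglob is added so that names such as X.JPG are picked up too. The four extensions are written as separate globs so that the function does not depend on extglob being enabled at the time it is parsed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert every image below directory $1 that has no
# "<image>.webp" next to it yet.
convert_tree() (
    # The "( ... )" body runs in a subshell, so cd and shopt stay local.
    cd -- "$1" || exit
    shopt -s globstar nullglob nocaseglob
    for f in **/*.jpg **/*.jpeg **/*.png **/*.gif; do
        [[ -f $f ]] || continue      # skip directories with image-like names
        [[ -f $f.webp ]] || cwebp -q 80 "$f" -o "$f.webp"
    done
)
```

Called as convert_tree . it behaves like the loop above; globstar needs bash 4 or later.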

To run this with parallel, we first have to remove the already converted files from the list, so that cwebp can be called directly. One way is to filter the list of images against the list of existing webp files:

printf '%s\n' **/*.@(jpg|jpeg|png|gif) \
    | grep -vxFf <(printf '%s\n' **/*.webp | sed 's/\.webp$//') \
    | parallel -i cwebp -q 80 {} -o {}.webp

In slow-mo:

  • printf '%s\n' **/*.@(jpg|jpeg|png|gif) generates the list of all possible candidates as a stream (since this is the printf built-in, it is not subject to the command-line length limit in bash)
  • grep -vxFf <(printf '%s\n' **/*.webp | sed 's/\.webp$//') removes from that list all the files that already have an associated .webp (by listing the *.webp files, stripping their extension, and using the result as a pattern list for grep; -F and -x make grep treat each pattern as a literal string that must match a whole line, so that a.jpg does not also exclude sub/a.jpg)
  • parallel -i cwebp -q 80 {} -o {}.webp feeds the result to parallel for execution.

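Note that this pipeline is newline-delimited, so you just have to hope that you haven't got weird filenames (ones containing newlines). GNU parallel itself can read NUL-terminated input with -0 (--null), but the printf, grep and sed steps would have to be switched to NUL separators as well.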

The pre-filtering technique can also be used for a non-parallel case.
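As a rough sketch of what a pre-filter that copes with any filename could look like: the function below emits the pending images NUL-terminated. Note that it swaps the grep filter for a plain per-file test, and that the name pending_images is made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print, NUL-terminated, every image below the current
# directory that has no "<image>.webp" next to it yet.
pending_images() (
    # Subshell body: the shopt changes do not leak into the caller.
    shopt -s globstar nullglob
    for f in **/*.jpg **/*.jpeg **/*.png **/*.gif; do
        [[ -f $f.webp ]] || printf '%s\0' "$f"
    done
)

# Parallel use (GNU parallel reads NUL-delimited input with -0):
#   pending_images | parallel -0 cwebp -q 80 {} -o {}.webp
# Non-parallel use of the same list:
#   while IFS= read -r -d '' f; do
#       cwebp -q 80 "$f" -o "$f.webp"
#   done < <(pending_images)
```

Keeping the list generation separate from the conversion means the same function serves both the parallel and the sequential case.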

xenoid

I'm not entirely sure what produces the errors that you show or why, but presumably it's cwebp that is having issues finding the pathnames that you give it.

I would have let find find the relevant files and execute the loop rather than using grep and a temporary array:

find . \( -name '*.png' -o -name '*.gif' -o -name '*.jpg' -o -name '*.jpeg' \) \
        -type f -exec sh -c '
        for pathname do
                [ -f "$pathname.webp" ] && continue
                cwebp -q 80 "$pathname" -o "$pathname.webp"
        done' sh {} +

Here, I use find as a sort of pathname generator for an in-line sh -c script. The in-line script gets pathnames in batches and loops over them. For each given pathname, if the corresponding .webp file exists, the loop skips to the next pathname. Otherwise, the cwebp command is invoked.

This avoids issues with strange file or directory names, and would be portable to any system that has cwebp installed (the find command itself is standard, as is the in-line script).

See also "Understanding the -exec option of `find`" for more information about the -exec sh -c '...' sh {} + syntax.
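To see how the trailing sh {} + hands pathnames to the in-line script, it can help to run the same construct by hand. In this small sketch, the function name show_batch and the two sample pathnames are made up; the arguments stand in for what find would substitute for {}.

```sh
# Hypothetical helper: run a tiny in-line script the way
# "find ... -exec sh -c '...' sh {} +" would, with the function's
# arguments standing in for the pathnames.
show_batch() {
    sh -c '
        printf "script name: %s\n" "$0"
        for pathname do
            printf "got: <%s>\n" "$pathname"
        done' sh "$@"
}

show_batch ./a.png './dir/b c.jpg'
# prints:
#   script name: sh
#   got: <./a.png>
#   got: <./dir/b c.jpg>
```

The first word after the script text becomes $0 (used in error messages), and everything after it becomes the positional parameters that for pathname do iterates over, each one kept intact regardless of spaces.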

If this still does not work, you may want to investigate the image files reported to make sure they are actually valid image files.
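A quick way to do such a check is to look at the first few bytes of each file, since JPEG, PNG and GIF files each start with a fixed signature. The sketch below is only a rough first test, not a full validation, and the function name sniff_type is made up; where it is installed, the file utility gives a more thorough answer.

```sh
# Hypothetical helper: guess the image type of file $1 from its first
# four bytes (the "magic number").
sniff_type() {
    # First four bytes as hex digits, e.g. "ffd8ffe0" for a typical JPEG.
    magic=$(od -An -tx1 -N4 < "$1" | tr -d ' \n')
    case $magic in
        ffd8ff*)  echo jpeg ;;
        89504e47) echo png ;;       # \211 P N G
        47494638) echo gif ;;       # G I F 8
        *)        echo unknown ;;
    esac
}

# Example use on the files from the error messages:
#   for f in ./08/10700_header.jpg ./08/205790_header.jpg; do
#       printf '%s: %s\n' "$f" "$(sniff_type "$f")"
#   done
```

A file that is named .jpg but reported as unknown (for example an HTML error page saved under an image name) would be a plausible reason for cwebp to reject it.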

Kusalananda