
I'm trying to create a script that reads a video folder and builds a list of video files to be processed by ffprobe to identify the codec. Videos NOT encoded with a specific codec (in this case HEVC) should be put in a new list for further processing by ffmpeg.

I created a very rudimentary script, but hit a brick wall at a point where the variable ffprobe_input needs to be changed in order to be passed as the next input for ffprobe.

Also, even if this part of the script was working, I'm puzzled as to how to create the filtered list of files after the ffprobe processing, since the only output is a single word, ex: hevc or x264.

The actual script is below, along with my notes, which should be more descriptive. The notes also include some of the ways I tried to make things work.

This is the intended use of the script: ./script.sh -p /path\ to\ videos

#!/bin/bash

#Read path (-p) input and exit on error.
while getopts p: flag
do
    case "${flag}" in
        p) vpath=${OPTARG};;
        *) echo "usage: $0 [-p]" >&2
           exit 1 ;;
    esac
done

#Now we echo the path for neatness
echo -e "Selected root video path: $vpath";

#Check if the path is valid. The path must be escaped. Cd into the folder and execute: printf "%q\n" "$(pwd)"
[ -d "$vpath" ] && echo "Directory $vpath exists." || echo "Error: Directory $vpath does not exist. Tip: make sure the spaces are escaped in folder names, ex: ===video\ folder===."

#Prepare a list of video files with full escaped paths, ready for ffprobe/ffmpeg input.
find "$vpath" -type f \( -iname "*.mkv" -o -iname "*.mp4" -o -iname "*.avi" \) | sed 's/ /\\ /g' >> full_list.txt

#read the total number of lines from full_list.txt
nrl_total="$(wc -l full_list.txt | grep -Eo "[0-9]{0,7}")"
echo -e "There are a total of $nrl_total videos for further processing."

#read line number and pass to $ffprobe_input

nrl=($(seq 1 "$nrl_total"))

nrl={1..$nrl_total..1}

for $nlr in {1..$nrl_total..1}; do

nrl=({1..$nrl_total..1})

filename='full_list.txt'
nrl=1
while read line; do
echo "$nrl"
nrl=$((n+1))
#done < $filename

#ffprobe_input="$(sed -n 1p full_list.txt)" Use line number in "p" attribute, ex: 1p.

ffprobe_input="$(sed -n 1p full_list.txt)"

ffprobe_input="$(sed -n "$nrl"p full_list.txt)"

#Now pass the input to ffprobe to determine if the videos are HEVC or not. Output is single word, ex: hevc or x264.
eval ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of default=noprint_wrappers=1:nokey=1 -i "$ffprobe_input"

done < $filename

rm full_list.txt

  • So, you want to read the file full_list.txt, and run ffprobe once for each line and store the results with the filenames? – ilkkachu Nov 25 '21 at 12:32
  • Correct. The idea is to process full_list.txt and based on ffprobe output to create a non_hevc.txt list to be passed in the same fashion to ffmpeg. – neckfreak Nov 25 '21 at 12:35

1 Answer


Assuming your filenames don't contain newlines, you don't need to mangle them in any way. The output from find has one line per filename, so just store it and loop over the resulting file:

> non-hevc.txt        # clear the output list
find "$vpath" -type f \( -iname "*.mkv" -o -iname "*.mp4" -o -iname "*.avi" \) \
 > full_list.txt
while IFS= read -r file; do 
    result=$(ffprobe -v error -select_streams v:0 -show_entries \
             stream=codec_name -of default=noprint_wrappers=1:nokey=1 -i "$file")
    if [ "$result" != hevc ]; then
        echo "$file" >> non-hevc.txt
    fi
done < full_list.txt
rm -f full_list.txt

Here, the output of ffprobe is captured with the command substitution $(...) and stored to result, which we then look at.
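As a minimal sketch of that capture-and-compare step: command substitution strips trailing newlines, which is why a plain string comparison against hevc works. The function probe_codec below is a hypothetical stand-in for the ffprobe call (so the sketch runs without ffprobe or real videos), and is_hevc is likewise a made-up helper name.

```shell
#!/bin/bash
# probe_codec: hypothetical stand-in for the real ffprobe command.
# It prints a codec name followed by a newline, like ffprobe would.
probe_codec() {
    case "$1" in
        *.mkv) printf 'hevc\n' ;;
        *)     printf 'h264\n' ;;
    esac
}

# is_hevc: hypothetical helper. $(...) captures stdout and strips the
# trailing newline, so the result can be compared as a plain string.
is_hevc() {
    local result
    result=$(probe_codec "$1")
    [ "$result" = hevc ]
}

for f in "movie one.mkv" "clip (old).avi"; do
    if is_hevc "$f"; then
        echo "skip: $f"
    else
        echo "re-encode: $f"
    fi
done
```

In the real script, the body of probe_codec would be the ffprobe command from the loop above.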

I don't see any reason for the dance with sed -n "$nrl"p inside the loop reading the filename list, since read already reads the same line. We do need IFS= and -r to not mangle the input, though.
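To see what IFS= and -r protect against, here is a small sketch using a made-up line with leading spaces and a backslash. The helper names read_plain and read_raw are only for illustration.

```shell
#!/bin/bash
# A made-up input line: two leading spaces and a backslash in the name.
sample='  my\video.mkv'

# Plain read: strips leading/trailing whitespace and treats backslash
# as an escape character, so the backslash is dropped.
read_plain() {
    local line
    read line <<< "$1"
    printf '%s' "$line"
}

# IFS= keeps the surrounding whitespace, -r keeps backslashes literal.
read_raw() {
    local line
    IFS= read -r line <<< "$1"
    printf '%s' "$line"
}

printf '[%s]\n' "$(read_plain "$sample")"   # prints [myvideo.mkv]
printf '[%s]\n' "$(read_raw "$sample")"     # prints [  my\video.mkv]
```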

There's also no reason to escape any whitespace with backslashes: the quoted expansion of "$file" passes the contents of the variable as-is to the command. Undoing the escaping would also be difficult. When you use eval, it processes a lot of other stuff too, and would barf on e.g. parentheses.
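A quick way to convince yourself of this is to count the arguments a command actually receives. This is only a sketch: count_args is a hypothetical helper and the filename is made up.

```shell
#!/bin/bash
# count_args: hypothetical helper that prints how many arguments it got.
count_args() { echo "$#"; }

# A made-up filename with a space and parentheses, no escaping applied.
file='My Videos/clip (final).mkv'

count_args "$file"   # quoted: prints 1, the name arrives intact
count_args $file     # unquoted: prints 3, split on whitespace

# eval re-parses the string as shell code, so the parentheses are a
# syntax error. Run in a subshell to keep the failure contained.
( eval "count_args $file" ) 2>/dev/null || echo "eval failed"
```

So with plain quoting, the same unescaped name works for ffprobe and, later, for ffmpeg.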

Not sure if you want to append the output of find to whatever full_list.txt already contained, or recreate the list. Since we process the list immediately, it seems to me to make more sense to ignore any old contents.

Note that, as terdon comments, you don't strictly need the intermediate file to store the list of filenames. You could just do find ... | while IFS= read -r file; do ..., or, with process substitution in Bash/ksh/zsh, while IFS= read -r file; do ... done < <(find ...). The difference between the two matters if you want to set variables inside the while loop; see: Why is my variable local in one 'while read' loop, but not in another seemingly similar loop?
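A rough illustration of that difference, assuming Bash with default settings (in particular, the lastpipe option not enabled). Here list_files is a hypothetical stand-in for the find command so the sketch needs no real files.

```shell
#!/bin/bash
# list_files: hypothetical stand-in for find, printing one name per line.
list_files() { printf '%s\n' "a.mkv" "b b.mp4" "c.avi"; }

# With a pipe, the while loop runs in a subshell, so changes to n
# are lost when the loop ends.
count_with_pipe() {
    local n=0 file
    list_files | while IFS= read -r file; do n=$((n + 1)); done
    echo "$n"
}

# With process substitution, the loop runs in the current shell,
# so n keeps its value after the loop.
count_with_procsub() {
    local n=0 file
    while IFS= read -r file; do n=$((n + 1)); done < <(list_files)
    echo "$n"
}

echo "pipe: $(count_with_pipe)"                     # prints pipe: 0
echo "process substitution: $(count_with_procsub)"  # prints process substitution: 3
```

If the loop only appends to non-hevc.txt and sets no variables you need afterwards, either form should do.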

ilkkachu
  • Fantastic! I knew I was going down some very complicated path. Learning is fun. Thanks. – neckfreak Nov 25 '21 at 13:00
  • Is there any benefit in writing the find output to a file instead of just passing it directly to the while loop? Using a file means that it could overwrite an existing file with the same name if one exists and is taking up disk space for no reason. The OP's script ends with rm full_list.txt so avoiding the file entirely would probably be best – terdon Nov 25 '21 at 13:00
  • @terdon, not really, but I don't think the file matters much, so skipped on adding that change. – ilkkachu Nov 25 '21 at 13:05
  • @terdon I'm in the process of teaching myself to use anything more complex than piping a couple of commands. The script above represents my logic/knowledge from when I first started writing a script I could use for myself. The rm command is there because subsequent runs of the script kept adding new lines with the same paths. It was meant to keep the folder the script is in tidy. – neckfreak Nov 25 '21 at 13:11
  • @ilkkachu In my mind, having the .txt files around was supposed to serve as a log of sorts. Not all video files can be processed in a unified fashion and if the script is used for a larger library, there are bound to be hiccups. – neckfreak Nov 25 '21 at 13:29