My situation is: I have many files named timestep1_i, for many values of i, and I am trying to combine all the files for timestep1 with a program called combinefiles. I want to do it in parallel (for timestep2, timestep3, ...) using find | xargs.
To do so I have a bash function called combine_and_move:
function combine_and_move {
file_name1=`echo $1 | sed 's/timestep1_0/timestep1/'`
echo "*** file $file_name1 ***"
$UTIL_DIR/combinefiles $file_name1 && mv "$file_name1"_* uncombined_files
}
export -f combine_and_move
This function receives a filename, calls the program that combines all the files together, and then (it is supposed to) moves the already combined files to another directory. This last part is supposedly ensured by the &&.
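For reference, the short-circuit behaviour of && can be checked in isolation. This is only a minimal sketch: `true` and `false` are stand-ins for a succeeding and a failing combinefiles run, and the variable names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Stand-ins (assumption): `false` plays a failing combinefiles,
# `true` plays a successful one.

ran_after_failure=no
false && ran_after_failure=yes   # left side fails, so the right side is skipped

ran_after_success=no
true && ran_after_success=yes    # left side succeeds, so the right side runs

echo "after failure: $ran_after_failure"
echo "after success: $ran_after_success"
```

This should print "after failure: no" and "after success: yes", so the right-hand side of && only runs when the left-hand command exits with status 0.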
Now I call the function using find | xargs like this:
find . -maxdepth 1 -name 'timestep*_0' -print0 | xargs --replace=@ -0 -P 64 bash -c 'combine_and_move $"@"' _ &
Note that the process may be killed at any time by the system administrator.
The function sort of works, but sometimes, if the process is killed, I find uncombined files that were moved to the uncombined_files folder, which apparently means that the second command ran even though the first one did not succeed.
What am I doing wrong? What can I change so that the function combine_and_move only moves the files if the combinefiles program was successful?
mv would never run if combinefiles didn't complete successfully. An issue with your globs is more likely: for instance, if you have two files named timestep1_0 and timestep1_0_0, combine_and_move timestep1_0_0 will execute mv ./timestep1_0_0 uncombined_files while combine_and_move timestep1_0 will execute mv ./timestep1_0 ./timestep1_0_0 uncombined_files. Note that timestep1_0_0 is moved in both cases. – fra-san Sep 10 '20 at 19:03
timestep*_0 will likely wreak havoc). – fra-san Sep 10 '20 at 19:23
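The glob overlap described in the first comment can be reproduced in a scratch directory. This is a sketch under assumptions: show_mv is a hypothetical helper, not part of the original function, that applies the same sed substitution as combine_and_move and prints the mv command instead of running it. The two file names are the ones from the comment.

```shell
#!/usr/bin/env bash
# Scratch directory so that no real files are touched.
tmp=$(mktemp -d)
touch "$tmp/timestep1_0" "$tmp/timestep1_0_0"   # the two names from the comment

# show_mv (hypothetical helper): print the mv command that combine_and_move
# would build for the file name given as $1, without running it.
show_mv() {
    base=$(printf '%s\n' "$1" | sed 's/timestep1_0/timestep1/')
    # Expand the glob inside the scratch directory, in a subshell,
    # so the caller's working directory is left alone.
    (cd "$tmp" && echo mv "$base"_* uncombined_files)
}

cmd_a=$(show_mv ./timestep1_0)
cmd_b=$(show_mv ./timestep1_0_0)
printf '%s\n' "$cmd_a" "$cmd_b"

rm -rf "$tmp"   # clean up the scratch directory
```

The first line printed is "mv ./timestep1_0 ./timestep1_0_0 uncombined_files" and the second is "mv ./timestep1_0_0 uncombined_files", so timestep1_0_0 is picked up by both invocations, whether or not its own combinefiles run succeeded. If this overlap is indeed the cause, a stricter -name pattern in the find call may be worth trying.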