OK, so I have a bash function that I apply to several folders:
function task() {
    do_thing1
    do_thing2
    do_thing3
    ...
}
I want to run that function in parallel. So far I was using a little fork trick:
N=4  # number of cores
i=0
for temp_subj in "${raw_dir}"/MRST*
do
    ((i=i%N)); ((i++==0)) && wait   # every N jobs, wait for the batch to finish
    task "$temp_subj" &
done
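For reference, here is a self-contained toy version of that batch trick, with a dummy function standing in for task (the names and sleep are made up for the demo); note the final wait so the script does not move on while the last batch is still running:

```shell
#!/bin/bash
# Toy demo of the batch-parallel idiom: launch at most N background jobs,
# then wait for the whole batch before starting the next one.
N=2
demo_task() {           # stand-in for the real task function
    sleep 0.1
    echo "done $1"
}

i=0
out_file=$(mktemp)
for item in a b c d e; do
    ((i=i%N)); ((i++==0)) && wait   # every N iterations, wait for the batch
    demo_task "$item" >>"$out_file" &
done
wait                                # also wait for the final, partial batch

sort "$out_file"                    # five "done ..." lines, one per item
```

One known wart of this idiom: the whole batch must finish before the next one starts, so one slow job can leave N-1 cores idle.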
And it works great. But I decided to try something 'cleaner' and use GNU parallel:
ls -d ${raw_dir}/MRST* | parallel task {}
Problem is, it's putting EVERYTHING in parallel, including the do_thing steps within my task function. And it inevitably crashes, because those have to be executed serially. I've tried modifying the call to parallel in many ways, but nothing seems to work. Any ideas?
You have to export the function and use xargs -n1 -P4, calling a subshell like in this post. If you want to use parallel, a similar thing, like in this post. Also, do not use the ls output for making the arguments; use find or glob expressions. – thanasisp Nov 20 '20 at 20:21

I tried xargs -n1 -P4 with a subshell, but it has the same issue as parallel. All the do_thing commands within my function are being executed at the same time, as if the function were being treated just as a list of commands instead of "as a whole", just like using a & would do. – Orchid Nov 20 '20 at 23:02

task is the minimum unit that will be executed in your example. When you say in your description that task "$temp_subj" & works well, it means you are OK with that. You run 4 tasks in parallel, and each task intentionally is not running any of its commands asynchronously. If you mean to parallelize the calls inside task, then you have to rephrase the question. – thanasisp Nov 20 '20 at 23:08
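For what it's worth, a minimal sketch of the export-the-function approach from the first comment, using xargs -0 -n1 -P4 and a glob instead of parsing ls (the task body and directory layout below are invented for the demo):

```shell
#!/bin/bash
# Export the function so the bash -c subshells spawned by xargs can see it.
task() {
    # Commands inside run serially; only whole task invocations run in parallel.
    echo "processing $1"
    sleep 0.1
    echo "finished $1"
}
export -f task

raw_dir=$(mktemp -d)                  # demo stand-in for your real $raw_dir
mkdir -p "$raw_dir"/MRST{01,02,03,04}

# Glob + NUL separators instead of parsing ls; -P4 caps concurrency at 4,
# -n1 passes one directory per task invocation.
out=$(printf '%s\0' "$raw_dir"/MRST* | xargs -0 -n1 -P4 bash -c 'task "$1"' _)
printf '%s\n' "$out"

rm -rf "$raw_dir"
```

With GNU parallel, the analogous incantation should be export -f task followed by parallel task ::: "$raw_dir"/MRST*, which likewise avoids parsing ls output; without the export, the child shells that either tool spawns simply cannot see a function defined in your interactive shell.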