I need to run my Python script 200,000 times. Is it possible to do parallel execution using Bash? Since it's 200,000 times, I would like to use at least 10 threads simultaneously.

Michael Homer
Rahul

1 Answer

Let's say this.py contains the following:

#!/usr/bin/env python3
# Print the current timestamp so each run is visible in the output.
from datetime import datetime
now = datetime.now()
print(now)

The following Bash script runs this.py 10 processes at a time, for 20,000 rounds. Each round starts only after all 10 processes from the previous round have finished. That gives you 200,000 executions of this.py with at most 10 running at once. Note that these are separate processes, not threads.

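#!/bin/bash
# 20,000 rounds of 10 background processes = 200,000 runs
for i in {1..20000}; do
  echo -e "\nROUND $i\n"
  for j in {1..10}; do
    /path/to/this.py &   # start one run in the background
  done
  wait                   # block until all 10 runs of this round have finished
done 2>/dev/null         # discard STDERR for the whole loop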

...or use the one-liner:

for i in {1..20000}; do echo -e "\nROUND $i\n"; for j in {1..10}; do /path/to/this.py & done; wait; done 2>/dev/null

You can obviously drop the echo line. I only added it for testing, and to show how clean the output looks when STDERR is redirected to /dev/null. Keep in mind that the redirect also hides any error messages from this.py. My output looks like this:

ROUND 1

2015-10-09 23:20:12.432295
2015-10-09 23:20:12.444988
2015-10-09 23:20:12.471788
2015-10-09 23:20:12.482884
2015-10-09 23:20:12.519446
2015-10-09 23:20:12.558949
2015-10-09 23:20:12.560826
2015-10-09 23:20:12.582571
2015-10-09 23:20:12.600680
2015-10-09 23:20:12.625727

ROUND 2

2015-10-09 23:20:12.761279
2015-10-09 23:20:12.764459
2015-10-09 23:20:12.801361
2015-10-09 23:20:12.831900
2015-10-09 23:20:12.853339
2015-10-09 23:20:12.877965
2015-10-09 23:20:12.921946
2015-10-09 23:20:12.950549
2015-10-09 23:20:12.973625
2015-10-09 23:20:12.986714

ROUND 3

2015-10-09 23:20:13.128276
2015-10-09 23:20:13.169144
2015-10-09 23:20:13.222183
2015-10-09 23:20:13.234889
2015-10-09 23:20:13.242653
2015-10-09 23:20:13.246504
2015-10-09 23:20:13.305419
2015-10-09 23:20:13.306198
2015-10-09 23:20:13.317769
2015-10-09 23:20:13.328895

...etc.
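
One limitation of the round-based loop is that every round waits for its slowest process before starting the next one. If you would rather keep 10 processes busy the whole time, here is a rough sketch of that idea. It assumes Bash 4.3 or later (for wait -n), and run_pool is just a name I made up for illustration:

```shell
#!/bin/bash
# Sketch: run a command TOTAL times, keeping at most MAX_JOBS copies running.
# Assumes Bash 4.3+ for `wait -n`; run_pool is a hypothetical helper name.
run_pool() {
  local total=$1 max_jobs=$2
  shift 2                      # the remaining arguments are the command to run
  local i running=0
  for ((i = 1; i <= total; i++)); do
    if ((running >= max_jobs)); then
      wait -n || true          # wait for any one job to finish, ignore its exit status
      running=$((running - 1))
    fi
    "$@" &                     # start the next run in the background
    running=$((running + 1))
  done
  wait                         # wait for the last jobs still running
}
```

You would call it as run_pool 200000 10 /path/to/this.py. The counter is only an estimate of how many jobs are still running, so treat this as a starting point rather than a tuned job scheduler.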

Also look into GNU Parallel. By default it runs as many simultaneous jobs as you have CPU cores, but you can change that with the -j option (for example, -j10 for 10 jobs at a time). It's an awesome replacement for loops and such.
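
For reference, a GNU Parallel version of the same job could look roughly like the sketch below. It assumes GNU Parallel is installed (not the moreutils tool of the same name), and run_with_parallel is a hypothetical wrapper name, not something GNU Parallel provides:

```shell
#!/bin/bash
# Sketch, assuming GNU Parallel is installed.
# run_with_parallel is a hypothetical helper name used only for illustration.
run_with_parallel() {
  local total=$1 jobs=$2
  shift 2                      # the remaining arguments are the command to run
  # seq emits one input line per run.
  # -n0 reads an input line but passes no argument to the command.
  # -j sets how many jobs run at the same time.
  seq "$total" | parallel -j "$jobs" -n0 "$@"
}
```

For the question as asked, that would be run_with_parallel 200000 10 /path/to/this.py.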

rubynorails
  • How can I populate the DB with all the records? When I run the above command, it's not getting populated. – Rahul Oct 13 '15 at 19:47
  • @Rahul - I'm not sure I understand. I guess this depends on what is in your Python script. If you are running MySQL commands inside this script or something, you could get unexpected results. I'm not as familiar with SQL, so I can't really comment on the specifics unless I have more information. I just based my answer on your question, which may have been a little too vague. – rubynorails Oct 13 '15 at 22:27
  • GNU Parallel defaults to running one job per CPU core in parallel, but you can change the default with -j200 for 200 jobs in parallel. – Ole Tange Apr 10 '17 at 14:18