Currently I have the following script for using the HaploTypeCaller program on my Unix system on a repeatable environment I created:
#!/bin/bash
#parallel call SNPs with chromosomes by GATK
for i in 1 2 3 4 5 6 7
do
for o in A B D
do
for u in _part1 _part2
do
(gatk HaplotypeCaller \
-R /storage/ppl/wentao/GATK_R_index/genome.fa \
-I GATK/MarkDuplicates/ApproachBsortedstettler.bam \
-L chr$i$o$u \
-O GATK/HaplotypeCaller/HaploSample.chr$i$o$u.raw.vcf &)
done
done
done
gatk HaplotypeCaller \
-R /storage/ppl/wentao/GATK_R_index/genome.fa \
-I GATK/MarkDuplicates/ApproachBsortedstettler.bam \
-L chrUn \
-O GATK/HaplotypeCaller/HaploSample.chrUn.raw.vcf&
How can I change this piece of code to parallel at least partially? Is it worth to do I am trying to incorporate this whole script in a different script that you can see on a different question here should I? Will I get quite the boost on performance?
chrUn
more than once? – Ole Tange Jul 12 '19 at 10:14