I am building a pipeline for a set of data, and the main part currently looks something like this:
#! /bin/bash
time bwa mem -o bwa/mem/Stettler -M -t 96 -R "@RG\tID:Test\tSM:Stettler\tLB:TestLib\tPL:ILLUMINA" /storage/ppl/wentao/bwa_Index/genome.fa $1 $2
wait
echo "finished mem"
samtools view -Sb -@ 96 -o samtools/Stettler.bam bwa/mem/Stettler
wait
echo "got stettler"
wait
time samtools sort -@ 96 -O bam -o samtools/sort/approachAsortedstettler.bam samtools/Stettler.bam
wait
echo "sorted"
time samtools index samtools/sort/approachAsortedstettler.bam
wait
echo "finished indexing"
time gatk MarkDuplicates -I samtools/sort/approachAsortedstettler.bam -O GATK/MarkDuplicates/ApproachAsortedstettler.bam -M GATK/MarkDuplicates/metrics/ApproachB
wait
echo "Marked Duplicates"
time samtools index GATK/MarkDuplicates/ApproachAsortedstettler.bam
wait
echo "indexed again ++++++++++++++++++++++++++++++++++++++++"
time bash scripts/Parallelhaplo.sh
wait
echo "Parallelhaplo"
time bash scripts/MergerHAplo.sh
wait
echo "merged"
time vcftools --vcf GATK/MergedSample_gather.raw.vcf --min-meanDP $3 --recode --out vcftools/MergedGATKdp2.vcf
wait
echo "deep checked"
time gatk IndexFeatureFile --feature-file vcftools/MergedGATKdp2.vcf.recode.vcf
wait
echo "IFF"
time gatk SelectVariants -R /storage/ppl/wentao/GATK_R_index/genome.fa --variant vcftools/MergedGATKdp2.vcf.recode.vcf --concordance vcftools/Mergedmpileupdp2.vcf.recode.vcf -O GATK/SelectVariants/Common$
wait
echo "finished"
The script called Parallelhaplo.sh looks like this:
#!/bin/bash
#parallel call SNPs with chromosomes by GATK
for i in 1 2 3 4 5 6 7;do for o in A B D;do for u in _part1 _part2;do (gatk HaplotypeCaller -R /storage/ppl/wentao/GATK_R_index/genome.fa -I GATK/MarkDuplicates/ApproachAsortedstettler.bam -L chr$i$o$u -O GATK/HaplotypeCaller/HaploSample.chr$i$o$u.raw.vcf &);done;done ; done
gatk HaplotypeCaller -R /storage/ppl/wentao/GATK_R_index/genome.fa -I GATK/MarkDuplicates/ApproachBsortedstettler.bam -L chrUn -O GATK/HaplotypeCaller/HaploSample.chrUn.raw.vcf&
wait
echo "parallel call finished"
wait
However, when I execute the script, Parallelhaplo.sh is started, but for some reason the wait in either of the two scripts doesn't wait for it to finish. The pipeline moves on to the next step, and since that step can't find the files it needs, I just get errors. What can I do?
What happens if you put the `&` outside the subshell parentheses? So instead of `do (gatk ... & ); done` in ParallelHaplo, try `do (gatk ...) & done`. – terdon Jul 04 '19 at 17:44
I tested with a simple `sleep` command and can reproduce your issue when using `( sleep & )` and can confirm it works as expected when using `(sleep) &`. By the way, you may be interested in our sister site: [bioinformatics.se]. – terdon Jul 04 '19 at 17:58
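The difference described in the comments can be reproduced without gatk. The following is only a minimal sketch: `job` and `now` are hypothetical helpers, and `sleep 2` is an assumed stand-in for one long-running HaplotypeCaller call. It times how long `wait` blocks with the `&` inside versus outside the parentheses.

```bash
#!/bin/bash
# Hypothetical stand-in for one long-running gatk call (assumption: 2 seconds).
job() { sleep 2; }

# Helper (not part of the original scripts): current time in whole seconds.
now() { date +%s; }

# Form 1: & inside the parentheses.
# Each subshell backgrounds the job and exits at once, so the job is a child
# of the already-finished subshell, not of this script. `wait` has no
# children to wait for and returns immediately.
start=$(now)
for i in 1 2 3; do ( job & ); done
wait
inside_elapsed=$(( $(now) - start ))

# Form 2: & outside the parentheses.
# Each subshell is itself a background child of this script, so `wait`
# blocks until every one of them has finished.
start=$(now)
for i in 1 2 3; do ( job ) & done
wait
outside_elapsed=$(( $(now) - start ))

echo "& inside:  wait returned after ${inside_elapsed}s"
echo "& outside: wait returned after ${outside_elapsed}s"
```

With the `&` inside the parentheses, `wait` should return almost at once; with it outside, `wait` should block for at least the two seconds the jobs take. Applied to Parallelhaplo.sh, that would mean moving the `&` in the loop from just before the closing parenthesis to just after it, so that the final `wait` has children to wait for.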