I have been using a for loop to run a pipeline on multiple files, but unfortunately the terminal froze halfway through. I would like to run the pipeline again, but to save time I would like to skip the directories that already have the output files. Basically, I want to nest an if statement: if the output file exists, skip that directory; if not, run the pipeline. Is this possible?
for f in /Volumes/My\ Passport/Documents/Projects/untitled\ folder\ 2/untitled\ folder\ 3/untitled\ folder\ 2/untitled\ folder/*/*_1.fastq; do
subdir="${f%/*}"
pushd "$subdir" &>/dev/null
file1="${f##*/}"
file2="${file1%_1.fastq}_2.fastq"
# Inside double quotes a backslash before a space is kept literally, so the space must not be escaped here
adapter="/Volumes/My Passport/Documents/adapters.fa"
reference="/Volumes/My Passport/Documents/ucsc_hg19/ucsc.hg19.fasta"
dbSNP="/Volumes/My Passport/Documents/ucsc_hg19/dbsnp_138.hg19"
COSMIC="/Volumes/My Passport/Documents/ucsc_hg19/CosmicCodingMuts.vcf"
interval="/Volumes/My Passport/Documents/plist.bed"
sjdb="/Volumes/My Passport/Documents/ucsc_hg19/ucsc.hg19.gtf"
file3="${file1%_1.fastq}_1_trimmed.fastq"
file4="${file2%_2.fastq}_2_trimmed.fastq"
#preQC (cutadapt -O subtracted, prinseq -min_qual_score 4 -ns_max_p 2 subtracted)
~/Desktop/UTSW/Applications/bbmap/bbduk.sh -Xmx120g in1="${file1}" in2="${file2}" out1="${file3}" out2="${file4}" ref="${adapter}" trimq=10
paste - - - - < "${file3}" | sort -k1,1 -t " " | tr "\t" "\n" > "${file3%_1_trimmed.fastq}_trimmed_sorted_1.fastq"
paste - - - - < "${file4}" | sort -k1,1 -t " " | tr "\t" "\n" > "${file4%_2_trimmed.fastq}_trimmed_sorted_2.fastq"
parallel -j $PARALLEL_TASKS perl ~/UTSW/Applications/prinseq-lite-0.20.4/prinseq-lite.pl -fastq "${file3%_1_trimmed.fastq}_trimmed_sorted_1.fastq" -fastq2 "${file4%_2_trimmed.fastq}_trimmed_sorted_2.fastq" -no_qual_header -trim_right 1 -custom_params "A 75%;T 75%;G 75%;C 75%" -min_qual_mean 25 -min_len 40 -out_format 3 -out_good "${f%.*}_QC" -out_bad null -log
popd &>/dev/null
done
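
To make the check I have in mind concrete, here is a minimal sketch of the skip logic. It is not tested against the real pipeline. It assumes the last step leaves a file named like `<input without .fastq>_QC_1.fastq` next to the input, which is my guess at what prinseq writes for `-out_good "${f%.*}_QC"` with paired input. `needs_run`, `BASE_DIR` and `/path/to/samples` are placeholder names made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: skip samples whose final output already exists.
# ASSUMPTION: the last pipeline step writes "<input minus .fastq>_QC_1.fastq";
# change the suffix below to whatever file your pipeline produces last.

# Succeeds (exit status 0) if the pipeline still has to run for this input.
needs_run() {
    local f="$1"
    local expected="${f%.*}_QC_1.fastq"   # hypothetical final output name
    # -s is true only for a file that exists AND is non-empty, so a
    # zero-byte file left behind by a crash does not count as finished.
    [ ! -s "$expected" ]
}

# Placeholder top-level directory; replace with the real path.
base="${BASE_DIR:-/path/to/samples}"

for f in "$base"/*/*_1.fastq; do
    [ -e "$f" ] || continue               # the glob matched nothing
    # On a rerun the glob also matches files written by earlier runs,
    # so leave those out before doing anything else.
    case "$f" in
        *_trimmed_sorted_1.fastq|*_QC_1.fastq) continue ;;
    esac
    if ! needs_run "$f"; then
        echo "skip: $f"
        continue
    fi
    echo "run:  $f"
    # ... the pipeline steps for "$f" would go here ...
done
```

Testing with `-s` instead of `-e` means a sample that died partway through writing its output may still be picked up again if the file is empty, but a partially written non-empty file would be treated as finished, so the sample that was running when the terminal froze probably needs to be checked by hand.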