I would like to delete every column of a tab-delimited text file whose header (first line) contains the string "_HET". The input text file looks like this:
rs36810213_HET rs2438689 rs70927523570_HET rs54666437 ...
1 0 2 0
0 1 0 1
2 0 1 1
... ... ... ...
The output text file should look like this:
rs2438689 rs54666437 ...
0 0
1 1
0 1
... ...
The code I am using does not remove anything:
#!/bin/bash
path="/data/folder"
awk -v OFS='\t' '
    NR==1{
        for (i=1;i<=NF;i++)
            if ($i=="_HET") {
                n=i-1
                m=NF-(i==NF)
            }
    }
    {
        for(i=1;i<=NF;i+=1+(i==n))
            printf "%s%s",$i,i==m?ORS:OFS
    }
' $path/input.txt >> $path/output.txt
Any suggestions on how to fix this code? Thank you!
OFS must be set to the same thing as FS, and you set FS to a tab using -F '\t' as shown. – Kusalananda May 27 '19 at 13:50
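For illustration, here is a minimal sketch of one way to apply that advice; it is not necessarily the answer the comment refers to. The test `$i=="_HET"` in the question is an exact comparison, so it never matches a header such as rs36810213_HET, which would explain why nothing is removed. The sketch uses a substring test instead and sets both FS and OFS to a tab. The temporary directory, the sample file contents, and the names keep, n and result are assumptions made up for the example; replace the sample file with your own $path/input.txt.

```shell
#!/bin/bash
# Hypothetical sample input, tab-delimited, modeled on the data in the question.
tmpdir=$(mktemp -d)
printf 'rs36810213_HET\trs2438689\trs70927523570_HET\trs54666437\n' >  "$tmpdir/input.txt"
printf '1\t0\t2\t0\n0\t1\t0\t1\n2\t0\t1\t1\n'                        >> "$tmpdir/input.txt"

# -F '\t' sets FS and -v OFS='\t' sets OFS, so input and output both use tabs.
result=$(awk -F '\t' -v OFS='\t' '
NR == 1 {
    # Record the position of every column whose header does NOT contain "_HET".
    for (i = 1; i <= NF; i++)
        if (index($i, "_HET") == 0)
            keep[++n] = i
}
{
    # Print only the recorded columns; the last one is followed by ORS.
    for (j = 1; j <= n; j++)
        printf "%s%s", $(keep[j]), (j < n ? OFS : ORS)
}
' "$tmpdir/input.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Collecting the column positions once, while reading the header, keeps the per-line loop simple and handles any number of "_HET" columns, adjacent or not. If every header contains "_HET", this sketch prints nothing at all. To write to a file, redirect with > rather than >> unless appending to an existing output.txt is really intended.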