I'm trying to run:

for ds in $(cat Nikhil/strings.txt )
do
awk -v k="$ds" '{if (substr($0,2,5) == k )print $0}' criteria5.txt
done

But when I execute this, awk seems to take the value of k as the literal k instead of the value of the variable. shellcheck.net is not working, so I had to post this here.

+ for ds in '$(cat Nikhil/strings.txt )'
+ awk -v k=CDA01 '{if (substr($0,2,5) == k )print $0}' criteria5.txt

This is the output when I tried to debug using sh -x. Any suggestions, please?

Nikhil
  • But this line shows that it doesn't: + awk -v k=CDA01 '{if (substr($0,2,5) == k )print $0}' criteria5.txt; see k=CDA01. Also, do not use shell loops for text-processing purposes – αғsнιη Jan 07 '21 at 18:51
  • It is picking up the value CDA01 from the file, but inside the if condition it is substituted as k – Nikhil Jan 07 '21 at 18:54
  • No, it isn't; you can verify that by printing k as well: {if (substr($0,2,5) == k) print k, $0} – αғsнιη Jan 07 '21 at 18:55
  • Is the code wrong? – Nikhil Jan 07 '21 at 19:11
  • @Nikhil I can see nothing wrong with your code. The shell would not substitute k with anything inside the awk code for the purpose of showing it in the set -x trace output, but you can rest assured that -v k="$ds" will definitely make the value of the k variable that of $ds in the shell. Whether the code does what you want it to do, none of us can tell. In this case though, it may be more convenient to not use a shell loop and instead use a more sophisticated awk command (which would be much quicker). – Kusalananda Jan 07 '21 at 19:17
  • Thanks, @Kusalananda. Can you elaborate on what the more sophisticated awk command would be? If you can give me a lead, I will try to dig in – Nikhil Jan 07 '21 at 19:31
  • @Nikhil There are many examples of similar issues. See e.g. here, here, here and here (but really, this is a very common issue (performing a relational JOIN between two files) and the solution is always of the same form, unless one could use join as I did in my answer to that last example). – Kusalananda Jan 07 '21 at 19:59

1 Answer

I cannot reproduce what's happening with your code; it looks syntactically fine and it works. But here is what you seem to be trying to do with a shell loop, which you would do better to avoid. Below I used an awk command alone.

gawk 'ARGIND==2{ strings[$0]; next; };
     (substr($0, 2, 5) in strings);
' RS='[[:space:]]+' strings.txt RS='\n' criteria.txt

Here we read all strings from the "strings.txt" file (split on whitespace or newlines, via [[:space:]]+) into an awk associative array (which we name strings).

The ARGIND==2 condition controls the input: "strings.txt" is the second command-line argument (after the first RS=... assignment), so the first block of code runs only for that file and awk skips that block for the following input(s). See here why we preferred this over the more common use of NR==FNR.

When the next file, "criteria.txt", is opened for processing, we set the record separator back to the default newline with RS='\n'. With (substr($0, 2, 5) in strings) we check whether the substring starting at position 2 with a length of 5 characters exists in our array; if it does, the line goes to the output.
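
For systems without gawk, the same idea can probably be sketched in portable awk with the NR==FNR idiom mentioned above. This is only an illustrative variant, not the command used in this answer: it loops over the fields of each line instead of changing RS, the temporary directory is an assumption made for the demo, and the sample data is copied from the input shown in this answer. Keep in mind that NR==FNR is also true for the second file when the first file is empty.

```shell
#!/bin/sh
# Sketch: the same lookup using POSIX awk (no ARGIND, no regex RS).
# The temp directory and file names are for illustration only.
tmpdir=$(mktemp -d)
printf 'CDA01 CDA02\n   CDA03\nCDA03    CDA05 CDA06\n' > "$tmpdir/strings.txt"
printf 'xCDA01 something\nxxCDA02 someotherthing\nCDA01\nvcCDA03 oCDA03\nvCDA05 end\n' > "$tmpdir/criteria.txt"

result=$(awk '
    # First file: store every whitespace-separated field as an array key.
    NR == FNR { for (i = 1; i <= NF; i++) strings[$i]; next }
    # Second file: print lines whose 5 characters from position 2 are a stored key.
    substr($0, 2, 5) in strings
' "$tmpdir/strings.txt" "$tmpdir/criteria.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Looping over fields handles several strings per line without touching RS, so no record separator needs to be reset between the two files.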

Input data: strings.txt:

CDA01 CDA02
   CDA03
CDA03    CDA05 CDA06

criteria.txt:

xCDA01 something
xxCDA02 someotherthing
CDA01
vcCDA03 oCDA03
vCDA05 end

Output:

xCDA01 something
vCDA05 end
αғsнιη