3

I am trying to grab 100 lines after the text "time: X" with X in {0,40,80,...,200}. Here is what I have so far:

#!/bin/bash
start=1
end=5
for i in $(seq $start $end);do 
  j=$(($i*40))
  awk '/time: $j/{for(i=1;i<=100;i++){getline;print}}' file > fileX-$j.txt
done 

However this doesn't seem to work. My question specifically is about variable $j and how I need to define it right after '/time: ...'

For example, I have a file named 'file':

time: 1
1 2 3 
1 33 1 
2 31 4
time: 40
2 1 3 
9 8 77
1 3 4

I'd like to make 2 separate files in this case; first one containing

1 2 3 
1 33 1 
2 31 4

and second one with:

2 1 3 
9 8 77
1 3 4

I tried passing $j as a variable, as mazs mentioned, but it still gives me empty files. Here is how I did it:

awk -v jj=$j '/time: jj/{for(i=1;i<3;i++){getline;print}}' file > fileX-$j.txt
Lucas
  • You don't do it like that: awk has a command line option for passing variables - see Use a script parameter in awk – steeldriver Apr 07 '16 at 13:26
  • Please edit your question and show us an example of your input and your desired output. The main issue is that $j is not expanded in the awk script, but we could give a better solution if you explain what you are trying to do. This sort of thing is rarely a good match for bash. – terdon Apr 07 '16 at 13:30

2 Answers

3

There are two problems. The first is that the shell doesn't expand $j inside single quotes: '$j' tells the shell that you want the string $j, not the value of the variable j.
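The quoting difference is easy to check in isolation. A minimal sketch, assuming j=40 as an example value (the variable names single and double are only for illustration):

```shell
j=40
# Single quotes: the shell passes the characters $j through unchanged.
single='time: $j'
# Double quotes: the shell substitutes the value of j.
double="time: $j"
printf '%s\n' "$single" "$double"
```

The first line printed is the literal text time: $j, which is what awk receives in the original script; the second is time: 40.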

In this case, because the value only contains digits, you could put it outside of single quotes:

awk '/time: '"$j"'/{for(i=1;i<=100;i++){getline;print}}' file > fileX-"$j".txt

Note that if the value of j contained regexp special characters (., *, etc.) then those characters would be interpreted as such. For example

j='2*3'
awk '/foo '"$j"' bar/'

the script would print lines containing things like foo 3 bar, foo 23 bar, foo 223 bar, etc. and not foo 2*3 bar. And if there was a / in the value then awk would see the end of the regex matching construct; for example

j='2/3'
awk '/foo '"$j"' bar/'

would result in awk complaining that the sequence of tokens /foo 2/, 3, bar, / is not syntactically correct.

You can define a variable for awk with the -v command line option:

j='a\tb'
awk -v j="$j" '{print j}'

Note that this performs backslash expansion on the value of j. For example, the snippet above replaces each line by a↦b where ↦ is a tab character.

But that doesn't directly work in your case, because awk doesn't expand variables inside /…/: /foo/ matches the string foo, not the value of the variable foo. To use a variable in a regular expression match, you would need to use the match function:

awk -v j="$j" 'match($0, "time: " j) {for(i=1;i<=100;i++){getline;print}}' file > fileX-"$j".txt

This one works for values of j that don't contain a backslash; a slash is ok. For example, with j set to a/b*c, this would match lines like time: a/c, time: a/bc, etc. With j set to \t, this would match lines containing time: followed by a space and a tab.
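Putting this together for the sample data in the question, here is a minimal sketch of the whole loop. It makes a few assumptions that are not in the original: it keeps 3 lines per block instead of 100, it writes to a scratch directory, it anchors the pattern with $ so that time: 40 does not also match time: 400, and it swaps the getline loop for a countdown counter (the name left is arbitrary), which is an alternative way to print the following lines.

```shell
# Sample input from the question, written to a scratch directory.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' 'time: 1' '1 2 3' '1 33 1' '2 31 4' \
              'time: 40' '2 1 3' '9 8 77' '1 3 4' > file

n=3   # lines to keep after each header (100 in the question)
for j in 1 40; do
  # "time: " j "$" concatenates the pieces into a dynamic regex; the
  # trailing $ anchors the match at the end of the line.
  awk -v j="$j" -v n="$n" '
    left > 0 { print; left-- }                 # still inside a wanted block
    $0 ~ ("time: " j "$") { left = n }         # header found: start counting
  ' file > "fileX-$j.txt"
done
cat fileX-40.txt
```

With the sample input, fileX-1.txt should hold the three lines after time: 1 and fileX-40.txt the three lines after time: 40. The counter version also stops cleanly at end of file, whereas a getline loop would keep printing the last line read if fewer than n lines remain.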

To pass the value of a shell variable to awk, no matter what the value is, pass it through the environment.

export j
awk 'match($0, "time: " ENVIRON["j"]) {for(i=1;i<=100;i++){getline;print}}' file > fileX-"$j".txt

or, to avoid having j stay in the environment for the rest of the script:

j="$j" awk 'match($0, "time: " ENVIRON["j"]) {for(i=1;i<=100;i++){getline;print}}' file > fileX-"$j".txt
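The difference between -v and the environment can be checked directly. A minimal sketch, reusing the sample value a\tb from above and comparing string lengths (the names via_v and via_env are only for illustration):

```shell
j='a\tb'
# -v processes escape sequences: \t becomes a single tab character.
via_v=$(awk -v j="$j" 'BEGIN { print length(j) }')
# ENVIRON hands the string over untouched: the backslash and the t both survive.
via_env=$(j="$j" awk 'BEGIN { print length(ENVIRON["j"]) }')
printf '%s\n' "with -v: $via_v characters; with ENVIRON: $via_env characters"
```

The -v version should report 3 characters (a, tab, b) and the ENVIRON version 4, since the backslash is kept as is.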

And if you wanted to search for a literal string, rather than for a regular expression, you could use the index function instead of match. For example

j='a*b'
export j
awk 'index($0, "time: " ENVIRON["j"])'

prints lines containing time: a*b.
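To see how index and match differ on the same value, here is a small sketch with three made-up input lines (the variable names input, literal and regex are only for illustration):

```shell
j='a*b'
input=$(printf '%s\n' 'time: a*b' 'time: aab' 'time: b')
# index() searches for the literal text "time: a*b".
literal=$(printf '%s\n' "$input" | j="$j" awk 'index($0, "time: " ENVIRON["j"])')
# match() treats a*b as a regex: zero or more "a" followed by "b".
regex=$(printf '%s\n' "$input" | j="$j" awk 'match($0, "time: " ENVIRON["j"])')
printf 'index:\n%s\nmatch:\n%s\n' "$literal" "$regex"
```

Here index should select only time: a*b, while match should select time: aab and time: b but not the line with the literal asterisk.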

-1

You have to pass the $j shell variable to awk:

awk -v jj="$j" '...'

Note that this assumes that the value of the variable does not contain backslashes, as the argument to awk -v undergoes backslash expansion.

magor