In my bash script I have a variable that I am trying to use inside a pattern that awk searches for. However, it does not behave the way I expected. I have the following text file (text.txt):

-----------
Task:           a
 (some info)
  ....
------------
Task:           b 
 (some info)
  ....
------------
Task:           c
 (some info)
  ....
------------

My script has the following:

letter=a
awk -v var="$letter" '/Task .* \var/' RS='-+' text.txt

When I do this, however, I get nothing, but if I do the following:

awk '/Task .* a/' RS='-+' text.txt

I get what I expect:

Task:           a
 (some info)
  ....

NOTE: I need to pass it as a variable because I have a loop that is constantly changing the variable, and that's what I am trying to look for. I'd rather use awk since that's what I am most familiar with, but I am not opposed to hearing other suggestions such as sed or grep.

pdm

2 Answers

You could pass the whole pattern to awk:

letter=a
awk -v pattern="Task .* $letter" -v RS='-+' '
    $0 ~ pattern
' text.txt

or construct the pattern as a string in awk:

letter=a
awk -v ltr="$letter" -v RS='-+' '
    BEGIN {pattern = "Task .* " ltr}
    $0 ~ pattern
' text.txt

Since awk variables are not prefixed with $, you can't embed them inside a /regex constant/ -- whatever is in there is just literal text, so /var/ looks for the letters "var", not for the variable's value.
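To make that concrete, here is a minimal sketch. The two input lines are made up for the demonstration and are not from the question's file. It compares a regex constant with a dynamic regex built from the variable:

```shell
# /Task var/ is a regex constant: it matches the literal text "Task var",
# not "Task " followed by the value of the awk variable var.
literal=$(printf 'Task a\nTask var\n' | awk -v var=a '/Task var/')

# $0 ~ ("Task " var) builds the regex from the variable's value at run time.
dynamic=$(printf 'Task a\nTask var\n' | awk -v var=a '$0 ~ ("Task " var)')

echo "regex constant matched: $literal"
echo "dynamic regex matched:  $dynamic"
```

The regex constant picks the line that literally contains "var", while the dynamic regex picks the line ending in "a".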

(It's my preference to put all awk variables at the front with -v)

cuonglm
glenn jackman
  • why not $1 == "Task:" && $2 == ltr ? – Archemar Jul 16 '15 at 13:45
  • That's a good idea. However, if the contents of ltr contain whitespace (or, more generally, a field separator), you can't match it with a single field comparison. – glenn jackman Jul 16 '15 at 13:47
  • One question though: what does the $0 ~ pattern do? – pdm Jul 16 '15 at 13:51
  • @pdm It means the line contains whatever comes after. $0 is the line. ~ is the contains operator. – 123 Jul 16 '15 at 13:56
  • Pedantically, ~ is the regular expression matching operator: string ~ regex -- https://www.gnu.org/software/gawk/manual/html_node/Comparison-Operators.html#Comparison-Operators – glenn jackman Jul 16 '15 at 14:00
  • Ohhhh! That makes sense. Thank you for the clarification. You all have been incredibly helpful. – pdm Jul 16 '15 at 14:00
  • @glennjackman Yeah, I knew that wasn't the real name when writing that, but I couldn't remember it and thought it would do for explanation purposes. Obviously the real name and your link are more useful though. – 123 Jul 16 '15 at 14:03
  • @glennjackman And if I may add this, this is not even an awk specialty, as some people may think... unless you insist on using a very ancient bash, the ~ operator will also work with the test shell command (aka [[ ]]). If more people knew about this, a lot of bash scripts would look way less messy, since the alternate method (i.e. pattern matching) will often make the expression harder to follow than a regular expression would... – syntaxerror Jul 16 '15 at 17:47
  • One correction: test and [ are equivalent, and [[ has some different behaviour. The bash =~ works with [[ but not [ or test. – glenn jackman Jul 16 '15 at 17:55
  • @syntaxerror - one needn't insist on using a very ancient bash to render that statement untrue - there are many shells in which [[ ... =~ ... ]] is invalid syntax - and all of these that I'm aware of are far faster in practically every respect than any version of bash I've ever used. In any case, ~ is the C-language bitwise NOT (and so also performs that function in shell $(( arithmetic expansions ))) and I've always found the portable case construct to be both more useful and more readable than [[ ... =~ ... ]] anyway (especially because it's faster, too). – mikeserv Jul 16 '15 at 18:09
  • @mikeserv Oh, you're right, of course. I was actually referring to the combined operator =~, where the = is followed by a tilde. Good catch of yours (as usual). :) – syntaxerror Jul 16 '15 at 18:12
  • Almost: this won't work with backslashes. cuonglm's answer is the correct one here. – Gilles 'SO- stop being evil' Jul 16 '15 at 22:21
  • What if I wanted to just print the first record separated? How would I modify this command? I tried doing ... 'NR==1($0 ~ pattern)... but this didn't work. – pdm Jul 20 '15 at 14:20
  • If you just want the first one: $0 ~ pattern {print; exit} – glenn jackman Jul 20 '15 at 14:27
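As a rough sketch of that last suggestion: the sample data below is invented (two records that both match), and the pattern 'Task: *a' is an assumption chosen to fit it. Like the commands above, it relies on an awk that accepts a regex RS, such as gawk or mawk:

```shell
# Assumed sample: two dash-separated records, both of which match.
sample='-----
Task:   a
 first
-----
Task:   a
 second
-----'

# Print the first matching record, then stop reading further input.
first=$(printf '%s\n' "$sample" |
    awk -v pattern='Task: *a' -v RS='-+' '$0 ~ pattern {print; exit}')

printf '%s\n' "$first"
```

Because of the exit, the record containing "second" is never printed, even though it matches too.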

Your best choice may be to pass the variable through the environment:

letter=a
p="Task: *$letter" awk -v RS='-+' '$0 ~ ENVIRON["p"]' <file

or:

p="Task: *a" awk -v RS='-+' '$0 ~ ENVIRON["p"]' <file

With -v var=value, awk expands escape sequences in value. If you want to pass data as-is from the shell to awk, -v var="$shell_var" is not reliable.

Using ENVIRON (or ARGV) is more reliable, since awk doesn't expand escape sequences in them.
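A small sketch of the difference. The value a\tb and the variable name s are arbitrary choices for the demonstration. It passes the same string both ways and compares the length awk sees:

```shell
# A value containing a literal backslash followed by "t" (4 characters).
value='a\tb'

# -v processes escape sequences: \t turns into a single tab character.
via_v=$(awk -v s="$value" 'BEGIN { print length(s) }')

# ENVIRON hands the string over untouched: backslash and "t" both survive.
via_env=$(s="$value" awk 'BEGIN { print length(ENVIRON["s"]) }')

echo "-v: $via_v  ENVIRON: $via_env"   # prints "-v: 3  ENVIRON: 4"
```

The same thing happens to backslashes inside a regex passed with -v, which is why a pattern such as \. may not arrive in awk the way it was written in the shell.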

cuonglm