78

I have the following in one of my shell functions:

function _process () {
  awk -v l="$line" '
  BEGIN {p=0}
  /'"$1"'/ {p=1}
  END{ if(p) print l >> "outfile.txt" }
  '
}

When it is called as _process $arg, $arg gets passed as $1 and used as the search pattern. It works because the shell expands $1 in place of the awk pattern. Likewise, l can be used inside the awk program because it is declared with -v l="$line". All fine.

Is it possible to pass the search pattern as a variable in the same manner?

The following will not work,

awk -v l="$line" -v search="$pattern" '
  BEGIN {p=0}
  /search/ {p=1}
  END{ if(p) print l >> "outfile.txt" }
  '

because awk treats /search/ as the literal regex "search" rather than as the variable search.

branquito
  • 1
    What you're searching for is not text that matches a "pattern", it's text that matches either a string or a regular expression. See how-do-i-find-the-text-that-matches-a-pattern for why that matters and why you shouldn't use the word "pattern" in this context. – Ed Morton May 26 '21 at 15:23
  • 1
    See also how-do-i-use-shell-variables-in-an-awk-script for a comprehensive answer to the question of how to pass the value of shell variables or other values to an awk script. – Ed Morton May 26 '21 at 16:05
  • 1
    I've been reading a lot of info about awk and passing variables, etc. topics so apologies if I am a bit confused, but, although above answers are very good, IMO, fail to address one of OPs questions, whether it is possible or not to use "/search/" as variable. To me it sounds it is not possible, but I fail to see why or where is this stated – Kiteloopdesign Aug 10 '22 at 09:55

5 Answers

57

Use awk's ~ operator, and you don't need to provide a literal regex on the right-hand side:

function _process () {
    awk -v l="$line" -v pattern="$1" '
        $0 ~ pattern {p=1; exit} 
        END {if(p) print l >> "outfile.txt"}
    '  
}

This calls exit upon the first match, as we don't need to read the rest of the input. You don't even need awk: grep would be enough, is likely more efficient, and avoids the problem of awk's -v var='value' doing backslash processing:

function _process () {
    grep -qe "$1" && printf '%s\n' "$line"
}

Depending on the pattern, you may want grep -Eqe "$1" (extended regular expressions, which is what awk uses).
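A minimal sketch of the dynamic-regexp form on its own, reporting the result through the exit status instead of writing to a file. The helper name matches_any is hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any input line matches the regexp in $1.
matches_any() {
    awk -v re="$1" '
        $0 ~ re { found = 1; exit }   # dynamic regexp: re is a variable, no slashes
        END { exit !found }           # exit status 0 on a match, 1 otherwise
    '
}

printf 'alpha\nbeta\n' | matches_any '^be' && echo "matched"
printf 'alpha\nbeta\n' | matches_any '^zz' || echo "no match"
```

As with any -v assignment, backslashes in the regexp go through awk's escape processing before the match.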

glenn jackman
  • This is exactly what solves this in a way I wanted (1st example), because it keeps the semantics, which was my goal. Thanks. – branquito Mar 21 '14 at 15:30
  • 1
    I didn't note the removal of the BEGIN block: an unassigned variable is treated as 0 in a numeric context or the empty string otherwise. So, an unassigned variable will be false in if (p) ... – glenn jackman Mar 21 '14 at 15:35
  • yes I noticed, it needs to be set on BEGIN block to zero each time, as it serves as a switch. But interestingly I tried now script using $0 ~ pattern, and it does not work, however with /'"$1"'/ it does work!? :O – branquito Mar 21 '14 at 15:42
  • maybe it has something to do with the way $line is retrieved, pattern search is done on the output of whois $line, $line coming from file in a WHILE DO block. – branquito Mar 21 '14 at 15:53
  • Please show the contents of $line -- do it in your question for proper formatting. – glenn jackman Mar 21 '14 at 15:57
  • Don't write /$0 ~ search/ -- leave out the slashes: $0 ~ search – glenn jackman Mar 21 '14 at 16:15
  • It seems that this: awk '$0 ~ /foo/ { print $0 }' is actually equivalent to this: awk -v pattern=foo '$0 ~ pattern { print $0 }'. In other words, the // delimiters are not needed at all anymore, because pattern becomes a dynamic regexp. Is that right? – XXX Mar 21 '16 at 07:05
  • Yes that's right. – glenn jackman Mar 21 '16 at 10:14
22
awk  -v pattern="$1" '$0 ~ pattern'

This has an issue: awk expands the ANSI C escape sequences (like \n for newline, \f for form feed, \\ for backslash and so on) in $1. That becomes a problem if $1 contains backslash characters, which are common in regular expressions (with GNU awk 4.2 or above, values that start with @/ and end in / are also a problem). Another approach that doesn't suffer from that issue is to write it:

PATTERN=$1 awk '$0 ~ ENVIRON["PATTERN"]'

How bad the -v problem is depends on the awk implementation:

$ nawk -v 'a=\.' 'BEGIN {print a}'
.
$ mawk -v 'a=\.' 'BEGIN {print a}'
\.
$ busybox awk -v 'a=\.' 'BEGIN {print a}'
.
$ gawk -v 'a=\.' 'BEGIN {print a}'
gawk: warning: escape sequence `\.' treated as plain `.'
.
$ gawk5.0.1 -v 'a=@/foo/' 'BEGIN {print a}'
foo

All awks work the same for valid escape sequences though:

$ a='\\-\b' awk 'BEGIN {print ENVIRON["a"]}' | od -vtx1 -tc
0000000  5c  5c  2d  5c  62  0a
          \   \   -   \   b  \n
0000006

(content of $a passed as-is)

$ awk -v a='\\-\b' 'BEGIN {print a}' | od -vtx1 -tc
0000000  5c  2d  08  0a
          \   -  \b  \n
0000004

(\\ changed to \ and \b changed to a backspace character).
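To see the effect on matching rather than on printing, here is a small sketch that feeds the same regexp through both mechanisms. The function names via_v and via_environ and the variable RE are made up for the illustration.

```shell
#!/bin/sh
# Regexp meant to match a backslash followed by any character.
re='\\.'

# -v runs escape processing first: awk receives \. (a literal dot).
via_v() { awk -v re="$re" '$0 ~ re'; }

# ENVIRON passes the string untouched: awk receives \\. as written.
via_environ() { RE="$re" awk '$0 ~ ENVIRON["RE"]'; }

printf '%s\n' 'a\b' 'a.b' | via_v        # prints: a.b
printf '%s\n' 'a\b' 'a.b' | via_environ  # prints: a\b
```

Only \\ is used here because it is a valid escape sequence, so the result should not depend on the awk implementation.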

  • So you are saying that if pattern was for example \d{3} to find three digits, that wouldn't work as expected, if I understood you well? – branquito Mar 21 '14 at 16:24
  • 2
    For \d, which is not a valid C escape sequence, that depends on your awk implementation (run awk -v 'a=\d{3}' 'BEGIN{print a}' to check). But for \\ or \b, yes definitely. (BTW, I don't know of any awk implementation that understands \d as meaning a digit). – Stéphane Chazelas Mar 21 '14 at 16:30
  • it says: awk: warning: escape sequence `\d' treated as plain `d', and prints d{3}, so I guess I would have a problem in this case? – branquito Mar 21 '14 at 16:34
  • so to sum up, I am safe with the ENVIRON["pattern"] approach? – branquito Mar 21 '14 at 16:42
  • Yes. see there for more reading. – Stéphane Chazelas Mar 21 '14 at 16:43
  • Ok, again not working.. when I replace $0 ~ pattern for $0 ~ ENVIRON["pattern"], I get matches everywhere, so all my lines from infile get copied over to outfile. First one was working ok (one with $0 ~ pattern) – branquito Mar 21 '14 at 16:51
  • 1
    Sorry, my bad, I had a typo in my answer. The name of the environment variable has to match: ENVIRON["PATTERN"] for the PATTERN environment variable. If you want to use a shell variable, you need to export it first (export variable) or use the ENV=VALUE awk '...ENVIRON["ENV"]' env-var passing syntax as in my answer. – Stéphane Chazelas Mar 21 '14 at 17:03
  • it works if PATTERN set each time while looping in front of awk, if I put PATTERN="$1" before the loop it does not work.. why? – branquito Mar 21 '14 at 17:14
  • 2
    Because you need to export a shell variable for it to be passed in the environment to a command. – Stéphane Chazelas Mar 21 '14 at 17:20
  • Even in same shell script, didn't know that, it works now, just changed PATTERN="$1" to export PATTERN="$1". Maybe because awk was sent input through the pipe.. Anyway learned a lot! – branquito Mar 21 '14 at 17:24
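The export point from these comments can be checked with a short sketch. It also shows why a misspelled or unexported name "matches everywhere": the ENVIRON lookup yields an empty string, and an empty dynamic regexp matches every line. NO_SUCH_NAME is a hypothetical name assumed to be unset.

```shell
#!/bin/sh
unset PATTERN                 # start clean for the demonstration
PATTERN='foo'                 # plain shell variable, not exported

# awk does not see it: ENVIRON["PATTERN"] is empty here.
awk 'BEGIN { print "[" ENVIRON["PATTERN"] "]" }'    # prints: []

export PATTERN                # now part of the environment of child processes
awk 'BEGIN { print "[" ENVIRON["PATTERN"] "]" }'    # prints: [foo]

# An empty dynamic regexp matches every line, so both lines are printed.
printf 'one\ntwo\n' | awk '$0 ~ ENVIRON["NO_SUCH_NAME"]'
```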
5

Try something like:

awk -v l="$line" -v search="$pattern" 'BEGIN {p=0}; { if ( match( $0, search )) {p=1}}; END{ if(p) print l >> "outfile.txt" }'
  • If this behaves same as /regex/ in terms of finding pattern, this could be a nice solution. I will try. – branquito Mar 21 '14 at 15:22
  • 1
    The quick tests I ran seemed to work the same, but I won't even begin to guarantee it... :) – Hunter Eidson Mar 21 '14 at 15:24
  • But what if $pattern (and therefore search) contains something that can be treated as a part of regex, like [? And I want to search for a literal match? – A S Nov 28 '21 at 04:40
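For the literal-match case raised in the last comment, one possible sketch uses awk's index() function, which does a plain substring search rather than a regexp match. The helper name contains_literal and the variable NEEDLE are hypothetical; the string goes through the environment so that backslashes are not processed.

```shell
#!/bin/sh
# Hypothetical helper: print the input lines that contain $1 as a literal string.
contains_literal() {
    # index() returns the position of the substring, or 0 if it is absent.
    NEEDLE="$1" awk 'index($0, ENVIRON["NEEDLE"])'
}

printf '%s\n' 'a[1]' 'a1' | contains_literal '[1'   # prints: a[1]
```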
-1

No, but you can simply interpolate the pattern into the double-quoted string you pass to awk:

awk -v l="$line" "BEGIN {p=0}; /$pattern/ {p=1}; END{ if(p) print l >> \"outfile.txt\" }"

Note that you now have to escape the double-quoted awk literal, but it is still the simplest way of accomplishing this.

  • Is this way safe if $pattern contains spaces? My example from above will work as $1 is protected with "$1" double quotes, however I'm not sure what happens in your case. – branquito Mar 21 '14 at 15:14
  • 2
    Your original example ends the single-quoted string at the second ', then protects the $1 via double quotes and then tacks another single-quoted string for the second half of the awk program. If I understand correctly, this should have exactly the same effect as protecting the $1 via the outer single quotes - awk never sees the double quotes that you put around it. – Kilian Foth Mar 21 '14 at 15:26
  • 5
    But if $pattern contains ^/ {system("rm -rf /")};, then you're in big trouble. – Stéphane Chazelas Mar 21 '14 at 16:17
  • is that downside of this approach only, having all wrapped in "" ? – branquito Mar 21 '14 at 16:27
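The risk described in the comments above can be reproduced with a harmless payload. In this sketch the value of pattern is made up for the demonstration; it closes the regexp early and adds a rule of its own.

```shell
#!/bin/sh
# Made-up hostile value: ends the regexp, then supplies its own awk action.
pattern='x/ { print "injected" }; /y'

# After shell expansion the awk program reads:
#   /x/ { print "injected" }; /y/
# so the text from $pattern is executed as awk code.
printf 'x\n' | awk "/$pattern/"     # prints: injected
```

The same applies to the /'"$1"'/ form in the question, since the shell splices the value into the program text there too.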
-3

You could use eval, which in this example makes the shell expand the nets variable before awk is run.

nets="searchtext"
eval "awk '/"${nets}"/'" file.txt
Noxy