I'm trying to match filenames that contain two text patterns, but the matching should ignore case. Neither of the following commands works:
Setting the awk variable 'IGNORECASE' to a nonzero value (as recommended in info awk) so that all regular expression and string operations ignore case, and then building a logical "and" operation using two regular expressions, prints all files:
$ ls -R | awk 'IGNORECASE = 1;/bingo/ && /number/;'
I tried converting the data to lowercase before using lookaheads (I know the second lookahead is not needed) to match both of the text patterns "bingo" and "number". However, awk does not print any output, even though printing the matching line is its default action:
$ ls -R | awk 'tolower($0) ~ /(?=.*bingo)(?=.*number)/'
Which part of the awk or regular expression syntax is wrong (or what is missing) and what is the correct way to do a case-independent search that is only successful when the additional pattern appears on the same line?
Update:
From running
$ ls -R | awk '/bingo/'
it seems that awk may be performing the match against the lines in each file in the output of ls -R, due to filenames not containing the string constant "bingo" being matched by awk. If this is the case, how do you get awk to have the same behavior as grep when receiving output from (i.e. sent through) a pipe?
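As a hedged illustration of the point raised in the update: when awk is given no file operands it filters the lines arriving on the pipe, just as grep does, and does not open the files named on those lines. Below is a minimal portable sketch, assuming made-up filenames and the hypothetical helper names sample and match_both. It lowercases each line with tolower() and tests the two substrings separately instead of relying on lookaheads:

```shell
# Hypothetical sample input standing in for the output of `ls -R`.
sample() {
  printf '%s\n' 'Bingo_Number_01.txt' 'bingo.txt' 'NUMBER.txt' 'baz.txt'
}

# Portable across awk implementations: lowercase the line once,
# then require both substrings. The bare pattern uses awk's default
# action, which prints the original (not lowercased) line.
match_both() {
  awk '{ line = tolower($0) } line ~ /bingo/ && line ~ /number/'
}

sample | match_both
```

With this sample input only Bingo_Number_01.txt is printed, which is the same result as piping through grep -i bingo | grep -i number.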
IGNORECASE is only supported by GNU awk (gawk) -- so if you have that, use gawk instead of awk. Lookaheads/behinds are not supported by any awk implementation. – Aug 04 '19 at 18:22
ls? – Cyrus Aug 04 '19 at 18:23
find . -iname '*bingo*' -iname '*number*' would be more suitable? – steeldriver Aug 04 '19 at 18:25
awk does not support lookaheads or lookbehinds, thanks. I tried replacing awk with gawk in the first code example but gawk still lists all files – bit Aug 04 '19 at 19:39
awk is searching each file for the text pattern instead of searching the output from ls like grep does? – bit Aug 04 '19 at 19:43
You are using IGNORECASE=1 as a pattern, which will be always true (same as a simple 1;). Use gawk -vIGNORECASE=1 '/bingo/ && /number/' or put the IGNORECASE=1 assignment in a BEGIN block. – Aug 04 '19 at 19:54
ls -R | awk 'BEGIN {IGNORECASE=1} /bingo/ && /number/' and ls -R | awk -vIGNORECASE=1 '/bingo/ && /number/' both work, so it seems that regular awk also supports IGNORECASE. Do you know why your version, i.e. ls -R | gawk -vIGNORECASE=1 '/foo/ || /bar/', also prints filenames that do not contain foo or bar? E.g. baz.txt is also printed. Maybe you can turn your comment into a question. – bit Aug 04 '19 at 20:05
awk doesn't support it. It's probably that the awk on your system is actually gawk -- awk --version will tell you if that's the case. That cannot be assumed even on a per-distro basis -- on debian the user can change it with update-alternatives --set awk /usr/bin/gawk. – Aug 04 '19 at 20:12
(find) yes it does: from man find: Where an operator is missing, -a is assumed. (-a being logical AND) – steeldriver Aug 04 '19 at 21:12
|| instead of && was an error in a previous version of the comment, but even with ||, touch baz.txt; ls -R | gawk -vIGNORECASE=1 '/foo/ || /bar/' will not print the baz.txt file. – Aug 04 '19 at 22:54
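
To make the comments above concrete, here is a minimal sketch of the broken command next to the two fixes they suggest. The sample filenames and the helper names sample, broken, fixed_begin and fixed_v are assumptions made up for illustration, and the two fixed variants assume GNU awk is installed as gawk:

```shell
# Hypothetical sample input standing in for the output of `ls -R`.
sample() {
  printf '%s\n' 'Bingo_Number_01.txt' 'bingo.txt' 'baz.txt'
}

# Broken: `IGNORECASE = 1` sits in pattern position. The assignment
# evaluates to 1 (true), so this rule prints every input line; any line
# that also satisfies the second rule is then printed a second time.
broken() {
  awk 'IGNORECASE = 1; /bingo/ && /number/'
}

# Fix 1 (GNU awk only): set IGNORECASE in a BEGIN block, before input is read.
fixed_begin() {
  gawk 'BEGIN { IGNORECASE = 1 } /bingo/ && /number/'
}

# Fix 2 (GNU awk only): set IGNORECASE with a command-line assignment.
fixed_v() {
  gawk -v IGNORECASE=1 '/bingo/ && /number/'
}

# Demonstrates the reported symptom: baz.txt appears in the output.
sample | broken
```

Running sample | fixed_begin or sample | fixed_v under gawk should print only Bingo_Number_01.txt. In awk implementations other than gawk, IGNORECASE is an ordinary variable with no effect on matching, so a tolower() comparison is the portable alternative.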