First, a disclaimer: please don't parse the output of find. The code below is for illustration only: it shows how to incorporate command substitution into an Awk script in such a way that the commands can act upon pieces of Awk's input.
To actually do a line count (wc -l) on each file found with find (which is the example use case), just use:
find . -type f -name '*txt' -exec wc -l {} +
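As a rough, self-contained way to try that command, the sketch below builds a throwaway directory first. The directory and file names (demo_dir, a.txt, "b c.txt") are made up for the illustration and are not part of the question.

```shell
# Throwaway directory so the example does not depend on existing files.
demo_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$demo_dir/a.txt"
printf 'one\n' > "$demo_dir/b c.txt"      # a name with a space, on purpose

# -exec ... {} + hands the names straight to wc, so nothing has to parse
# the output of find and the space in the second name is harmless.
counts=$(cd "$demo_dir" && find . -type f -name '*txt' -exec wc -l {} +)
printf '%s\n' "$counts"

rm -rf "$demo_dir"
```

Because wc receives more than one file here, it also prints a summary line after the per-file counts.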
However, to answer your questions as asked:
Q1
To answer your Q1:
Q1: is there a way to perform command substitution inside awk?
Of course there is a way. From man awk:
command | getline [var]
Run command piping the output either into $0 or var, as above, and RT.
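A minimal sketch of that construct on its own, before any filenames are involved. The command string is just a stand-in, and the variable names (cmd, line, greeting) are chosen for the example.

```shell
# "echo hello from the shell" is a stand-in; any shell command line works.
greeting=$(awk 'BEGIN {
    cmd = "echo hello from the shell"
    # getline returns 1 when it read a line, 0 at end of output, -1 on error.
    if ((cmd | getline line) > 0)
        print line
    close(cmd)
}')
printf '%s\n' "$greeting"
```

Checking the return value of getline is optional here, but it avoids printing a stale or empty variable when the command produces no output.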
So (watch the quoting!):
find . | awk '/txt$/{"wc -l <\"" $NF "\"|cut -f1" | getline(nl); print(nl)}'
Please note that the string built, and therefore the command executed, is
wc -l <file
with the file redirected into wc, so that wc does not print the filename.
That one-liner skips the close() that the command needs (harmless for a couple of files, but technically incorrect). You actually need to do:
find . | awk '/txt$/{
comm="wc -l <\"" $NF "\" | cut -f1"
comm | getline nl;
close (comm);
print nl
}'
That works for older awk versions also.
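To see what close() changes, here is a small sketch that runs the same command string more than once. It is only an illustration: "echo 42" stands in for the wc command, and the variable names are made up. Without close(), awk keeps the pipe open under that command string, so a second getline on it finds the end of the output instead of running the command again. With many different command strings, the open pipes simply pile up.

```shell
result=$(awk 'BEGIN {
    cmd = "echo 42"
    cmd | getline first               # runs the command, reads "42"
    status = (cmd | getline second)   # same open pipe, already at end: returns 0
    close(cmd)                        # forget the pipe
    cmd | getline third               # the command runs again, reads "42"
    close(cmd)
    printf "%s|%s|%s|%s\n", first, status, second, third
}')
printf '%s\n' "$result"
```

This prints 42|0||42: the second read returned 0 and left its variable empty, while the read after close() got fresh output.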
Remember to keep find . from printing the dot (.) entry: it makes the code fail, because the dot is a directory and wc cannot count lines in it.
Alternatively, skip the dot entry inside awk:
find . | awk '/txt$/ && $NF!="." { comm="wc -l <\"" $NF "\" | cut -f1"
comm | getline nl;
close (comm);
print nl
}'
You can convert that to a one-liner, but it will look quite ugly, methinks.
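On the quoting warning: the double quotes built into the command string still let the shell expand $, backquotes and embedded double quotes in a filename. One possible hardening, sketched here under assumptions, is to wrap the name in single quotes from inside awk. The helper name shquote is hypothetical (it is not an awk builtin), the demo filename is made up, and the sketch uses $0 instead of $NF so that names with spaces stay whole. Names containing newlines still break it, which is one more reason not to parse find's output.

```shell
demo_dir=$(mktemp -d)
# A deliberately awkward name: space, single quote, double quotes, dollar sign.
printf 'a\nb\nc\n' > "$demo_dir/it's \"odd\" \$HOME.txt"

count=$(cd "$demo_dir" && find . -type f -name '*txt' | awk '
# shquote: hypothetical helper, wraps s in single quotes for the shell.
# \047 is the octal escape for a single quote.
function shquote(s,    n, parts, i, out) {
    n = split(s, parts, "\047")
    out = parts[1]
    for (i = 2; i <= n; i++)
        out = out "\047\\\047\047" parts[i]   # quote, backslash, quote, quote
    return "\047" out "\047"
}
/txt$/ {
    comm = "wc -l < " shquote($0)   # $0, not $NF, so names with spaces stay whole
    if ((comm | getline nl) > 0)
        print nl + 0                # + 0 drops any padding wc adds
    close(comm)
}')
printf '%s\n' "$count"

rm -rf "$demo_dir"
```

The split-and-join loop is used instead of gsub() only to sidestep the backslash rules of gsub replacement strings.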
Q2
As for your second question:
Q2: why is the first incantation above silently failing and is simply printing the filenames instead?
Because awk does not parse shell commands: it reads the text as an ordinary awk expression. It understands the command as:
nl = $(wc -l $NF)
nl --> variable
$ --> pointer to a field
wc --> variable (that has zero value here)
- --> minus sign
l --> variable (that has a null string)
$ --> Pointer to a field
NF --> Last field
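Then wc -l is evaluated first, because subtraction binds more tightly than concatenation in awk: both variables are unset, so the result is 0. That 0 is then concatenated with the text inside the last field (the name of a file), and the numeric value of the resulting string (for example 0./file.txt) is 0.
For awk, it becomes:
nl = $( wc -l $NF)
nl = $ ( (0 - 0) $NF )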
Which becomes just $0, the whole input line, which is (for the simple find above) only the file name.
So, all the above will only print the filename (well, technically, the whole line).
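This is easy to check with a one-line input. In the sketch below, ./notes.txt is a made-up stand-in for one line of find output. The second command prints the value awk computes inside the parentheses, which awk then converts to the number 0 when it uses it as a field index.

```shell
# The expression from the question, fed one line shaped like find output.
out=$(printf './notes.txt\n' | awk '{ nl = $(wc -l $NF); print nl }')

# The same parenthesized expression on its own, to see the field index.
inner=$(printf './notes.txt\n' | awk '{ idx = wc -l $NF; print idx }')

printf '%s\n%s\n' "$out" "$inner"
```

The first line of output is the filename itself, and no wc command is ever run.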
… for loop example: Why is looping over find's output bad practice? – Wildcard Jul 27 '17 at 22:02
… wc -l $f without quoting "$f". – Wildcard Jul 27 '17 at 22:02
… for loop can be solved with just find . -type f -name '*txt' -exec wc -l {} + – Wildcard Jul 27 '17 at 22:46
… wc -l **/*txt with globstar on in bash, and similar constructs in some other shells, if the combined filenames don't exceed ARG_MAX. – dave_thompson_085 Jul 28 '17 at 08:10