From my understanding, $1 is the first field. But strangely enough, awk '$1=$1' omits extra spaces.
$ echo "$string"
foo foo   bar   bar
$ echo "$string" | awk '$1=$1'
foo foo bar bar
Why is this happening?
When a value is assigned to a field variable (here, the value of $1 is assigned to field $1), awk rebuilds $0 by concatenating the fields with the output field separator (OFS), which defaults to a single space.
The same thing happens in the following cases:
echo -e "foo foo\tbar\t\tbar" | awk '$1=$1'
foo foo bar bar
echo -e "foo foo\tbar\t\tbar" | awk -v OFS=',' '$1=$1'
foo,foo,bar,bar
echo -e "foo foo\tbar\t\tbar" | awk '$3=1'
foo foo 1 bar
For GNU AWK this behavior is documented here:
https://www.gnu.org/software/gawk/manual/html_node/Changing-Fields.html
$1 = $1 # force record to be reconstituted
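As a rough illustration of the rebuild, the sketch below prints $0 before and after the self-assignment inside a single awk program. The helper name show_rebuild and the sample input are made up for this example; printf is used instead of echo -e only for portability.

```shell
# Hypothetical helper: show the record before and after a field assignment.
show_rebuild() {
    printf '%s\n' "$1" | awk '{
        print "before: [" $0 "]"   # record exactly as it was read
        $1 = $1                    # assigning to a field rebuilds $0 using OFS
        print "after:  [" $0 "]"   # fields joined by a single space (default OFS)
    }'
}

show_rebuild 'foo foo   bar   bar'
```

With that input, the "before" line should still contain the runs of spaces, and the "after" line should read [foo foo bar bar].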
awk '$1=$1' is not a reliable way of printing the current record after recompiling it; try echo -e "0\tbar\t\tbar" | awk '$1=$1'. Always do awk '{$1=$1}1' instead, and in general only use an action in a conditional context if you need the result of that action to be evaluated as a condition. The only other thing worth mentioning is that assigning to a field will also remove all leading and/or trailing spaces from the record when you use the default FS.
– Ed Morton
Feb 20 '20 at 15:22
1 is a common AWK trick to print the current record, but it does make the program harder to understand for people unfamiliar with the trick in question.
– Stephen Kitt
Feb 21 '20 at 10:19
echo "$string" | awk '$1=$1' causes AWK to evaluate $1=$1, which assigns the field to itself, and has the side-effect of re-evaluating $0; then AWK considers the value of the expression, and because it’s non-zero and non-empty, it executes the default action, which is to print $0.
The extra spaces are removed when AWK re-evaluates $0: it does so by concatenating all the fields using OFS as a separator, and that’s a single space by default. When AWK parses a record, $0 contains the whole record, as-is, and $1 to $NF contain the fields, without the separators; when any field is assigned to, $0 is reconstructed from the field values.
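To make the split between $0 and the fields visible, here is a small sketch that brackets each field and then the rebuilt record. The helper name show_fields and the padded input are assumptions for illustration; it relies on the default FS, under which leading and trailing blanks do not belong to any field.

```shell
# Hypothetical helper: bracket every field, then the rebuilt record.
show_fields() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++)
            printf "field %d: [%s]\n", i, $i   # fields hold no separator characters
        $1 = $1                                 # rebuild $0 from the fields
        printf "record:  [%s]\n", $0
    }'
}

show_fields '  foo   bar  '
```

This should report two fields, [foo] and [bar], and a rebuilt record of [foo bar]: since only the field values survive the rebuild, the surrounding blanks disappear along with the repeated separators.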
Whether AWK outputs anything in this example depends on the input:
echo "0 0" | awk '$1=$1'
won’t output anything. $1=$1 evaluates to whatever is in the first field, which is 0 in this case; that’s a “false” result in AWK, so nothing happens and nothing is output. To avoid that, turn $1=$1 into an action and make AWK print the current record in all cases:
| awk '{$1=$1}1'
The trailing 1 causes AWK to always run the default action.
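The difference between the two forms can be checked side by side. In this sketch the wrapper names as_pattern and as_action are hypothetical, and the two-line input is invented so that one record starts with 0.

```shell
# Hypothetical wrappers around the two idioms discussed above.
as_pattern() { awk '$1=$1'; }      # prints only when the assigned value is true
as_action()  { awk '{$1=$1}1'; }   # always prints the rebuilt record

printf '0   0\nfoo   bar\n' | as_pattern
printf '0   0\nfoo   bar\n' | as_action
```

The first pipeline should print only "foo bar", silently dropping the record whose first field is 0, while the second should print both "0 0" and "foo bar".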