I'm trying to include some awk commands in a bash script, and I've run across some unexpected behavior. Can you give me a clue as to what I'm overlooking?

For example, given a file named list:

1
2
3
4

This simple awk command does what you'd expect:

$ awk -F, '{ print $1 }' list
1
2
3
4

But if I put this command in a bash script:

#!/bin/bash
echo "list:"
cat $1

echo "list after awk:"
echo `awk -F, '{ print $1 }' $1`

exit 0

I get this output:

$ ./script list
list:
1
2
3
4
list after awk:
1 2 3 4

The awk in the bash script has mysteriously stripped away the newlines.

I've seen this behavior in bash on OS X as well as in zsh on BSD.

Any ideas?

plamtrue

3 Answers

4

It's not awk, it's in how the shell expands things.

Let's take an example:

$ a="1
> 2
> 3
> 4"

So we've created a variable whose value spans 4 lines. But...

$ echo $a
1 2 3 4

So why is it only on one line?

$ echo "$a"
1
2
3
4

Ah, it isn't!

So there's something magic around echo $a.

We can see some similar magic with spaces:

$ a="1    2    3    4"
$ echo $a
1 2 3 4
$ echo "$a"
1    2    3    4

Now it's not really echo that's doing the magic here; it's the shell. Without the "..." wrapper, the shell applies word splitting and pathname expansion (globbing) to the result of expanding the variable. So globs would be expanded:

$ ls
a  file

$ a="*"
$ echo $a
a file
$ echo "$a"
*

In the same way, the shell takes your awk output with its newlines, splits it into words at the whitespace, and passes each word to echo as a separate argument. echo then prints its arguments joined by single spaces, so the newlines are lost.
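One way to see the splitting for yourself is to count the arguments a command actually receives. This is a minimal sketch, assuming bash; count_args is a hypothetical helper made up for illustration, not part of the original script:

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

a="1
2
3
4"

# Unquoted: the shell splits the value at the newlines into separate words.
unquoted=$(count_args $a)

# Quoted: the value is passed as one argument with the newlines intact.
quoted=$(count_args "$a")

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

The unquoted call should report 4 arguments and the quoted call 1, which is why echo prints everything on one line in the first case.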

In summary: if you want to prevent this sort of expansion, use "...".

In your short example, though, you don't need echo at all. Just call awk directly.
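As a rough sketch of both fixes, assuming bash: the temporary file below is a stand-in for the list file from the question, and the variable names list and captured are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data standing in for the "list" file from the question.
list=$(mktemp)
printf '%s\n' 1 2 3 4 > "$list"

echo "list after awk (called directly):"
# Option 1: call awk directly; its output is printed as-is, one value per line.
awk -F, '{ print $1 }' "$list"

echo "list after awk (captured, then quoted):"
# Option 2: capture the output, then quote the expansion when printing it.
captured=$(awk -F, '{ print $1 }' "$list")
echo "$captured"

rm -f "$list"
```

Calling awk directly is the simpler choice when the output is only being printed; capturing is useful when the script needs the value later, as long as every expansion of it is quoted.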

2

Observe the difference in output between these two commands:

$ echo `awk -F, '{ print $1 }' list`
1 2 3 4
$ echo "`awk -F, '{ print $1 }' list`"
1
2
3
4

In the first command, because the command substitution is unquoted, the output from awk is subjected to word splitting and pathname expansion. Word splitting breaks the output at the newlines into separate words, and echo prints those words joined by single spaces. In the second, because the command substitution is in double quotes, no word splitting is performed and the newlines are retained.
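Pathname expansion on an unquoted command substitution can be surprising too. The sketch below assumes bash and works in a scratch directory containing two made-up files, a and file; printf stands in for any command whose output happens to contain a glob character:

```shell
#!/bin/bash
# Work in a scratch directory so the glob has known files to match.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a file

# Unquoted: the "*" printed by the inner command is expanded to the file names.
unquoted=$(echo $(printf '*\n'))

# Quoted: the "*" is passed to echo literally.
quoted=$(echo "$(printf '*\n')")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

cd - > /dev/null
rm -rf "$tmpdir"
```

Here the unquoted form should print the two file names while the quoted form prints a literal asterisk.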

Documentation

From man bash:

Word Splitting

The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.

The shell treats each character of IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then sequences of <space>, <tab>, and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words. If IFS has a value other than the default, then sequences of the whitespace characters space and tab are ignored at the beginning and end of the word, as long as the whitespace character is in the value of IFS (an IFS whitespace character). Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field. A sequence of IFS whitespace characters is also treated as a delimiter. If the value of IFS is null, no word splitting occurs.

Explicit null arguments ("" or '') are retained. Unquoted implicit null arguments, resulting from the expansion of parameters that have no values, are removed. If a parameter with no value is expanded within double quotes, a null argument results and is retained.

Note that if no expansion occurs, no splitting is performed.
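To illustrate how the value of IFS changes the result, here is a small sketch, assuming bash. count_args and the sample strings are hypothetical, made up for this example:

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

out="1
2
3
4"

# Default IFS (space, tab, newline): the newlines delimit four words.
default_count=$(count_args $out)

# Null IFS: no word splitting occurs, so the value stays one word.
IFS=''
null_count=$(count_args $out)
unset IFS   # an unset IFS behaves like the default again

# Custom IFS: a comma now delimits the fields.
csv="a,b,c"
IFS=','
csv_count=$(count_args $csv)
unset IFS

echo "default IFS: $default_count"
echo "null IFS:    $null_count"
echo "comma IFS:   $csv_count"
```

Changing IFS affects every unquoted expansion that follows, so quoting is usually the safer way to keep newlines; setting IFS is better suited to cases where you actually want to split on a specific delimiter.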

John1024
  • So would it be accurate to say: when a shell initially parses a command, it uses blank characters to distinguish between WORDs (e.g. so it knows which WORDs are commands, which are options, arguments, etc.). After the first round of expansions is performed, the command statement is significantly different from the initial command, so the shell may need to repeat the initial task of identifying the WORDs. This process is called "word splitting"? – the_velour_fog Aug 31 '16 at 06:26
  • @the_velour_fog Yes, that's about it. More specifically, "word splitting" is only performed on the results of parameter expansion, command substitution, and arithmetic expansion. – John1024 Aug 31 '16 at 17:24
1

It isn't awk that strips the newlines; it's the unquoted command substitution passed to echo. Drop the echo and call awk directly:

awk -F, '{ print $1 }' $1
Ipor Sircer