13

I'm used to bash's builtin read function in while loops, e.g.:

echo "0 1
      1 1
      1 2
      2 3" |\
while read A B; do
    echo $A + $B | bc;
done

I've been working on a make project, and it became prudent to split files and store intermediate results. As a consequence I often end up splitting single lines into variables. While the following example works pretty well,

head -n1 somefile | while read A B C D E FOO; do [... use vars here ...]; done

it's sort of stupid, because the while loop will never run more than once. But without the while,

head -n1 somefile | read A B C D E FOO; [... use vars here ...]

the read variables are always empty when I use them. I never noticed this behaviour of read, because I usually use while loops to process many similar lines. How can I use bash's read builtin without a while loop? Or is there another (or even better) way to read a single line into multiple (!) variables?

Conclusion

The answers show that it's a problem of scoping. The statement

 cmd0; cmd1; cmd2 | cmd3; cmd4

is interpreted such that the commands cmd0, cmd1, and cmd4 are executed in the same scope, while the commands cmd2 and cmd3 are each given their own subshell, and consequently different scopes. The original shell is the parent of both subshells.
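The scoping rule above can be checked with a short sketch. It assumes bash with default settings (in particular, the lastpipe option is off); the variable names after_bare_pipe and inside_group are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: which commands share the parent shell's scope.
A="parent"

# 'read' is the right-hand side of a pipe, so bash runs it in a subshell.
# The assignment to A and B is lost when that subshell exits.
echo "1 2" | read A B
after_bare_pipe="$A"

# Grouping keeps 'read' and the code that uses the variables
# in the SAME subshell, so the values are visible there.
inside_group="$(echo "1 2" | { read A B; echo "$A+$B"; })"

echo "after bare pipe: A=$after_bare_pipe"
echo "inside group:    $inside_group"
```

With those assumptions, the first line reports A=parent (the value set before the pipeline), and the second reports 1+2.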

Bananguin
  • 7,984

3 Answers

12

It's because the part where you use the vars is a new set of commands. Use this instead:

head somefile | { read A B C D E FOO; echo $A $B $C $D $E $FOO; }

Note that, in this syntax, there must be a space after the { and a ; (semicolon) before the }. Also, -n1 is not necessary; read only reads the first line.

For better understanding, this may help you; it does the same as above:

read A B C D E FOO < <(head somefile); echo $A $B $C $D $E $FOO

Edit:

It's often said that the next two statements do the same:

head somefile | read A B C D E FOO
read A B C D E FOO < <(head somefile)

Well, not exactly. The first one is a pipe from head to bash's read builtin. One process's stdout to another process's stdin.

The second statement uses redirection and process substitution, both handled by bash itself. Through <(...), bash connects head's output to a FIFO (a named pipe, or a /dev/fd entry on systems that support it) and redirects (<) that file to the stdin of the read builtin.

So far these seem equivalent. But when working with variables it can matter. In the first one the variables are not set after executing. In the second one they are available in the current environment.

Each shell behaves differently in this situation; see that link for the details. In bash you can work around that behavior with command grouping ({ }), process substitution (< <()), or here strings (<<<).
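As a rough side-by-side sketch of the three workarounds just listed, assuming bash: the helper line_source is hypothetical and only stands in for head somefile (or a grep), and -r is added to read so that backslashes in the input are taken literally.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for 'head somefile': emits one line of input.
line_source() { printf '%s\n' "0 1 2"; }

# 1. Command grouping: the variables exist only inside the braces,
#    so everything that uses them must live there too.
grouped="$(line_source | { read -r A B C; echo "$C $B $A"; })"

# 2. Process substitution: no pipe, so 'read' runs in the current shell.
read -r X Y Z < <(line_source)

# 3. Here string: 'read' also runs in the current shell.
read -r P Q R <<< "$(line_source)"

echo "grouped:              $grouped"
echo "process substitution: $X $Y $Z"
echo "here string:          $P $Q $R"
```

Variants 2 and 3 leave the variables set for the rest of the script, which is usually what is wanted when the values are needed several lines later; variant 1 is the only one of the three that does not rely on bash-specific syntax.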

jordanm
  • 42,678
chaos
  • 48,171
  • {} ... I only tried (). Why does the ; in your second example not start "a new set of commands"? Actually I don't understand the < < construct at all. Need to think ... – Bananguin Oct 15 '14 at 09:24
  • @Bananguin see my edit, hope it brings light to that problem. – chaos Oct 15 '14 at 09:48
  • 2
    I think I got it: The actual point is that WHEN USING PIPES, bash creates a subshell for each process of the pipeline, and due to scoping the variables are gone after the pipeline. Using {} makes the remainder of the line one (!) element of the pipeline, and inside that subshell the variables are available. Using process substitution one is not using pipes and therefore bash does not spawn subshells. – Bananguin Oct 15 '14 at 10:07
  • @Bananguin you got it ^^ – chaos Oct 15 '14 at 10:20
  • 1
    Since read only reads the first line of input, it is not just -n1 which is redundant but the entire head command. If you eliminate the head command, you can also eliminate the pipeline. Then it becomes read A B C D E FOO < somefile and it won't be a subshell, so the variables will remain available for the next command. – kasperd Oct 15 '14 at 10:46
  • @kasperd: I use head simply as an example. Usually the heads are greps. – Bananguin Oct 18 '14 at 16:48
  • @Bananguin: head somefile | ( read A B; echo $((A+B)) ) works, too. It uses (…) instead of { …;}, and bash's built-in arithmetic capability instead of bc. – Scott - Слава Україні Feb 12 '15 at 03:45
  • @Scott: regarding your edit of my question: are you sure removing the pipe symbol does not make the last paragraph false ...? – Bananguin Feb 12 '15 at 08:52
3

To quote from a very useful article on wiki.bash-hackers.org:

This is because the commands of the pipe run in subshells that cannot modify the parent shell. As a result, the variables of the parent shell are not modified (see article: Bash and the process tree).

As the answer has been provided a few times now, an alternative way (using non builtin commands...) is this:

$ eval `echo 0 1 | awk '{print "A="$1";B="$2}'`;echo $B $A
1 0
geedoubleya
  • 4,327
1

As you noted, the problem is that read, when it sits at the end of a pipe, runs in a subshell.

One answer is to use a heredoc:

numbers="01 02"
read first second <<INPUT
$numbers
INPUT
echo $first
echo $second

This method is nice in that it will behave the same way in any POSIX-like shell.
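The same heredoc approach also covers the case from the question where the line has more fields than there are variables: read assigns the remainder of the line to the last name. A small sketch, with made-up variable names and input:

```bash
# Four fields, three variables: the last variable receives the rest of the line.
line="alpha beta gamma delta"
read -r first second rest <<INPUT
$line
INPUT
echo "first=$first"
echo "second=$second"
echo "rest=$rest"
```

Here rest ends up holding "gamma delta", which is why a trailing catch-all variable such as FOO is a convenient way to absorb fields you do not care about.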

eradman
  • 354