Why must I put the command read into a subshell while using pipeline

Question

The command df . can show us which device we are on. For example,

me@ubuntu1804:~$ df .
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sdb1       61664044 8510340  49991644  15% /home

Now I want to get the string /dev/sdb1.

I tried df . | read a; read a b; echo "$a", but it only printed an empty line. However, df . | (read a; read a b; echo "$a") works as expected.

I'm kind of confused now.

I know that (read a; read a b; echo "$a") is a subshell, but I don't know why I have to make a subshell here. As I understand it, x|y redirects the output of x to the input of y. Why can't read a; read a b; echo $a get the input when a subshell can?

Note that for the actual task you may prefer eg. a=$(findmnt --noheadings --output SOURCE $(stat --printf=%m .)), avoiding any parsing of command output. — Michał Politowski, Dec 01 '20 at 12:32

score 12 · Accepted Answer · answered Dec 01 '20 at 03:10

The main problem here is grouping the commands correctly. Subshells are a secondary issue.

x|y will redirect the output of x to the input of y

Yes, but x | y; z isn't going to redirect the output of x to both y and z.

In df . | read a; read a b; echo "$a", the pipeline only connects df . and read a; the other commands have no connection to that pipeline. You have to group the reads together, as df . | { read a; read a b; } or df . | (read a; read a b), for the pipeline to be connected to both of them.

However, now comes the subshell issue: commands in a pipeline are run in a subshell, so setting a variable in them doesn't affect the parent shell. So the echo command has to be in the same subshell as the reads. So: df . | { read a; read a b; echo "$a"; }.

Now whether you use ( ... ) or { ...; } makes no particular difference here since the commands in a pipeline are run in subshells anyway.
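Both effects can be seen in a minimal sketch. It assumes bash with default settings (or another shell that runs every pipeline stage in a subshell), and fake_df is a hypothetical stand-in for df . that prints the sample output from the question, so the result is the same on any machine.

```shell
# fake_df is a hypothetical stand-in for `df .`; it prints the sample
# output from the question so the example behaves the same everywhere.
fake_df() {
    printf '%s\n' \
        'Filesystem     1K-blocks    Used Available Use% Mounted on' \
        '/dev/sdb1       61664044 8510340  49991644  15% /home'
}

a='still unset'

# echo sits inside the group, so it runs in the same subshell as the reads.
inside=$(fake_df | { read a; read a b; echo "$a"; })

# The reads succeed here too, but only inside the pipeline's subshell;
# the parent shell's $a is left untouched.
fake_df | { read a; read a b; }
outside=$a

echo "inside the group:   $inside"
echo "after the pipeline: $outside"
```

With bash defaults the first line should show /dev/sdb1 and the second should show still unset. Shells that run the last pipeline stage in the current shell (ksh, zsh, or bash with the lastpipe option active) would behave differently on the second line.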

glenn jackman · Answer 2 · 2020-12-01T03:21:21.187

An alternative is to use a process substitution:

{ read header; read filesystem rest; } < <(df .)
echo "$filesystem"

The <(...) process substitution executes the contained script (in a subshell), but it acts like a filename, so you need the first < to redirect the contents (which is the output of the script) into the braced script. The grouped commands are executed in the current shell.

It can be tricky to get this readable, but you can put any arbitrary whitespace into the braces and the process substitution.

{
    read header
    read filesystem rest
} < <(
    df .
)
echo "$filesystem"

And it might be easier to use an external tool to extract the filesystem:

filesystem=$( df . | awk 'NR == 2 {print $1}' )

score 3 · Answer 3 · answered Dec 01 '20 at 03:11

Your first command

df . | read a; read a b; echo "$a"

effectively gets interpreted as

( df . | read a ) ; read a b; echo "$a"

So the pipeline only feeds into the read a command.
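One way to see where the second read takes its input from is to give the shell's own standard input some recognisable content. In this sketch a here-document plays the role of the terminal, and fake_df is a hypothetical stand-in for df . with fixed output; neither is part of the original command.

```shell
# fake_df is a hypothetical stand-in for `df .` with fixed output.
fake_df() {
    printf '%s\n' \
        'Filesystem     1K-blocks    Used Available Use% Mounted on' \
        '/dev/sdb1       61664044 8510340  49991644  15% /home'
}

a=''
b=''

# The here-document becomes standard input for the whole brace group,
# standing in for the terminal. Only the first read has its input
# replaced by the pipe.
{
    fake_df | read a    # pipeline: reads the header line, result is discarded
    read a b            # not in the pipeline: reads from the here-document
} <<'EOF'
typed-on-stdin rest of the line
EOF

echo "$a"
```

The value printed comes from the here-document rather than from fake_df, which matches the interpretation above: the second read was never attached to the pipe.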

Since you want multiple reads from the pipeline, you need to group the commands together.

It doesn't have to be a subshell; it could be a brace grouping:

bash-4.2$ df | { read a ; read a b ; echo $a ; }
devtmpfs

More commonly you might want a loop

bash-4.2$ df | while read a
> do
> read a b
> echo $a
> done
devtmpfs
tmpfs
/dev/vda3
/dev/vdb

There's a secondary issue with bash and the right side of a pipeline being run in a subshell, so the $a and $b values aren't accessible outside of the while loop, but that's a different problem!
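A rough illustration of that secondary issue, assuming bash (process substitution is not available in plain POSIX sh). fake_df is a hypothetical stand-in for df with fixed output, and the variable names are chosen only for this sketch.

```shell
# Assumes bash. fake_df is a hypothetical stand-in for `df`.
fake_df() {
    printf '%s\n' \
        'Filesystem     1K-blocks    Used Available Use% Mounted on' \
        '/dev/sdb1       61664044 8510340  49991644  15% /home'
}

# Piped loop: the loop runs in a subshell, so the assignment is lost.
device='lost'
fake_df | while read -r name rest; do
    device=$name
done
piped=$device

# Redirected loop: the loop runs in the current shell, so the value survives.
device='lost'
while read -r name rest; do
    device=$name
done < <(fake_df)
redirected=$device

echo "piped:      $piped"
echo "redirected: $redirected"
```

With bash defaults, piped should still be lost while redirected holds the first field of the last line read. Feeding the loop through a redirection instead of a pipe is one way to keep the variables; it is not the only one.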

The question is about extracting a filesystem name from the output of df. Your “more common” solution extracts the first word from the 2nd, 4th, 6th, 8th, 10th, …, lines. That may be useful sometimes, but I don’t see how it’s relevant to this question. I suggest that, if you’re going to present that loop construct, you explain what it does (you may copy my explanation from this comment) and demonstrate what it does; e.g., ls -l | while read a; do read a b c d e f g h i; echo "$i"; done outputs file1 / file3 / file5 / … — G-Man Says 'Reinstate Monica', Nov 16 '21 at 21:57
