
I have a text file (or pipe output, it doesn't matter here):

memcached.uptime 1061374
memcached.curr_connections 480
memcached.cmd_get 478962548
memcached.cmd_set 17641364
memcached.cmd_flush 0

If I use the command cat test.txt | while read i; do echo $i; done, it produces the expected output:

memcached.uptime 1061374
memcached.curr_connections 480
memcached.cmd_get 478962548
etc

But if I loop with for i in $(cat test.txt); do echo $i; done, I see something different:

memcached.uptime
1061374
memcached.curr_connections
480
memcached.cmd_get
478962548
etc

The question is: WHY???

sourcejedi
Putnik
  • 3
    By default, a for loop iterates over individual whitespace-separated words, and a while-read loop iterates over lines. See http://mywiki.wooledge.org/BashFAQ/001 and http://mywiki.wooledge.org/DontReadLinesWithFor – glenn jackman Dec 07 '16 at 21:34
  • 1
    Also, very important, read this q&a: http://unix.stackexchange.com/q/171346/4667 – glenn jackman Dec 07 '16 at 21:36

2 Answers

5

In:

cat test.txt | while read i; do echo $i; done

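You managed to cram in quite a few shell scripting bad practices: piping cat into the loop instead of redirecting from the file, calling read without -r and without IFS=, leaving $i unquoted, and using echo to output arbitrary data.

Though I should probably have first mentioned the Q&A "Why is using a shell loop to process text considered bad practice?".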

Try for instance on an input like:

-n
/*/*/*/../../../*/*/*
  foo\
bar

If you did indeed need to use a shell loop, that would probably have to be something like:

{
while IFS= read <&3 -r i; do
  printf '%s\n' "$i" || exit
done
[ -z "$i" ] || printf %s "$i" || exit
} 3< test.txt
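
To see why those details matter, here is a rough sketch that runs both loops over part of the tricky input above. The function names naive_loop and careful_loop and the temporary file are made up for illustration; the glob line is left out so the naive loop does not go and expand it.

```shell
# Hypothetical helpers, not part of the original answer.

# The loop from the question: no -r, no IFS=, unquoted $i, echo.
naive_loop() {
  while read i; do echo $i; done < "$1"
}

# The careful loop from above, wrapped in a function.
careful_loop() {
  {
    while IFS= read -r i <&3; do
      printf '%s\n' "$i" || return
    done
    # Emit a last line that has no trailing newline, if any.
    [ -z "$i" ] || printf %s "$i"
  } 3< "$1"
}

tmp=$(mktemp) || exit
printf '%s\n' '-n' '  foo\' 'bar' > "$tmp"

echo 'naive:'
naive_loop "$tmp"      # "  foo\" and "bar" come out joined as "foobar"
echo 'careful:'
careful_loop "$tmp"    # reproduces the three input lines verbatim

rm -f "$tmp"
```

Without -r, read treats the trailing backslash as a line continuation, and without IFS= it strips the leading blanks. What echo does with -n depends on the shell; many treat it as an option and print nothing for that line.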

In:

for i in $(cat test.txt); do echo $i; done

That replaces one bad practice with another. Here, you've got a good reason for leaving $(cat test.txt) unquoted: you want the split part of the split+glob operator. But you forgot to specify what to split on, and to disable the glob part.

IFS='
' # split on newline only. The default value of $IFS
  # contains space, tab and newline which explains why you see
  # one word per line
set -o noglob # disable glob
for i in $(cat test.txt); do
  printf '%s\n' "$i" || exit
done

Note that this would still skip empty lines, and that it reads and stores the whole content of the file in memory (several times) before starting the loop.
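
One way to keep the IFS and noglob changes from leaking into the rest of a script is to run the loop in a subshell. The sketch below does that in a function whose body is a subshell; the name for_lines and the <...> markers are illustrative assumptions, not something from the answer.

```shell
# Hypothetical helper: print each non-empty line of file $1 inside <...>.
# The function body is a subshell, so IFS and noglob are restored
# automatically when it returns.
for_lines() (
  IFS='
'
  set -o noglob
  for i in $(cat "$1"); do
    printf '<%s>\n' "$i"
  done
)

tmp=$(mktemp) || exit
printf '%s\n' 'a b' '' '*' > "$tmp"
for_lines "$tmp"    # prints <a b> and <*>; the empty line is skipped
rm -f "$tmp"
```

The empty line disappears because newline is an IFS whitespace character, so consecutive newlines count as a single separator.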

agc
  • Small side question: would quoting, as in for i in "$(cat /etc/passwd)"; do echo "$i"; done, be considered an appropriate solution? As far as I can see, "$(cat /etc/passwd)" passes all lines to for as one giant argument, so it seems useless when one wants to process a file line by line – Sergiy Kolodyazhnyy Dec 07 '16 at 22:08
  • @Serg, as I said, here you want to split, so you need to use the split+glob operator (leave $(...) unquoted), but you need to tune that split+glob. – Stéphane Chazelas Dec 07 '16 at 22:10
1

The answer is that $(command) expands to the raw output of the command, on which your shell then performs word splitting. By default, that splitting treats any whitespace (space, tab or newline) as a word separator.
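
A small sketch of that splitting, using two lines from the question's input. The helper count_args is a made-up name; it only reports how many arguments it received.

```shell
# Two lines taken from the question's sample data.
sample='memcached.uptime 1061374
memcached.curr_connections 480'

# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

(
  unset IFS               # an unset IFS behaves like the default value
  count_args $sample      # unquoted: split on space, tab, newline -> 4
  count_args "$sample"    # quoted: no splitting at all -> 1
)
```

The for loop in the question sees the same four words, which is why each one ends up on its own line.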

You are also doing two different things with the text: in one you are passing it through read (which works on line input, not word input), and in the other you are iterating over the output of $(cat) in a for loop. You could probably get similar results with IFS='\n' for i in $(cat test.txt); do echo "$i"; done.
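
As the comments on this answer point out, IFS='\n' does not set IFS to a newline: inside single quotes it is a backslash followed by the letter n. The sketch below compares it with a literal newline; count_fields is a hypothetical helper, and the sample text is chosen to contain neither a backslash nor an n.

```shell
text='a b
c d'

# Hypothetical helper: count the fields $2 splits into when IFS is $1.
# The subshell body keeps the IFS and noglob changes local.
count_fields() (
  IFS=$1
  set -o noglob
  set -- $2
  echo "$#"
)

count_fields '\n' "$text"    # IFS is backslash + n: nothing to split on -> 1
count_fields '
' "$text"                    # IFS is a real newline: one field per line -> 2
```

Some shells also accept $'\n' for a newline, but the literal newline form works in any POSIX shell.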

DopeGhoti
  • 1
    The "DontReadLinesWithFor" link I pasted above discusses a few critical complications even if you set IFS. – glenn jackman Dec 07 '16 at 21:38
  • It's not the usual word separation, in that it's completely different from the shell's tokenisation. Here, we're talking about the split+glob operator, which is applied dynamically based on the current value of $IFS. That value includes some blanks by default, but doesn't have to contain blanks. – Stéphane Chazelas Dec 07 '16 at 22:12
  • You can't prefix var=value to a compound-command such as for..do..done, only a simple-command. You can do it as a separate statement, as Stephane's answer does, but then you may need to worry about (saving and) reverting it before any subsequent use. Plus you can't use '\n' to get a newline, although in some shells you can use $'\n'. – dave_thompson_085 May 16 '19 at 04:23