I have code that does something like this:

#!/bin/sh
CONTENTS=$(cat "somefile")
RELEVANT_LINES=$(echo "$CONTENTS" | grep -E "SEARCHEXPR")
COUNT=$(echo "$RELEVANT_LINES" | wc -l)

I hit an annoyance: when there are no matches, this code doesn't produce the same output as the correct result given by replacing the third line with:

COUNT=$(echo "$CONTENTS" | grep -E "SEARCHEXPR" | wc -l)

I eventually traced it to the fact that when there weren't any matches, RELEVANT_LINES was being set to the empty string, and echo was outputting an empty line (the empty string + \n), for a line count of 1.

I tried using printf and echo -n in the 3rd line, but couldn't find an elegant workaround, and ended up using COUNT=$(echo "$RELEVANT_LINES" | grep '0' | wc -l) (all lines contain a zero) to avoid having to regex-filter the entire source file twice.

That can't be right, but I can't figure out the correct fix.

I didn't drop to scripting with -eq '' because I wasn't sure it would be as robust and I prefer piping directly to wc for pure neatness.

Any hints how to get file content in a variable, to neatly distinguish zero vs one line after being filtered by grep? :)

Stilez

1 Answer


Using CONTENTS to hold the file just to echo it again is a bit redundant; you could just use

lines=$(cat "somefile" | grep -E "SEARCHEXPR")

or rather

lines=$(grep -E "SEARCHEXPR" "somefile")

If you just want the count of matching lines, use grep -c:

count=$(grep -c -E "SEARCHEXPR" "somefile")
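As a rough illustration of how grep -c behaves in the zero-match case, here is a small sketch. The temporary file, its contents, and the patterns are made up for the example. Note that grep still exits with a non-zero status when nothing matches, which matters if the script runs under set -e.

```shell
#!/bin/sh
# Hypothetical sample data standing in for "somefile".
tmpfile=$(mktemp)
printf '%s\n' 'alpha 0' 'beta 0' 'gamma 0' > "$tmpfile"

# Two of the three lines match.
some=$(grep -c -E '^(alpha|beta)' "$tmpfile")

# Nothing matches: grep -c still prints 0, but its exit status is 1.
# "|| true" keeps the assignment from aborting a script run with set -e.
none=$(grep -c -E '^delta' "$tmpfile") || true

echo "some=$some none=$none"   # prints: some=2 none=0

rm -f "$tmpfile"
```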

The immediate issue you see is caused by the fact that echo always outputs a trailing newline. In the case where you get at least one line from the command substitution, this actually helps, since command substitution removes trailing newlines and echo puts one back. Trailing empty lines are removed too; try this to see: x=$(echo foo; echo; echo); echo "$x".
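If the matching lines do need to stay in a variable, one possible way to tell zero lines from one is to test for the empty string before counting. The sketch below assumes a helper named count_lines, which is hypothetical and not part of the original script; the sample strings are invented.

```shell
#!/bin/sh
# Command substitution strips every trailing newline, so "no output"
# becomes the empty string and padded output loses its empty tail.
empty=$(printf '')
one=$(echo 'foo 0')
padded=$(echo 'foo 0'; echo; echo)

# echo adds one newline back, and wc -l counts it as a line.
# tr removes the padding some wc implementations print.
naive_empty=$(echo "$empty" | wc -l | tr -d ' ')   # 1, although there are no lines

# count_lines (hypothetical helper): 0 for the empty string,
# otherwise the number of lines in the argument.
count_lines() {
    if [ -z "$1" ]; then
        echo 0
    else
        printf '%s\n' "$1" | wc -l | tr -d ' '
    fi
}

echo "naive: $naive_empty, fixed: $(count_lines "$empty")"   # prints: naive: 1, fixed: 0
```

This keeps the single grep pass from the question; the cost is one explicit test instead of a pure pipeline.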

If you want to process the lines in the shell script in some other way, in addition to counting, then storing the text in a variable might not be best. You could try

for line in $lines ; do 
    something with "$line"
done

But that has the usual issues with unquoted variables, i.e. filename globbing and word splitting on whitespace. A single line containing "foo bar doo" would be seen as three items, since by default the spaces split too.
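To make the splitting problem concrete, here is a small sketch with invented sample text. It also shows one common workaround, which is an addition rather than something the answer itself proposes: restricting IFS to a newline and disabling globbing with set -f, inside a subshell so the changes do not leak.

```shell
#!/bin/sh
# Two lines of sample text (made up for the example).
lines=$(printf '%s\n' 'foo bar doo' 'second line')

# Default IFS: the unquoted expansion splits on spaces, tabs and newlines.
words=0
for item in $lines; do
    words=$((words + 1))
done

# Split on newlines only, with globbing turned off.
# The subshell keeps the IFS and set -f changes local.
per_line=$(
    IFS='
'
    set -f
    n=0
    for item in $lines; do
        n=$((n + 1))
    done
    echo "$n"
)

echo "default splitting: $words items"      # prints: default splitting: 5 items
echo "newline splitting: $per_line items"   # prints: newline splitting: 2 items
```

With newline-only splitting, empty lines are still dropped, because consecutive newlines count as a single separator.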

You might want to use a while read loop instead, but for that, a shell that supports process substitution might be good to have, since piping into the loop runs it in a subshell in most shells, so variables set inside it are lost afterwards. See BashFAQ 001, "variable doesn't change after find & while" and "in bash, read after a pipe is not setting values".
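For shells without process substitution, one option is to feed the loop from a here-document, which keeps the loop in the current shell. This is only a sketch: the sample text is invented, and the empty-line guard assumes that matching lines are never empty themselves.

```shell
#!/bin/sh
# Sample text standing in for the grep output (made up for the example).
lines=$(printf '%s\n' 'foo bar doo' 'a * b')

count=0
last=''
# IFS= and -r keep leading whitespace and backslashes intact.
while IFS= read -r line; do
    # The here-document always ends with a newline, so an empty $lines
    # would still yield one empty line; skip it.
    [ -n "$line" ] || continue
    count=$((count + 1))
    last=$line
done <<EOF
$lines
EOF

# The loop ran in the current shell, so both variables are still set here.
echo "count=$count last=$last"   # prints: count=2 last=a * b
```

The expansion inside the here-document is not subject to word splitting or globbing, so the spaces and the * come through unchanged.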

ilkkachu