I have a variable which contains the multiline output of a command. What's the most efficient way to read the output line by line from the variable?
For example:
jobs="$(jobs)"
if [ "$jobs" ]; then
# read lines from $jobs
fi
You can use a while loop with process substitution:
while read -r line
do
echo "$line"
done < <(jobs)
An optimal way to read a multiline variable is to set a blank IFS and printf the variable with a trailing newline:
# printf '%s\n' "$var" is necessary: with printf '%s' "$var" on a
# variable that doesn't end with a newline, the while loop would
# completely miss the last line of the variable.
while IFS= read -r line
do
echo "$line"
done < <(printf '%s\n' "$var")
Note: as per ShellCheck SC2031, process substitution is preferable to a pipe because a pipe subtly creates a subshell.
Also, be aware that naming the variable jobs may cause confusion, since that is also the name of a common shell command.
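To see why the added newline matters, here is a small check (my own sketch; $var is a made-up value and count_lines is just a throwaway helper for the demonstration):
var=$'first\nsecond'   # hypothetical value with no trailing newline

count_lines() { local c=0; while IFS= read -r _; do c=$((c+1)); done; echo "$c"; }

count_lines < <(printf '%s' "$var")    # prints 1 -- the last line is silently lost
count_lines < <(printf '%s\n' "$var")  # prints 2 -- both lines are seen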
To process the output of a command line by line (explanation):
jobs |
while IFS= read -r line; do
process "$line"
done
If you have the data in a variable already:
printf %s "$foo" | …
printf %s "$foo" is almost identical to echo "$foo", but prints $foo literally, whereas echo "$foo" might interpret $foo as an option to the echo command if it begins with a -, and might expand backslash sequences in $foo in some shells.
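A quick illustration (my own sketch; the values of foo and bar are made up to trigger the two problems):
foo='-n'                 # looks like an option to echo
bar='one\ntwo'           # contains a backslash sequence
printf %s "$foo"; echo   # prints -n literally (the extra echo just ends the line)
echo "$foo"              # in bash, prints nothing: -n is taken as echo's own option
printf %s "$bar"; echo   # prints one\ntwo literally
echo "$bar"              # some shells (e.g. dash, or bash with xpg_echo) expand \n here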
Note that in some shells (ash, bash, pdksh, but not ksh or zsh), the right-hand side of a pipeline runs in a separate process, so any variable you set in the loop is lost. For example, the following line-counting script prints 0 in these shells:
n=0
printf %s "$foo" |
while IFS= read -r line; do
n=$(($n + 1))
done
echo $n
A workaround is to put the remainder of the script (or at least the part that needs the value of $n from the loop) in a command list:
n=0
printf %s "$foo" | {
while IFS= read -r line; do
n=$(($n + 1))
done
echo $n
}
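If you are in bash specifically (not plain sh), another way to keep the counter is to avoid the pipe entirely and feed the variable with a here-string. Note that <<< appends a newline, so an empty $foo would still count as one line; this is only a sketch of the idea:
n=0
while IFS= read -r line; do
  n=$((n + 1))
done <<< "$foo"
echo $n        # the loop ran in the current shell, so $n survives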
If acting on the non-empty lines is good enough and the input is not huge, you can use word splitting:
IFS='
'
set -f
for line in $(jobs); do
# process line
done
set +f
unset IFS
Explanation: setting IFS to a single newline makes word splitting occur at newlines only (as opposed to any whitespace character under the default setting). set -f turns off globbing (i.e. wildcard expansion), which would otherwise happen to the result of a command substitution $(jobs) or a variable substitution $foo. The for loop acts on all the pieces of $(jobs), which are all the non-empty lines in the command output. Finally, restore the globbing and IFS settings to values that are equivalent to the defaults.
If you're inside a function, you can use local IFS=something. It won't affect the global-scope value. IIRC, unset IFS doesn't get you back to the default (and certainly doesn't work if it wasn't the default beforehand). – Peter Cordes Aug 31 '15 at 01:24
I am not sure that using set in the way shown in the last example is correct. The code snippet assumes that set +f was active at the beginning, and therefore restores that setting at the end. However, this assumption might be wrong. What if set -f was active at the beginning? – Binarus Aug 21 '20 at 06:43
To restore set -f, save the original $-. For IFS, it's annoyingly fiddly if you don't have local and you want to support the unset case; if you do want to restore it, I recommend enforcing the invariant that IFS remains set. – Gilles 'SO- stop being evil' Aug 21 '20 at 07:32
local would indeed be the best solution, because local - makes the shell options local, and local IFS makes IFS local. Unfortunately, local is only valid within functions, which makes code restructuring necessary. Your suggestion to introduce the policy that IFS is always set also sounds very reasonable and solves the biggest part of the problem. Thanks! – Binarus Aug 21 '20 at 08:06
local doesn't exist in all sh variants, and local - requires a very recent bash. – Gilles 'SO- stop being evil' Aug 21 '20 at 10:55
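Putting the comments above together, one possible sketch of a save/restore version (my addition, not from the answer): the noglob state is saved via $-, and, following the suggestion above, IFS is simply kept set afterwards, falling back to the default space-tab-newline if it happened to be unset before:
case $- in *f*) had_noglob=1 ;; *) had_noglob=0 ;; esac   # was -f already active?
old_IFS=${IFS-$' \t\n'}   # if IFS was unset, remember the default instead
IFS='
'
set -f
for line in $(jobs); do
  : # process "$line" here
done
[ "$had_noglob" -eq 0 ] && set +f   # only re-enable globbing if it was on before
IFS=$old_IFS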
jobs="$(jobs)"
while IFS= read -r line
do
echo "$line"
done <<< "$jobs"
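One caveat with this approach: <<< always appends a newline, so if $jobs is empty the loop still runs once with an empty line. Keeping the question's own if guard avoids that (a sketch along those lines):
jobs="$(jobs)"
if [ -n "$jobs" ]; then           # skip the loop entirely when the variable is empty
    while IFS= read -r line; do
        echo "$line"
    done <<< "$jobs"
fi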
-r is a good idea too; it prevents \ interpretation... (it is in your links, but it's probably worth mentioning, just to round out your IFS=, which is essential to prevent losing whitespace) – Peter.O Mar 21 '11 at 15:33
while read construct. – Mladen B. Feb 12 '21 at 08:44
Problem: if you pipe into a while loop, it runs in a subshell and all variables set inside are lost. Solution: use a for loop.
# change the delimiter (IFS) to a newline
IFS_BAK=$IFS
IFS=$'\n'
for line in $variableWithSeveralLines; do
    echo "$line"
    # restore IFS if you need to split the line by spaces:
    IFS=$IFS_BAK
    IFS_BAK=
    lineConvertedToArraySplittedBySpaces=( $line )
    echo "${lineConvertedToArraySplittedBySpaces[0]}"
    # set IFS back to newline for the "for" loop
    IFS_BAK=$IFS
    IFS=$'\n'
done
# restore the delimiter to its previous value
IFS=$IFS_BAK
IFS_BAK=
Piping into a while read loop in bash means the while loop is in a subshell, so variables aren't global. while read; do ...; done <<< "$var" makes the loop body not a subshell. (Recent bash has an option to put the body of a cmd | while loop not in a subshell, like ksh has always had.) – Peter Cordes Aug 31 '15 at 01:30
This solution restores IFS correctly, but it has a problem as well: what if IFS is not set at all in the beginning (i.e. is undefined)? It will be defined in every case after that code snippet; this doesn't seem to be correct. – Binarus Aug 21 '20 at 06:53
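As the comments suggest, confining the IFS change to a function with local avoids the save/restore dance entirely. A sketch, assuming bash (process_lines is just an illustrative name; note that set -f is still global here unless you also use local -, which as noted above needs a fairly recent bash):
process_lines() {
    local IFS=$'\n'    # only affects this function; the caller's IFS is untouched
    set -f             # avoid wildcard expansion of the split words
    local line
    for line in $1; do
        printf 'line: %s\n' "$line"
    done
    set +f
}
process_lines "$variableWithSeveralLines"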
In recent bash versions, use mapfile or readarray to efficiently read command output into arrays:
$ readarray test < <(ls -ltrR)
$ echo ${#test[@]}
6305
Disclaimer: horrible example, but you can probably come up with a better command to use than ls yourself.
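The same works for reading from a variable rather than a command: feed it in with a here-string (a sketch, assuming bash 4 or newer; $var is a placeholder for your multiline variable):
mapfile -t lines <<< "$var"      # -t strips the trailing newline from each element
for line in "${lines[@]}"; do
    printf '%s\n' "$line"
done
Since <<< appends a newline, the last line is captured even if the variable does not end with one.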
readarray in a function and call the function a few times. – Eugene Yarmash Mar 23 '11 at 08:16
The common patterns to solve this issue have been given in the other answers.
However, I'd like to add my approach, although I am not sure how efficient it is. But it is (at least for me) quite understandable, does not alter the original variable (all solutions which use read need the variable in question to end with a newline and therefore add one, which alters the variable), does not create subshells (which all pipe-based solutions do), does not use here-strings (which have their own issues), and does not use process substitution (nothing against it, but a bit hard to understand sometimes).
Actually, I don't understand why bash's integrated REs are used so rarely. Perhaps they are not portable, but since the OP has used the bash tag, that won't stop me :-)
#!/bin/bash
function ProcessLine() {
printf '%s' "$1"
}
function ProcessText1() {
local Text=$1
local Pattern=$'^([^\n]*\n)(.*)$'
while [[ "$Text" =~ $Pattern ]]; do
ProcessLine "${BASH_REMATCH[1]}"
Text="${BASH_REMATCH[2]}"
done
ProcessLine "$Text"
}
function ProcessText2() {
local Text=$1
local Pattern=$'^([^\n]*\n)(.*)$'
while [[ "$Text" =~ $Pattern ]]; do
ProcessLine "${BASH_REMATCH[1]}"
Text="${BASH_REMATCH[2]}"
done
}
function ProcessText3() {
local Text=$1
local Pattern=$'^([^\n]*\n?)(.*)$'
while [[ ("$Text" != '') &&
("$Text" =~ $Pattern) ]]; do
ProcessLine "${BASH_REMATCH[1]}"
Text="${BASH_REMATCH[2]}"
done
}
MyVar1=$'a1\nb1\nc1\n'
MyVar2=$'a2\n\nb2\nc2'
MyVar3=$'a3\nb3\nc3'
ProcessText1 "$MyVar1"
ProcessText1 "$MyVar2"
ProcessText1 "$MyVar3"
Output:
root@cerberus:~/scripts# ./test4
a1
b1
c1
a2

b2
c2a3
b3
c3root@cerberus:~/scripts#
A few notes:
The behavior depends on which variant of ProcessText you use. In the example above, I have used ProcessText1.
Note that:
- ProcessText1 keeps newline characters at the end of lines.
- ProcessText1 processes the last line of the variable (which contains the text c3) although that line does not contain a trailing newline character. Because of the missing trailing newline, the command prompt after the script execution is appended to the last line of the variable without being separated from the output.
- ProcessText1 always considers the part between the last newline in the variable and the end of the variable as a line, even if it is empty; of course, that line, whether empty or not, does not have a trailing newline character. That is, even if the last character in the variable is a newline, ProcessText1 will treat the empty part (null string) between that last newline and the end of the variable as a (yet empty) line and will pass it to line processing. You can easily prevent this behavior by wrapping the second call to ProcessLine into an appropriate check-if-empty condition; however, I think it is more logical to leave it as-is.
- ProcessText1 needs to call ProcessLine at two places, which might be uncomfortable if you would like to place a block of code there which directly processes the line, instead of calling a function which processes the line; you would have to repeat the code, which is error-prone.
In contrast, ProcessText3 processes the line or calls the respective function at only one place, making replacing the function call by a code block a no-brainer. This comes at the cost of two while conditions instead of one. Apart from the implementation differences, ProcessText3 behaves exactly the same as ProcessText1, except that it does not consider the part between the last newline character in the variable and the end of the variable as a line if that part is empty. That is, ProcessText3 will not go into line processing after the last newline character of the variable if that newline character is the last character in the variable.
ProcessText2 works like ProcessText1, except that lines must have a trailing newline character. That is, the part between the last newline character in the variable and the end of the variable is not considered to be a line and is silently thrown away. Consequently, if the variable does not contain any newline character, no line processing happens at all.
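To make the difference concrete, this is how the two variants would behave on MyVar3 from above, which has no trailing newline (my reading of the functions, not output from the original script):
ProcessText2 "$MyVar3"   # prints a3, b3 -- the final "c3" is silently thrown away
ProcessText3 "$MyVar3"   # prints a3, b3, c3 -- the final part is still processed (without a trailing newline)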
I like that approach more than the other solutions shown above, but probably I have missed something (not being very experienced in bash programming, and not being interested in other shells very much).
You can use <<< to simply read from the variable containing the newline-separated data:
while read -r line
do
echo "A line of input: $line"
done <<<"$lines"
Use while IFS= read .... If you want to prevent \ interpretation, then use read -r – Peter.O Mar 21 '11 at 15:41
Change echo to printf %s, so that your script would work even with non-tame input. – Gilles 'SO- stop being evil' Mar 21 '11 at 20:57
Note that the <<< here-string requires the /tmp directory to be writable, as it relies on being able to create a temporary work file. Should you ever find yourself on a restricted system with /tmp being read-only (and not changeable by you), you will be happy about the possibility of using an alternate solution, e.g. with the printf pipe. – syntaxerror Dec 04 '14 at 22:07
printf "%s\n" "$var" | while IFS= read -r line – David H. Bennett Jan 15 '15 at 03:15
If the variable does not end with a newline, read based solutions can't work cleanly, unless wrapped into a lot of additional ugly code. – Binarus Aug 21 '20 at 06:38
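Putting the comments' suggestions together, a version of the loop with IFS=, -r and printf would look like this (a sketch, still reading from the same $lines variable):
while IFS= read -r line; do
    printf 'A line of input: %s\n' "$line"
done <<< "$lines"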