Note: Thank you to Jeff Schaller and steeldriver. But as neither posted as an answer, I'm not sure how to mark as solved. I now have a better understanding of pipes/subshells. I'm pretty sure I once knew this but it's been a long time since I tried anything complex in bash.
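For anyone landing here with the same symptom, this is a minimal sketch of what I believe was going wrong. It assumes bash's default behaviour (no `lastpipe` option, which bash 3.2 does not have anyway): every part of a pipeline runs in its own subshell, so variables set inside a piped `while read` loop are gone once the loop ends. The variable names `piped_count` and `redirected_count` are made up for this illustration.

```shell
#!/usr/bin/env bash

# Case 1: the loop is the right-hand side of a pipe, so it runs in a subshell.
piped_count=0
printf 'a\nb\n' | while read -r line; do
    piped_count=$((piped_count + 1))   # modifies the subshell's copy only
done
echo "after pipe: piped_count = $piped_count"

# Case 2: the loop reads from a redirection, so it runs in the current shell.
redirected_count=0
while read -r line; do
    redirected_count=$((redirected_count + 1))
done < <(printf 'a\nb\n')
echo "after redirect: redirected_count = $redirected_count"
```

With default settings the first counter is still 0 after the loop and the second one is 2, which matches the empty array in the question below.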
Both assigning the filtered result from awk to a variable and process substitution worked for me. My final code to read unsorted unique lines from stdin:
while read -r FILE
do
...
done < <(awk '!x[$0]++')
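As a self-contained sketch of the same pattern, here it is applied to the sample `$list` from the question below instead of real stdin (the here-string on awk is only there to make the example runnable on its own):

```shell
#!/usr/bin/env bash

list=$'red apple\nyellow banana\npurple grape\norange orange\nyellow banana'

array=()
count=0
# Only awk runs in a separate process; the loop itself stays in the
# current shell, so the array survives after "done".
while read -r line; do
    array[count++]=$line
done < <(awk '!x[$0]++' <<< "$list")

echo "array length = ${#array[@]}"
printf '%s\n' "${array[@]}"
```

This reports an array length of 4 and prints the four unique lines in their original order.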
More reading on process substitution for those who find this question while looking for a solution to a similar problem.
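The other approach mentioned above, assigning awk's output to a variable first, might look roughly like this. It is a sketch using the sample `$list` from the question; the variable name `filtered` is my own choice for the example.

```shell
#!/usr/bin/env bash

list=$'red apple\nyellow banana\npurple grape\norange orange\nyellow banana'

# Run awk first and keep its output; no pipe is attached to the loop.
filtered=$(awk '!x[$0]++' <<< "$list")

array=()
count=0
while read -r line; do
    array[count++]=$line
done <<< "$filtered"

echo "array length = ${#array[@]}"
```

One caveat with this variant: if `filtered` is empty, the here-string still feeds the loop a single empty line, so the array ends up with one empty element rather than none.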
ORIGINAL QUESTION:
I searched the site, but I can't find an answer to my problem.
I'm building an array from stdin and need to filter for unique lines. To do this, I'm using awk '!x[$0]++', which I've read is shorthand for:
awk 'BEGIN { while (getline s) { if (!seen[s]) print s; seen[s]=1 } }'
The filter works as desired, but the issue is that the resulting array from the while read loop is empty.
For example (using $list as a surrogate for stdin):
list=$'red apple\nyellow banana\npurple grape\norange orange\nyellow banana'
while read -r line; do
array[count++]=$line
done <<< "$list"
echo "array length = ${#array[@]}"
counter=0
while [ $counter -lt ${#array[@]} ]; do
echo ${array[counter++]}
done
produces:
array length = 5
red apple
yellow banana
purple grape
orange orange
yellow banana
But filtering $list with awk:
list=$'red apple\nyellow banana\npurple grape\norange orange\nyellow banana'
awk '!x[$0]++' <<< "$list" | while read -r line; do
array[count++]=$line
done
echo "array length = ${#array[@]}"
counter=0
while [ $counter -lt ${#array[@]} ]; do
echo ${array[counter++]}
done
produces:
array length = 0
But the output of awk '!x[$0]++' <<< "$list" appears fine:
red apple
yellow banana
purple grape
orange orange
I've tried examining each line in the while read loop:
list=$'red apple\nyellow banana\npurple grape\norange orange\nyellow banana'
i=0
awk '!x[$0]++' <<< "$list" | while read -r line; do
echo "line[$i] = $line"
let i=i+1
done
and it appears fine:
line[0] = red apple
line[1] = yellow banana
line[2] = purple grape
line[3] = orange orange
What am I missing here?
In case it's important, I'm using bash 3.2.57:
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin15) Copyright (C) 2007 Free Software Foundation, Inc.