I'm assuming that the input always contains exactly two fields per line.
You may use the GNU datamash utility to sort the data, group it by the first field, and calculate the sum of the second field for each group:
datamash -s -W --output-delimiter=: groupby 1 sum 2 <file
Here, the -s option sorts the input, -W makes the utility treat any run of consecutive whitespace characters as a field delimiter, and --output-delimiter=: sets the output delimiter to the : character. The rest tells datamash to group by the first field and to calculate the sum of the second field for each group.
Given the input in the question in the file called file, this would produce the following output:
beta:5
score:9
something:3
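If you want to try the command without the file from the question, the following is a small sketch that feeds it input on standard input instead. The sample lines are made up for illustration (they are not the data from the question) and were chosen so that the per-key sums match the output above. The function names group_sum and sample_input are hypothetical, and the guard only checks that datamash is installed.

```shell
#!/bin/sh
# group_sum: hypothetical wrapper around the command shown above.
# Reads whitespace-separated "key value" lines on standard input.
group_sum() {
    datamash -s -W --output-delimiter=: groupby 1 sum 2
}

# Made-up sample input (an assumption, not the data from the question).
sample_input() {
    printf '%s\n' 'something 3' 'score 4' 'beta 5' 'score 5'
}

# Only run the example if datamash is actually available.
if command -v datamash >/dev/null 2>&1; then
    sample_input | group_sum
else
    echo 'datamash is not installed' >&2
fi
```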
You can solve this in any number of other ways too. The easiest computational solution would be to use awk:
awk '{ sum[$1] += $2 } END { for (key in sum) printf "%s:%d\n", key, sum[key] }' file
Here, we use an associative array, sum, to hold the sum for each of the strings in the first field. The END block executes at the end of the input and outputs the calculated sums together with the strings.
Note that this solution also assumes that the first field is a single word containing no whitespace characters, as shown in the question.
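Two details of the awk command above are worth making explicit: the order in which for (key in sum) visits the keys is unspecified, and lines that do not have exactly two fields are silently summed anyway. The sketch below is one way to handle both, by reporting and skipping malformed lines and piping the result through sort. The function name sum_by_key and the sample input are made up for illustration, and writing to "/dev/stderr" from awk assumes an awk or system that provides that name (as gawk, mawk, and Linux do).

```shell
#!/bin/sh
# sum_by_key: hypothetical helper; reads "key value" lines on standard input.
sum_by_key() {
    awk '
        NF != 2 {
            # Report lines that break the two-field assumption, then skip them.
            printf "line %d: expected 2 fields, got %d\n", NR, NF > "/dev/stderr"
            next
        }
        { sum[$1] += $2 }
        END { for (key in sum) printf "%s:%d\n", key, sum[key] }
    ' | sort    # awk gives no ordering guarantee, so sort the result
}

# Made-up sample input (an assumption, not the data from the question).
printf '%s\n' 'something 3' 'score 4' 'beta 5' 'score 5' | sum_by_key
```

Sorting after the aggregation, rather than before it, means sort only has to handle one line per distinct key.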
Alternatively, you could use a shell loop that reads the sorted lines from the original file, printing and resetting the sum of the second field whenever a new first field is encountered:
unset -v prev
sort file |
{
    while read -r key value; do
        if [ "$key" != "${prev-$key}" ]; then
            # prev is set and different from $key
            printf '%s:%d\n' "$prev" "$sum"
            sum=0
        fi
        prev=$key
        sum=$(( sum + value ))
    done
    if [ "${prev+set}" = set ]; then
        printf '%s:%d\n' "$prev" "$sum"
    fi
}
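The loop above only ever compares a line with the one immediately before it, which is why the input has to be sorted first. As a rough illustration, the sketch below wraps the same logic in a function that reads standard input, so it can be run with and without sort in front of it. The function name sum_adjacent and the sample input are made up for this example.

```shell
#!/bin/sh
# sum_adjacent: hypothetical helper; sums the second field over runs of
# adjacent lines that share the same first field.
sum_adjacent() {
    unset -v prev
    sum=0
    while read -r key value; do
        if [ "$key" != "${prev-$key}" ]; then
            # A new key starts: emit the finished group and reset the sum.
            printf '%s:%d\n' "$prev" "$sum"
            sum=0
        fi
        prev=$key
        sum=$(( sum + value ))
    done
    # Emit the last group, unless there was no input at all.
    if [ "${prev+set}" = set ]; then
        printf '%s:%d\n' "$prev" "$sum"
    fi
}

# Made-up sample input (an assumption, not the data from the question).
sample() { printf '%s\n' 'score 4' 'beta 5' 'score 5'; }

echo 'sorted:';   sample | sort | sum_adjacent
echo 'unsorted:'; sample | sum_adjacent
```

With the unsorted input, the two score lines are not adjacent, so score is reported twice instead of being summed into one group.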
Related: Why is using a shell loop to process text considered bad practice?
mktemp creates a file, it does not delete anything. Also, it seems that there is no need for temporary files to solve this exercise. – Kusalananda Apr 02 '22 at 17:25