How can I count the occurrences of every word in a file?

I want a histogram of the words in a text stream (a pipe) or a document: each distinct word together with the number of times it occurs.

I have already split the document into a list of words, so each word is on its own line. If you can get it working directly from the text document, a solution starting from there is fine too.

> cat doc.txt 
word
second
third
word
really
> cat doc.txt | ... # then count occurrences of each word \
                      and print in descending order separated by delimiter
word 2
really 1
second 1
third 1

It needs to be reasonably efficient: the file is 1 GB of text, so anything with exponential running time is out of the question.

user14492

1 Answer

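Here's one way, assuming the input already has one word per line. sort groups identical words together, uniq -c collapses each group into a count followed by the word, sort -nrk1 orders the result by that count in descending order, and awk swaps the two columns so the word is printed first: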

$ sort file | uniq -c | sort -nrk1 | awk '{print $2,$1}'
word 2
third 1
second 1
really 1
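
If you would rather start from the raw text than from a pre-split word list, something along these lines might work. Treat it as a sketch rather than a definitive solution: word_hist is a made-up name, it assumes that any run of non-letter characters separates words, and it is case-sensitive (so "Word" and "word" are counted separately). It also uses a different technique from the pipeline above: instead of sorting the whole input first, awk counts the words in an associative array in a single pass, so only the distinct words are kept in memory and sorted.

```shell
# Hypothetical helper (not part of the pipeline above): reads raw text on
# stdin and prints "word count" pairs, most frequent word first.
word_hist() {
    # tr -cs: replace every run of non-letter characters with one newline,
    #         so that each word ends up on its own line.
    # awk:    count each non-empty line in an associative array, then print
    #         every distinct word with its count once the input is exhausted.
    # sort:   order by count (field 2, numeric, descending), breaking ties
    #         alphabetically by word (field 1).
    tr -cs '[:alpha:]' '\n' |
        awk 'NF { count[$0]++ } END { for (w in count) print w, count[w] }' |
        sort -k2,2nr -k1,1
}
```

You would call it as word_hist < doc.txt, or put it at the end of an existing pipe. If the input is already one word per line, the tr stage is harmless and can be left in place.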
terdon