How can I find the count of every word in a file?
I want a histogram of the words in a piped text stream or a document. The document contains newlines and empty lines. I have stripped everything except [a-zA-Z].
> cat doc.txt
word second third
word really
> cat doc.txt | ... # then count occurrences of each word
# and print in descending order separated by delimiter
word 2
really 1
second 1
third 1
It needs to be reasonably efficient, as the file is 1 GB of text, so an approach with exponential running time is not workable.
Unix-system: one word or two? What about GNU/Linux? – Kusalananda Aug 12 '20 at 16:42
[a-zA-Z], so a hyphen cannot exist; only lowercase and uppercase letters :) What about GNU/Linux? I chose to tag macOS because I want to make sure that people assume the macOS flavor of tools and only expect macOS default tools to exist (installing a separate tool is overkill imo). – user14492 Aug 12 '20 at 20:22
GNU/Linux: I meant to ask whether it was to be counted as one or two words, but by deleting the non-letter / it's clear that it's a single word. – Kusalananda Aug 12 '20 at 21:21
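For illustration, here is a minimal sketch of one possible approach that uses only awk and sort, both of which ship with macOS by default. It assumes the input is already reduced to [a-zA-Z] plus whitespace, as described in the question. The function name word_counts is hypothetical, and a single space is assumed as the output delimiter.

```shell
# word_counts: hypothetical helper, reads text on stdin and prints
# "word count" lines, highest count first (ties broken alphabetically).
word_counts() {
  # awk splits each line on runs of whitespace, so empty lines contribute
  # no fields. One array entry is kept per distinct word.
  awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
       END { for (w in count) print w, count[w] }' |
    # Sort by the count (field 2) numerically in descending order,
    # then by the word (field 1) in ascending order.
    sort -k2,2nr -k1,1
}

# Sample input from the question, with an empty line added.
printf 'word second third\n\nword really\n' | word_counts
# word 2
# really 1
# second 1
# third 1
```

The awk pass reads the input once, so its running time grows linearly with the file size, and its memory use grows with the number of distinct words rather than with the total size. Only the list of distinct words is sorted afterwards. For a file, `word_counts < doc.txt` should behave the same way.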