0

I need to count how many times a word occurs in a text file. The file contains lines of sentences, and I only care about the number of times a word shows up, not the number of lines it appears on. How do I tell grep to count words instead of lines?

For instance, if I use grep -c '^ab' (words that start with ab), it only returns the number of lines that begin with ab, not the number of words that begin with ab.

ilkkachu
  • 138,973

2 Answers

1

If you want to count words in file.txt, not lines, simply put each word on its own line:

tr " " "\n" < file.txt | grep -c '^ab'
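Here is a small sketch of the difference between counting lines and counting words with this approach. The sample file and its contents are made up for illustration, and the -s flag is an optional addition that squeezes runs of spaces so tr does not emit empty lines.

```shell
# Hypothetical sample file; the contents are invented for this example.
sample=$(mktemp)
printf '%s\n' 'abacus about  cab' 'the abbey is above' > "$sample"

# Counts lines that begin with "ab": only the first line qualifies.
line_count=$(grep -c '^ab' "$sample")

# Put each word on its own line (tr reads stdin, hence the redirection),
# then count the lines, i.e. words, that begin with "ab".
word_count=$(tr -s ' ' '\n' < "$sample" | grep -c '^ab')

echo "lines: $line_count, words: $word_count"
rm -f "$sample"
```

Note that this only splits on spaces; words separated by tabs or followed by punctuation would need extra handling.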
waltinator
  • 4,865
1

With GNU grep you can use the -o flag to print each match on its own line, and then count them afterwards with wc -l:

grep -o '\<ab' file.txt | wc -l

Or I suppose you could count with grep itself:

grep -o '\<ab' file.txt | grep -c ''

("\<" means "start of a word".)
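As a rough illustration of why -o matters here, the sketch below compares grep -c on its own with the two pipelines above. It assumes GNU grep (for -o and \<), and the sample file is invented for the example.

```shell
# Hypothetical sample file; the contents are invented for this example.
sample=$(mktemp)
printf '%s\n' 'abacus about cab' 'the abbey is above' 'nothing here' > "$sample"

# -c alone counts matching LINES, however many matches each line has.
matching_lines=$(grep -c '\<ab' "$sample")

# -o prints every match on its own line, so counting lines counts matches.
matches_wc=$(grep -o '\<ab' "$sample" | wc -l)
matches_grep=$(grep -o '\<ab' "$sample" | grep -c '')

echo "matching lines: $matching_lines"
echo "matches: $matches_wc (wc -l), $matches_grep (grep -c)"
rm -f "$sample"
```

Both pipelines should agree with each other; "cab" is not counted because its "ab" is not at the start of a word.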

frabjous
  • 8,691
  • Spectacular. If I wanted to count words that begin with 'a' and end with 'b', how would that look? – magnus reeves Mar 30 '22 at 17:32
  • Assuming a "word" can only have letters in it you could use \<a[A-Za-z]*b\>; if what you count as "words" can have other things in them like hyphens or underscores or digits, you may need to add to what's in the brackets. – frabjous Mar 30 '22 at 17:48
  • Thank you so much. Would you mind sharing a resource where I can find other documentation on the -o regex? I have a billion more questions I don't want to bug you with. I think it's called string-matching regex? Sorry, I'm brand new to this in school and I don't know if my questions make sense. – magnus reeves Mar 30 '22 at 18:06
  • -o doesn't use a different kind of regex; it is just an option for grep which makes it so it only outputs the matches rather than the entire lines containing the matches (grep's normal behavior), and if there is more than one match on the same line, it puts them on separate lines in the output. See the man page for GNU grep (or man grep in the terminal). – frabjous Mar 30 '22 at 18:25
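To make the comments above concrete, here is a minimal sketch of what -o prints for the \<a[A-Za-z]*b\> pattern. It assumes GNU grep and that a "word" contains only ASCII letters; the sample file is invented for the example.

```shell
# Hypothetical sample file; the contents are invented for this example.
sample=$(mktemp)
printf '%s\n' 'alb the crab grabbed a kebab' 'adverb and absorb' > "$sample"

# -o prints only the matched text, one match per output line, even when
# several matches come from the same input line.
found=$(grep -o '\<a[A-Za-z]*b\>' "$sample")
printf '%s\n' "$found"

# Counting those output lines gives the number of matching words.
count=$(printf '%s\n' "$found" | wc -l)
echo "count: $count"
rm -f "$sample"
```

Here "crab" and "kebab" are skipped because they do not start with 'a', and the lone "a" is skipped because the pattern requires a final 'b'.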