1

How can I grep the number of occurrences of two different words, e.g. 'register' and 'evn', in a file on Linux?

The output should look like the following:

registered:20
Supriya
  • 31

5 Answers

5

If a reversed output format (count first, word second) is also acceptable, this works too, and it is easy to extend with more words:

tr -c '[:alpha:]' '\n' < /path/to/file | sort | uniq -c | grep -w 'register\|evn'
  • Counts each word occurrence, even if there are multiple occurrences in the same line.
  • Counts exact matches of the words, not including the suffixed variants.
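If the word:count order from the question is required, the same pipeline can be fed through awk to swap the two columns. This is only a sketch: count_exact_words is a hypothetical helper name, and it assumes the words contain letters only (tr splits on everything else). Words that never occur are simply not printed.

```shell
# Hypothetical helper: count_exact_words FILE WORD...
# Prints "word:count" for each requested word that occurs in FILE.
count_exact_words() {
    file=$1
    shift
    # One token per line, then count identical tokens.
    tr -c '[:alpha:]' '\n' < "$file" | sort | uniq -c |
        awk -v words="$*" '
            # Build a lookup table of the requested words.
            BEGIN { n = split(words, w, " "); for (i = 1; i <= n; i++) want[w[i]] = 1 }
            # uniq -c prints "count word"; emit it as "word:count".
            ($2 in want) { print $2 ":" $1 }'
}
```

Usage: count_exact_words /path/to/file register evn. The lines come out in sort order, not in the order the words were given.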
manatwork
  • 31,277
4

Use awk

awk '/register/ {r++} /evn/ {e++} END {printf("register:%d\nevn:%d\n", r, e)}' /path/to/file 
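Note that the pattern-action form above increments once per matching line, so a line containing a word twice is counted once. If every occurrence should count, one possible variation uses the return value of gsub, which is the number of substitutions made; replacing each match with "&" (the match itself) leaves the line unchanged. The function name count_occurrences is just for illustration, and like the original this matches substrings, so "registering" still counts for "register".

```shell
# Hypothetical helper: count_occurrences FILE
# Counts every occurrence of the two patterns, not just matching lines.
count_occurrences() {
    awk '
        # gsub returns how many matches it replaced on the current line.
        { r += gsub(/register/, "&"); e += gsub(/evn/, "&") }
        END { printf("register:%d\nevn:%d\n", r, e) }' "$1"
}
```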
McNisse
  • 601
3

You can calculate it separately:

$ word=register; count=$(grep -o "$word" /path/to/file | wc -l); echo "$word:$count"
$ word=evn; count=$(grep -o "$word" /path/to/file | wc -l); echo "$word:$count"
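To avoid repeating the command for each word, the same idea can be wrapped in a loop. A rough sketch, assuming a grep that supports -o (GNU and BSD grep do; it is not in POSIX); count_each is a made-up name, and the tr at the end only strips the padding some wc implementations add.

```shell
# Hypothetical helper: count_each FILE WORD...
# Prints "word:count" for every word, in the order given.
count_each() {
    file=$1
    shift
    for word in "$@"; do
        # grep -o prints each match on its own line; wc -l counts them.
        count=$(grep -o -- "$word" "$file" | wc -l | tr -d '[:space:]')
        printf '%s:%s\n' "$word" "$count"
    done
}
```

Usage: count_each /path/to/file register evn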
dchirikov
  • 3,888
  • 3
    You don't need to wc -l. grep -c gives the count directly. – McNisse Jan 09 '13 at 11:35
  • @McNisse Actually, you do because grep will only count line occurrences and there may be more than one occurrence of a word in a line. – mchid Oct 16 '17 at 13:41
1

example file ./filename:

registering evn register evn
evn register evn.register. register.evn evn evn register. 
evn register-evnt register

command:

echo register:$(grep -oP "(^|\s)\Kregister(?=\s|$)|(^|\s)\Kregister\.(?=\s|$)" ./filename | wc -l) && echo evn:$(grep -oP '(^|\s)\Kevn(?=\s|$)|(^|\s)\Kevn\.(?=\s|$)' ./filename | wc -l)

example output:

register:4
evn:6

This should accurately count only the words "register" and "evn" while omitting words that merely contain "register" and/or "evn", such as "registering", "evnt", or "register-evn".

A word is not counted when a special character such as a dash immediately follows it, but it is counted when it is followed by a period at the end of a line or sentence.
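Where grep -P is not available, the same rule (a whitespace-delimited token equal to the word, optionally with one trailing period) can be approximated in awk. This is a different technique from the PCRE command above, not a translation of it, and count_ws_word is a hypothetical name.

```shell
# Hypothetical helper: count_ws_word WORD FILE
# Counts whitespace-separated fields that are exactly WORD or WORD followed by "."
count_ws_word() {
    awk -v w="$1" '
        { for (i = 1; i <= NF; i++) if ($i == w || $i == (w ".")) n++ }
        END { printf("%s:%d\n", w, n) }' "$2"
}
```

Usage: count_ws_word register ./filename; count_ws_word evn ./filename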

This linked answer gave me the info I needed for the grep syntax.

mchid
  • 1,420
-1
word="registered"
echo "$word:$(grep -wc "$word" /path/to/file)"

Works with Bash/Ksh and GNU grep
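Keep in mind that -c reports matching lines, so two hits on one line count as one. If whole-word occurrences are wanted instead, a possible variant combines -w with -o and counts the output lines. A sketch only, assuming GNU grep; count_whole_word is an illustrative name.

```shell
# Hypothetical helper: count_whole_word WORD FILE
# Counts every whole-word occurrence of WORD, even several on one line.
count_whole_word() {
    # -w restricts matches to whole words, -o prints each match on its own line.
    count=$(grep -ow -- "$1" "$2" | wc -l | tr -d '[:space:]')
    printf '%s:%s\n' "$1" "$count"
}
```

Since -w treats any non-word character as a boundary, a word followed by a dash or period is still counted.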