5

I have a file containing the following text:

we are going to write something 1 2 3

wc tells me I have 1 line, 9 words and 38 characters.

I'm looking to count the 26 letters only (a-z, no numbers or white spaces etc).

Here's my current solution:

grep -o '[[:alpha:]]' filename | wc -l

I really want to know if there is a "better" way to do this at the command line.

garethTheRed
  • 33,957

3 Answers

3

I would delete all non-alphabetic characters with tr and count the characters that remain. Timing both the tr solution and yours with bash's time built-in suggests the tr solution is about 5 times faster, at least on my system:

tr -cd '[:alpha:]' <filename | wc -m
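
A minimal sketch for checking both pipelines against the sample line from the question. The temporary file and the variable names are assumptions made for illustration, and LC_ALL=C is added here so that [:alpha:] means only a-z and A-Z.

```shell
#!/bin/sh
# Sample line from the question, written to a temporary file (hypothetical path).
sample=$(mktemp)
printf '%s\n' 'we are going to write something 1 2 3' > "$sample"

# tr: delete (-d) the complement (-c) of the alphabetic class, then count what is left.
tr_count=$(LC_ALL=C tr -cd '[:alpha:]' < "$sample" | wc -m)

# grep: print each matching letter on its own line, then count the lines.
grep_count=$(LC_ALL=C grep -o '[[:alpha:]]' "$sample" | wc -l)

# Arithmetic expansion strips the padding that some wc implementations add.
echo "tr: $((tr_count)) grep: $((grep_count))"

rm -f "$sample"
```

Both counts should come out as 26 for the sample line. Prefixing either pipeline with time is one way to repeat the speed comparison on a larger file.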
iruvar
  • 16,725
0

You can also do this with awk:

awk '{c+=gsub(s,s)}END{print c}' s='[[:alpha:]]' filename
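
The one-liner relies on gsub returning the number of substitutions it made on each line; what it substitutes is irrelevant to the count. A sketch of the same idea with the regex written inline and "&" as the replacement, so each match is replaced by itself. The piped-in sample, the variable name and LC_ALL=C are assumptions, and it presumes an awk that supports POSIX character classes.

```shell
#!/bin/sh
# gsub() returns how many substitutions it made on the current line.
# "&" stands for the matched text, so the line itself is left unchanged.
# c + 0 forces a numeric 0 to be printed when nothing matched.
letters=$(printf '%s\n' 'we are going to write something 1 2 3' |
  LC_ALL=C awk '{ c += gsub(/[[:alpha:]]/, "&") } END { print c + 0 }')

echo "$letters"
```

For the sample line this should print 26.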
Mazdak
  • 933
  • While your solution clearly answers the question, one-to-two-line answers are often not that helpful. Consider expanding your answer to include documentation or further explanation of your proposed solution. – HalosGhost Sep 20 '14 at 17:41
  • @HalosGhost. I disagree. Unless asked by the OP, a concise reply is perfectly acceptable. – fpmurphy Sep 21 '14 at 12:27
  • @fpmurphy1, what I wrote is a commonly held opinion on this site (and SE, at large), not a personal view. – HalosGhost Sep 21 '14 at 16:23
0

Try:

LC_ALL=C grep -o '[[:alpha:]]' filename | sort -u | wc -l

With LC_ALL=C, [[:alpha:]] matches only [a-zA-Z]. Change it to LC_ALL=C.UTF-8 or your own locale to match the letters of your own language.
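
Note that sort -u is case-sensitive, so W and w would be counted as two different letters. A sketch that folds everything to lower case first, assuming the goal is the number of distinct letters regardless of case. The piped-in sample and the variable name are assumptions for illustration.

```shell
#!/bin/sh
# Count distinct letters, ignoring case:
#   tr       folds upper case to lower case
#   grep -o  prints one letter per line
#   sort -u  keeps a single copy of each letter
#   wc -l    counts the remaining lines
distinct=$(printf '%s\n' 'we are going to write something 1 2 3' |
  LC_ALL=C tr '[:upper:]' '[:lower:]' |
  LC_ALL=C grep -o '[[:alpha:]]' |
  LC_ALL=C sort -u |
  wc -l)

# Arithmetic expansion strips the padding that some wc implementations add.
echo "$((distinct))"
```

For the sample line this should print 12, since only 12 different letters occur in it.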

cuonglm
  • 153,898
  • No, that gives 12, not 26. Why the unique? (I assume the sort is to allow unique?) – Volker Siegel Sep 21 '14 at 08:55
  • Because that's what the OP wants: how many letters of the alphabet (26 letters in total). – cuonglm Sep 21 '14 at 09:01
  • Oh... I was sure the 26 was referring to the simple count of characters in the input when non-letters are ignored. "wearegoingtowritesomething" is 26 characters, right? Looks like it's a really bad example... – Volker Siegel Sep 21 '14 at 09:13