There is no way to build such a regex with only ERE (extended regular expressions).
The closest you can get with GNU grep in Perl-regex mode (-P) is the following, which matches lines containing 3 or more occurrences of the same character:
    grep -P '(\w)(((?!\1)\w)*\1){2}' filename
Then, filtering out the words with 4 or more repeats gives an answer:
    grep -P '(\w)(((?!\1)\w)*\1){2}' filename |
        grep -Pv '(\w)(((?!\1)\w)*\1){3}'
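As a quick sanity check of the two-stage pipeline, here is a small sketch that feeds it a few hand-picked words instead of a file. The function name exactly3_grep and the sample words are only illustrative, and it assumes a GNU grep built with -P support.

```shell
#!/bin/sh
# Hypothetical helper: reads words on stdin, keeps those with a character
# repeated 3 or more times, then drops those with one repeated 4 or more times.
exactly3_grep() {
    grep -P '(\w)(((?!\1)\w)*\1){2}' |
    grep -Pv '(\w)(((?!\1)\w)*\1){3}'
}

# banana (3 a) and puppy (3 p) pass; apple has no triple;
# independence is dropped because of its 4 e, despite its 3 n.
printf '%s\n' banana puppy apple independence | exactly3_grep
```

The second grep is what turns "3 or more" into "no character more than 3 times", which is stricter than "some character exactly 3 times".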
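An alternative with GNU awk is:
    awk '{
        a = $1
        while (length(a)) {
            # Remove every occurrence of the first character of a
            # (awk strings are indexed from 1, so use substr(a,1,1)).
            b = gensub(substr(a, 1, 1), "", "g", a)
            # Exactly 3 characters removed: that character occurs 3 times.
            if (length(a) - length(b) == 3) { print $0; next }
            a = b
        }
    }' filename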
It works by removing all occurrences of the first character. If exactly 3 characters were removed, the line is printed; otherwise the same test is repeated with the first character of what remains, until there are no characters left. (An improvement is to keep testing only while the remaining length is at least the required number of repeats.)
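The improvement mentioned in parentheses could look roughly like the sketch below. It is not the code above: it uses the return value of POSIX gsub (the number of replacements made) in place of gensub, so it should also run in non-GNU awks, and it assumes the words contain only letters, digits and apostrophes, since the first character is used as a regex. The function name exactly3_awk is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: prints the input lines in which some character
# occurs exactly 3 times.
exactly3_awk() {
    awk '{
        a = $1
        # Stop once fewer than 3 characters remain: no triple is possible.
        while (length(a) >= 3) {
            c = substr(a, 1, 1)       # first remaining character
            n = gsub(c, "", a)        # delete it everywhere, count deletions
            if (n == 3) { print $0; next }
        }
    }'
}

printf '%s\n' banana apple independence | exactly3_awk
```

For a word like apple the loop ends as soon as only "le" is left, instead of stripping the remaining characters one by one.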
Assuming that you want to count A as equivalent to a, first convert your file to lowercase with:
    tr '[:upper:]' '[:lower:]' < /usr/share/dict/words > words
The two solutions are similar but not equivalent. They differ on words like independence from the dictionary file generated above.
Yes, independence contains 3 n's but 4 e's. Depending on which is found first, the word may be included or not. The awk solution is stable and will include words in which any character is repeated exactly 3 times. A regex solution is more slippery and will match in some conditions and not in others.
Additionally, the regex matches only word characters, which do not include ' (and the file contains several words with that character).
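To make the difference concrete, this sketch runs both approaches on three words: independence (3 n's, 4 e's), press's (3 s's, with the apostrophe sitting between two of them) and banana. The word press's is just an example chosen for illustration, the helper names are hypothetical, and GNU grep with -P is assumed. The awk side uses POSIX gsub rather than gensub so that it does not depend on gawk.

```shell
#!/bin/sh
# Hypothetical wrappers around the two approaches; both read words on stdin.
by_grep() {
    grep -P '(\w)(((?!\1)\w)*\1){2}' | grep -Pv '(\w)(((?!\1)\w)*\1){3}'
}
by_awk() {
    awk '{
        a = $1
        while (length(a)) {
            # gsub returns how many copies of the first character it deleted
            if (gsub(substr(a, 1, 1), "", a) == 3) { print $0; next }
        }
    }'
}

for w in independence "press's" banana; do
    g=no; a=no
    printf '%s\n' "$w" | by_grep >/dev/null && g=yes
    printf '%s\n' "$w" | by_awk | grep -q . && a=yes
    printf '%-14s grep=%s awk=%s\n' "$w" "$g" "$a"
done
```

With these inputs only banana is kept by both; independence and press's are kept by the awk version alone, the first because of its 4 e's and the second because ' is not matched by \w.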
In all, the number of lines matched is (1527 more with awk):
13758 awklist
12231 greplist
And, removing the ' (184 more with awk):
9236 awklist2
9052 greplist2
Should tastelessness teleconferencing teletypewriter teletypewriters tempestuousness timelessness tintinnabulation tintinnabulations tirelessness transcontinental transgressors transubstantiation (just to list a few) be rejected?
All do have exactly 3 of one character and four (or more) of another.
tattooists (4 a's and 4 t's, but no letter 3 times) but not zoologist (3 s's). Backreferences are NOT the way to go about it. – Sep 02 '19 at 04:13
tattooists, no letter exactly 3 times as required in the OP. – Sep 02 '19 at 04:25
grep -viP) will match 3 or more of the same alpha char. as has been mentioned multiple times. The version with grep -viP excludes any 4-or-more matches. – cas Sep 02 '19 at 06:41
recklessness has exactly 3 e's and your final code doesn't match it. Read you own "finally, to match only words with exactly 3 of the same ...". Do you need some code the check your regexps against? – Sep 02 '19 at 07:52
recklessness also has 4 s chars in it, so it is excluded. The OP hasn't specified how that conflict is to be resolved, or even if it IS a conflict, so I'm going with "can't have more than 3 of the same character". If you have a different interpretation, feel free to write your own answer. – cas Sep 02 '19 at 08:46
banana should be excluded too, because it has 3 a's, but only 2 n's. And puppy too, because it has only 1 u and 1 y. – Sep 02 '19 at 10:17
gewürztraminer (3 r's) or o'keeffe (with 3 e's). I am not sure why .... – Sep 02 '19 at 22:53
tastelessness teleconferencing teletypewriter teletypewriters tempestuousness timelessness tintinnabulation tintinnabulations tirelessness transcontinental transgressors transubstantiation (just to list a few) be rejected? All have exactly 3 of one character while also having 4 of some other character. – Sep 02 '19 at 23:00
' triggers \b. The other words you listed - as already mentioned more than 3 of same character = exclude, even if there's exactly 3 of another. – cas Sep 02 '19 at 23:42
s characters. – cas Sep 02 '19 at 23:46