
I have a mixed wordlist as an input:

azert12345
a1z2e3r4t5
a1z2e3r455

The command line I have tried to execute:

cat file.txt | grep -E "[[:digit:]]{5}" --color

What I want to accomplish:

Print only these words: "azert12345" and "a1z2e3r4t5", using grep with a pattern like the one above, something like grep -E "[[:digit:]]{5}".

It is easy to print words like "azert12345" using grep -E "[[:alpha:]]{5}[[:digit:]]{5}", which matches 5 alphabetic characters followed by 5 digits, but the problem is: how am I going to print the mixed ones, like a1z2e3r4t5?
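To illustrate why the first attempt misses the mixed words, here is a minimal sketch using only the three sample words from the question (the variable names are illustrative). The interval {5} applies to the bracket expression immediately before it, so it asks for five digits in a row rather than five digits anywhere in the line.

```shell
# Sample words from the question
words='azert12345
a1z2e3r4t5
a1z2e3r455'

# {5} demands five consecutive digits, so only the first word matches
consecutive=$(printf '%s\n' "$words" | grep -E '[[:digit:]]{5}')
printf '%s\n' "$consecutive"
```

Only azert12345 is printed: a1z2e3r4t5 never has two digits next to each other, and a1z2e3r455 has at most three.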

The "a1z2e3r4t5" is just an example; the amount of data I have to deal with is much bigger.

This problem has been driving me crazy for 3 days, and it is not homework. I'll go back to learning more about Linux commands. I need some help.

2 Answers


IMHO this would be simpler in awk or perl, for the reasons outlined here: grep with logic operators (in particular, that there is no natural AND operator in grep). For example

awk 'gsub(/[a-z]/,"&") == 5 && gsub(/[0-9]/,"&") == 5' file

or

perl -ne 'print if tr/a-z// == 5 && tr/0-9// == 5' file

will print lines containing exactly 5 of each of the character sets.
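As a rough illustration of how the counting works, the sketch below prints each sample word together with the two numbers that the awk condition compares. It relies on gsub returning the number of substitutions it made; replacing every match with "&" (the matched text itself) leaves the line unchanged. The variable names are illustrative, not part of the answer.

```shell
# Sample words from the question
words='azert12345
a1z2e3r4t5
a1z2e3r455'

# Print each word followed by its letter count and its digit count
counts=$(printf '%s\n' "$words" |
  awk '{ l = gsub(/[a-z]/, "&"); d = gsub(/[0-9]/, "&"); print $0, l, d }')
printf '%s\n' "$counts"
```

This prints 5 5 for the first two words and 4 6 for a1z2e3r455, which is why only the first two pass the == 5 tests.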


If you insist on grep, then something like this might work:

grep -xE '([^a-z]*[a-z][^a-z]*){5}' file | grep -xE '([^0-9]*[0-9][^0-9]*){5}'
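One way to read the pipeline above is to split each pattern into a repeated unit. The sketch below does that with two illustrative variables, unit_letter and unit_digit, which are not part of the original command. Each unit contains exactly one character of the wanted class, padded on both sides by any number of characters outside that class. Because -x forces the pattern to cover the whole line, five units can only fit a line that holds exactly five such characters.

```shell
# Sample words from the question
words='azert12345
a1z2e3r4t5
a1z2e3r455'

# One unit = any non-letters, exactly one letter, any non-letters
unit_letter='[^a-z]*[a-z][^a-z]*'
# Same idea for digits
unit_digit='[^0-9]*[0-9][^0-9]*'

# First grep keeps lines with exactly five letters;
# second grep keeps those that also have exactly five digits (the AND)
matches=$(printf '%s\n' "$words" |
  grep -xE "($unit_letter){5}" |
  grep -xE "($unit_digit){5}")
printf '%s\n' "$matches"
```

For a1z2e3r4t5, the letter units are a1, z2, e3, r4 and t5. For a1z2e3r455 there are only four letters, so five units cannot cover the line and the first grep drops it.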
steeldriver
  • 81,074

These are probably not the right tools for the job, but at least as an alternative:

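while IFS= read -r i; do
  # foo: characters left after deleting letters; bar: characters left after deleting digits
  foo=$(printf '%s' "$i" | sed 's/[a-z]//g' | wc -c) && bar=$(printf '%s' "$i" | sed 's/[0-9]//g' | wc -c)
  [[ $foo -eq 5 && $bar -eq 5 ]] && echo "$i  has five digits and five alphas"
done < file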

Delete the alphas; what remains are the digits, so count them. To be thorough, delete the digits; what remains are the alphas, so count them. Save each result in a variable:

foo=$(printf '%s' "$i" | sed 's/[a-z]//g' | wc -c) && bar=$(printf '%s' "$i" | sed 's/[0-9]//g' | wc -c)

If both counts equal 5, then the string has five digits and five alphas:

[[ $foo -eq 5 && $bar -eq 5 ]] && echo "$i  has five digits and five alphas" 

Output:

azert12345  has five digits and five alphas
a1z2e3r4t5  has five digits and five alphas

Is this logic faulty?
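One possible gap, sketched under the assumption that the input may contain characters other than lowercase letters and digits: foo and bar count what is left after a deletion, so any other character is counted in both. The word below is a hypothetical test case, not from the question, and the tr -cd variant is just one alternative way to count the class itself.

```shell
# Hypothetical test word (not from the question): 4 letters, 4 digits, 1 dash
word='a1b2c3d4-'

# Same counting as the loop above: what is left after deleting a class
foo=$(printf '%s' "$word" | sed 's/[a-z]//g' | wc -c)   # non-letters
bar=$(printf '%s' "$word" | sed 's/[0-9]//g' | wc -c)   # non-digits

# Counting the class itself instead: tr -cd deletes everything else
letters=$(printf '%s' "$word" | tr -cd 'a-z' | wc -c)
digits=$(printf '%s' "$word" | tr -cd '0-9' | wc -c)

# $(( )) strips any padding that some wc implementations add
echo "loop counts: $((foo)) $((bar)); direct counts: $((letters)) $((digits))"
```

With GNU sed this reports loop counts of 5 and 5 but direct counts of 4 and 4, so the loop would accept this word. For input made up only of lowercase letters and digits, as in the question, both approaches agree.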

  • Every time I try to understand the meaning of this one, ([^a-z]*[a-z][^a-z]*), I feel like my brain wants to explode – user11535592 Aug 24 '19 at 17:45
  • @user11535592 Haha, I'm struggling with it right now. Ask steeldriver for clarification, if he's kind enough to provide one. – schrodingerscatcuriosity Aug 24 '19 at 17:51
  • He helped me a lot by giving me a fish. I'm glad he did and I appreciate that, but he didn't teach us how he did it. I think we all have to make that engine work, as always. I hope you find your answers too, and good luck. – user11535592 Aug 24 '19 at 17:54