Take the pattern

[UGLER]*

Can the string UUG match against it? I mean to say, is repetition allowed?

2 Answers

4

Yes, repetition is allowed: a bracket expression matches one character from the set each time it is applied, and * repeats it, so [UGLER]* means zero or more consecutive characters, each of which is U, G, L, E or R. The details may depend on the regex flavor you are using, but at the very least BRE, ERE and PCRE will all match that string.

You can test this for different regex types easily enough:

  • BRE

    $ echo UUG | grep '[UGLER]*'
    UUG
    
  • ERE

    $ echo UUG | grep -E '[UGLER]*'
    UUG
    
  • PCRE

    $ echo UUG | grep -P '[UGLER]*'
    UUG
    

Of course, since you are looking for zero or more, it will also match things you might not be expecting:

    $ echo "foobar" | grep '[UGLER]*'
    foobar
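
The reason foobar is printed is that grep looks for a match anywhere in the line, and the empty string satisfies [UGLER]*. Here is a minimal sketch, assuming a POSIX grep and made-up sample input, that contrasts the unanchored search with -x, which only accepts a line if the whole line matches:

```shell
# Sample input, made up for illustration: one line built only from
# the set characters, and one line that contains none of them.
input=$(printf '%s\n' UUG foobar)

# Unanchored: both lines count as matches, because an empty match is enough.
unanchored=$(printf '%s\n' "$input" | grep -c '[UGLER]*')

# -x requires the entire line to match the pattern, so only UUG is printed.
whole_line=$(printf '%s\n' "$input" | grep -x '[UGLER]*')

echo "unanchored matching lines: $unanchored"
echo "whole-line matches: $whole_line"
```

Note that with -x an empty line would still match, since zero repetitions are allowed.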

If the regex flavor you are using supports it, use + instead of * to require at least one character from the set. For example, with PCRE:

    $ echo -e "UUG\nfoobar" | grep -P '[UGLER]*'
    UUG
    foobar
    $ echo -e "UUG\nfoobar" | grep -P '[UGLER]+'
    UUG
jordanm
terdon
  • all regexp flavours support + - but some require you to write it as \+ – cas Sep 18 '13 at 03:19
  • @CraigSanders OK, thanks. I didn't want to generalize because for all I know there is some kind of obscure regex engine from the 60s that only works in lisp machines and has its own weird syntax ;). – terdon Sep 18 '13 at 03:20
  • @CraigSanders, no, standard BRE don't support \+, that's a GNUism. See there for more details. – Stéphane Chazelas Sep 18 '13 at 06:08
  • Stephane is (of course) right. For portability, instead of [UGLER]+, use [UGLER][UGLER]* (i.e., one occurrence, followed by 0, 1 or many occurrences) – Olivier Dulac Sep 18 '13 at 08:46
  • @stephane - thanks. it's been so long since i used a non-GNU sed or grep that i'd completely forgotten that + was non-standard. I usually install GNU tools under /usr/local/ within hours or minutes of using non-GNU systems like *bsd or solaris, just to keep myself sane. – cas Sep 18 '13 at 12:03
  • @OlivierDulac, note that grep -E '[UGLER]+' and grep '[UGLER]\{1,\}' are standard, it's just grep '[UGLER]\+' that isn't. – Stéphane Chazelas Sep 18 '13 at 12:39
  • @StephaneChazelas: Thanks! I now wonder if I will not alias "grep" to "grep -E" ^^ I'll go read about the potential side effects... [iirc, changing interpretation of some things, but I need to know exactly which ones] – Olivier Dulac Sep 18 '13 at 14:33
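
The portable spellings of "one or more" mentioned in the comments above can be compared side by side. This is a sketch assuming a POSIX grep; the sample input is made up for illustration:

```shell
# Sample input, made up for illustration.
input=$(printf '%s\n' UUG foobar)

# POSIX BRE: spell "one or more" as the class followed by class-star.
bre_repeat=$(printf '%s\n' "$input" | grep '[UGLER][UGLER]*')

# POSIX BRE interval expression: at least one occurrence.
bre_interval=$(printf '%s\n' "$input" | grep '[UGLER]\{1,\}')

# POSIX ERE: + is a standard operator here.
ere_plus=$(printf '%s\n' "$input" | grep -E '[UGLER]+')

# Each of the three variables should hold only the line UUG.
printf '%s\n' "$bre_repeat" "$bre_interval" "$ere_plus"
```

All three forms avoid the \+ spelling, which the comments identify as a GNU extension to BRE.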
1

Assuming that your pattern is a fileglob pattern and not a regexp, then yes it will match a filename called 'UUG'. The pattern will match any file starting with U, G, L, E, or R.

You can test this yourself with:

touch UUG
ls -l [UGLER]*
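
In a glob, * stands for any string rather than a repetition of the preceding item, so [UGLER]* reads as "one character from the set, then anything". A small sketch that checks this without creating files, using the shell's case statement (which applies the same pattern matching rules); the helper name matches_glob is hypothetical:

```shell
# Hypothetical helper: report whether a name matches the glob [UGLER]*.
matches_glob() {
    case $1 in
        [UGLER]*) echo yes ;;
        *)        echo no ;;
    esac
}

matches_glob UUG      # first character U is in the set
matches_glob Ufoo     # first character U is in the set; the rest can be anything
matches_glob foobar   # first character f is not in the set
```

The second call is the one that would behave differently under a whole-string regexp match, since foo is not made of characters from the set.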

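If the pattern is a regexp, then it will match ANY string, because you are matching against zero-or-more instances of [UGLER], and zero instances is always a match. If you want to match 1-or-more rather than zero-or-more, then use + instead of * (in ERE or PCRE; in standard BRE, write [UGLER][UGLER]* or [UGLER]\{1,\}).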

cas