6

I'm confused as to why this does not match:

expr match Unauthenticated123 '^(Unauthenticated|Authenticated).*'

it outputs 0.

stracktracer
  • 163
  • 1
  • 1
  • 3

3 Answers3

7

Your command should be:

expr match Unauthenticated123 'Unauthenticated\|Authenticated'

If you want the number of characters matched.

To have the part of the string (Unauthenticated) returned use:

expr match Unauthenticated123 '\(Unauthenticated\|Authenticated\)'

From info coreutils 'expr invocation':

`STRING : REGEX'
     Perform pattern matching.  The arguments are converted to strings
     and the second is considered to be a (basic, a la GNU `grep')
     regular expression, with a `^' implicitly prepended.  The first
     argument is then matched against this regular expression.
 If the match succeeds and REGEX uses `\(' and `\)', the `:'
 expression returns the part of STRING that matched the
 subexpression; otherwise, it returns the number of characters
 matched.

 If the match fails, the `:' operator returns the null string if
 `\(' and `\)' are used in REGEX, otherwise 0.

 Only the first `\( ... \)' pair is relevant to the return value;
 additional pairs are meaningful only for grouping the regular
 expression operators.

 In the regular expression, `\+', `\?', and `\|' are operators
 which respectively match one or more, zero or one, or separate
 alternatives.  SunOS and other `expr''s treat these as regular
 characters.  (POSIX allows either behavior.)  *Note Regular
 Expression Library: (regex)Top, for details of regular expression
 syntax.  Some examples are in *note Examples of expr::.

outis
  • 113
Lambert
  • 12,680
5

Note that both match and \| are GNU extensions (and the behaviour for : (the match standard equivalent) when the pattern starts with ^ varies with implementations). Standardly, you'd do:

expr " $string" : " Authenticated" '|' " $string" : " Unauthenticated"

The leading space is to avoid problems with values of $string that start with - or are expr operators, but that means it adds one to the number of characters being matched.

With GNU expr, you'd write it:

expr + "$string" : 'Authenticated\|Unauthenticated'

The + forces $string to be taken as a string even if it happens to be a expr operator. expr regular expressions are basic regular expressions which don't have an alternation operator (and where | is not special). The GNU implementation has it as \| though as an extension.

If all you want is to check whether $string starts with Authenticated or Unauthenticated, you'd better use:

case $string in
  (Authenticated* | Unauthenticated*) do-something
esac
2

$ expr match "Unauthenticated123" '^\(Unauthenticated\|Authenticated\).*' you have to escape with \ the parenthesis and the pipe.

netmonk
  • 1,870
  • 1
    and the ^ may not mean what some would think depending on the expr. it is implied anyway. – mikeserv Dec 14 '15 at 14:18
  • 1
    @mikeserv, match and \| are GNU extensions anyway. This Q&A seems to be about GNU expr anyway (where ^ is guaranteed to mean match at the beginning of the string). – Stéphane Chazelas Dec 14 '15 at 14:34
  • @StéphaneChazelas - i didn't know they were strictly GNU. i think i remember them being explicitly officially unspecified - but i don't use expr too often anyway and didn't know that. thank you. – mikeserv Dec 14 '15 at 14:49
  • 1
    It's not "strictly GNU" - it's present in a number of historical implementations (even System V had it, undocumented, though it didn't have the others like substr/length/index), which is why it's explicitly unspecified. I can't find anything about \| being an extension. – Random832 Dec 14 '15 at 16:13