#!/bin/sh

re="\/$"

if [ $1 =~ $re ]; then
        echo "${ATTENTION_PREFIX}$1 DIRECTORY MAY NOT CONTAIN A \"/\" OR LITERAL SLASH!${ATTENTION_POSTFIX}"
        exit 1
fi

Executing ./file.sh hello/ results in [: 29: hello: unexpected operator

It looks like this regular expression method is incorrect for shell scripting.

030

3 Answers


The standard test command, also known as [, doesn't have a =~ operator. Most shells have that command built in nowadays.

The Korn shell introduced a [[...]] construct (not a [[ command) with alternate syntax and different parsing rules.

zsh and bash copied that to some extent, with restrictions and many differences, but it was never standardized, so it shouldn't be used in portable sh scripts.

ksh93 always had a way to convert an extended regexp to its globs with:

pattern=${ printf %P "regexp"; }

And you could then do:

[[ $var = $pattern ]]

Later (in ksh93l in 2001) it also incorporated regular expressions in its globs like with the ~(E)regex syntax for extended regular expressions, so you can do:

[[ $var = ~(E)regex ]]

That kind of pattern matching only works with the [[...]] construct or case, not the [ command.

zsh added a regexp matching operator for both its [ command and [[...]] first in 2001 with a pcre module. The syntax was initially [ string -pcre-match regex ] or [[ string -pcre-match regex ]].

bash added a =~ operator in bash 3.0 (in 2004), using extended regular expressions. ksh93 and zsh added it shortly afterwards as well (again with differences).

ksh93 and bash-3.2 and above (when the compat31 option is not enabled) use quoting to escape regexp operators, which causes all sorts of confusion (and is very buggy in ksh93) and means =~ can't be used the same way with the [ command there. zsh doesn't have that problem (quotes are used for shell quoting, and backslash to escape regex operators as usual), so the =~ operator works in zsh's [ command (though the operator itself needs to be quoted, since =foo is a filename expansion operator in zsh).

yash doesn't have [[...]] but its [ command has a =~ operator (using EREs) and works as you'd expect (like zsh's).

(2023 edit) Support for [[...]] was added in yash in 2.49 (2018) and =~ in there works like bash's with regards to quoting.

In any case, neither [[...]] nor =~ are POSIX operators and should not be used in sh scripts. The standard command to do regular expression matching on strings is expr:

if expr "x$var" : "x$regex" > /dev/null; then...

Note that expr regexps are anchored at the start, and you need that x trick to avoid problems with $var values that are expr operators. expr uses basic regexp, not extended regexps.
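As a rough sketch of how the expr approach could be applied to the trailing-slash check from the question (the function name ends_with_slash is made up for illustration and is not from the original post):

```sh
#!/bin/sh
# ends_with_slash is a hypothetical helper name, not part of the question.
# It returns 0 (true) when its argument ends with "/".
ends_with_slash() {
    # expr patterns are basic regexps anchored at the start, so ".*" is
    # needed before the "/", and "$" anchors the match at the end.
    # The "x" prefix on both sides guards against values that look like
    # expr operators (for example "=" or "(").
    expr "x$1" : 'x.*/$' > /dev/null
}

for dir in hello/ hello /; do
    if ends_with_slash "$dir"; then
        echo "$dir: ends with a slash"
    else
        echo "$dir: ok"
    fi
done
```

The output of expr (the number of matched characters) is discarded; only its exit status is used.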

Most of the time, however, you don't need regexps, as simple shell pattern matching is enough:

case $var in
  (pattern) echo matches
esac
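For the check in the question, a shell pattern could look something like the following. This is only a sketch; check_dir is a hypothetical function name, and the messages are placeholders rather than the asker's original text:

```sh
#!/bin/sh
# check_dir is a hypothetical helper used for illustration only.
# It returns 1 when the argument ends with "/", 0 otherwise.
check_dir() {
    case $1 in
        # "*/" is a shell pattern (glob), not a regexp:
        # any string, followed by a literal slash at the end.
        (*/) echo "$1: trailing slash not allowed"; return 1 ;;
        (*)  echo "$1: ok" ;;
    esac
}

check_dir hello/ || echo "rejected"
check_dir hello
```

Because case patterns must match the whole string, no explicit anchoring is needed here.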
  • This works in a shell script and this answers the question. sudo sh test.sh w test.sh: (\w) echo matches results in matches Thank you. – 030 Jul 04 '14 at 11:49
  • Oh, but when I type sudo sh test.sh h then there is no match. The expectation is that (\w) should match all word characters. – 030 Jul 04 '14 at 11:54
  • @utrecht, why would you have such an expectation? \w is not mentioned in the sh man page or the POSIX spec for the standard shells. To check for alnum or underscore (which \w means in some regular expression syntaxes), it's case $var in ([[:alnum:]_]) ... – Stéphane Chazelas Jul 04 '14 at 12:00
  • Thank you. ([a-z]) works to match e.g. a and [0-9] works for single digits. – 030 Jul 04 '14 at 12:08
  • @utrecht, [a-z] only makes sense in the C locale. If you intend to match any lowercase letter, that should be [[:lower:]]. If you mean only the US ASCII characters from a to z and not other ones like those in my first name, you may use either [[:lower:]] or [a-z], but only after having fixed the locale to C (LC_ALL=C). In other locales (even in the US, nowadays, the locale is generally not C but UTF-8 based), [a-z] may match B, may match é but not ... – Stéphane Chazelas Jul 04 '14 at 12:15
  • Thank you for the explanation. I have tested [[:lower:]] and it works and I will use this from now on instead of [a-z] – 030 Jul 04 '14 at 13:31
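
The character classes discussed in the comments above can be tried out with a small sketch like this one. The function names is_word_char and is_lower_letter are hypothetical and chosen only for this example:

```sh
#!/bin/sh
# Hypothetical helpers for illustration; neither name is from the thread.

# True when the argument is exactly one alphanumeric character or "_"
# (roughly what \w means in some regular expression syntaxes).
is_word_char() {
    case $1 in
        ([[:alnum:]_]) return 0 ;;
        (*) return 1 ;;
    esac
}

# True when the argument is exactly one lowercase letter in the
# current locale.
is_lower_letter() {
    case $1 in
        ([[:lower:]]) return 0 ;;
        (*) return 1 ;;
    esac
}

for c in h _ 7 - H; do
    if is_word_char "$c"; then
        echo "$c: word character"
    else
        echo "$c: not a word character"
    fi
done
```

Note that each pattern matches a single character only; a longer string such as ab does not match ([[:alnum:]_]).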

Change #!/bin/sh to #!/bin/bash, and use double brackets instead:

if [[ $1 =~ $re ]]; then

This is the extended test command, as opposed to the (regular) test command. =~ can only be used with the [[ ... ]] version, and requires Bash 3.0 or later.

Josh Jolly

In bash, the old test command [ does not support regular expressions. You must use the new test construct [[ instead:

re="\/$"

if [[ $1 =~ $re ]]; then
        echo "${ATTENTION_PREFIX}$1 DIRECTORY MAY NOT CONTAIN A \"/\" OR LITERAL SLASH!${ATTENTION_POSTFIX}"
        exit 1
fi

You can see more here.

You'll need to change your #!/bin/sh shebang line to #!/bin/bash, as well.

cuonglm