#!/bin/bash
INT=-5
if [[ "$INT" =~ ^-?[0-9]+$ ]]; then
    echo "INT is an integer."
else
    echo "INT is not an integer." >&2
    exit 1
fi
What does the leading ~ do in the starting regular expression?
The ~ is actually part of the operator =~, which matches the string on its left against the extended regular expression on its right.
[[ "string" =~ pattern ]]
Note that the string should be quoted, and the regular expression shouldn't be quoted (unless you want to match literal strings).
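The effect of quoting the pattern can be seen in a minimal sketch like the one below. It assumes bash 3.2 or later, where a quoted part of the pattern is matched as a literal string; the variable names (other, pat, unquoted, quoted, from_var) are made up for the illustration.

```shell
#!/bin/bash
# Sketch: how quoting the right-hand side of =~ changes the match.
# Assumes bash 3.2 or later; all variable names are illustrative only.
other='abc'

# Unquoted: the dot is a regex metacharacter and matches any character.
[[ $other =~ ^a.c$ ]] && unquoted=match || unquoted=nomatch

# Quoted: the dot is matched literally, so "abc" no longer matches.
[[ $other =~ ^"a.c"$ ]] && quoted=match || quoted=nomatch

# Storing the pattern in a variable and expanding it unquoted keeps
# shell quoting and regex syntax apart.
pat='^a.c$'
[[ $other =~ $pat ]] && from_var=match || from_var=nomatch

printf '%s %s %s\n' "$unquoted" "$quoted" "$from_var"
```

Keeping the pattern in a variable, as in the last test, is usually the least surprising way to write anything more complex than a trivial expression.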
A similar operator is used in the Perl programming language and several other general-purpose and domain-specific languages to perform regular expression matching.
The regular expressions understood by bash are the same as those that GNU grep understands with the -E flag, i.e. the extended set of regular expressions.
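As a rough illustration of what "extended" means here, the sketch below tries a few standard ERE constructs with =~ and then hands one of the same patterns to grep -E. The helper is_match is a hypothetical name introduced only for this example.

```shell
#!/bin/bash
# Sketch: a few POSIX extended regular expression constructs used with =~.
# is_match is a hypothetical helper, not part of bash.
is_match() {   # usage: is_match STRING PATTERN
    local pat=$2
    [[ $1 =~ $pat ]]
}

is_match 'colour' '^colou?r$'      && echo 'optional character (?)'
is_match 'foobar' '^(foo|baz)bar$' && echo 'alternation (|) with grouping'
is_match 'aaa'    '^a{2,4}$'       && echo 'interval expression ({m,n})'

# The same kind of pattern handed to grep -E.
printf '%s\n' colour | grep -Eq '^colou?r$' && echo 'grep -E agrees'
```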
Somewhat off-topic, but good to know:
When matching against a regular expression containing capturing groups, the part of the string captured by each group is available in the BASH_REMATCH array. The entry at index 0 corresponds to & in the replacement pattern of sed's substitution command (or $& in Perl), which is the bit of the string that matches the whole pattern, while the entries at index 1 and onwards correspond to \1, \2, etc. in a sed replacement pattern (or $1, $2 etc. in Perl), i.e. the bits matched by each parenthesized group.
Example:
string=$( date +%T )
if [[ "$string" =~ ^([0-9][0-9]):([0-9][0-9]):([0-9][0-9])$ ]]; then
    printf 'Got %s, %s and %s\n' \
        "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
fi
This may output
Got 09, 19 and 14
if the current time happens to be 09:19:14.
The REMATCH bit of the BASH_REMATCH array name comes from "Regular Expression Match", i.e. "RE-Match".
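To make the difference between index 0 and the group indices concrete, here is a small sketch using an unanchored pattern on a made-up string. The string, the pattern, and the variable names are all invented for the illustration.

```shell
#!/bin/bash
# Sketch: BASH_REMATCH[0] holds the whole match, [1] and up hold the groups.
# The sample string and the variable names are illustrative only.
string='order id: 4711-b'
pat='([0-9]+)-([a-z])'

if [[ $string =~ $pat ]]; then
    whole=${BASH_REMATCH[0]}    # text matched by the entire pattern
    first=${BASH_REMATCH[1]}    # text matched by the first group
    second=${BASH_REMATCH[2]}   # text matched by the second group
    printf 'whole=%s first=%s second=%s\n' "$whole" "$first" "$second"
fi
```

This should print whole=4711-b first=4711 second=b, since the pattern is not anchored and only the matching portion ends up at index 0.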
In non-bash Bourne-like shells, one may also use expr for limited regular expression matching (using only basic regular expressions).
A small example:
$ string="hello 123 world"
$ expr "$string" : ".*[^0-9]\([0-9][0-9]*\)"
123
You should read the bash man pages, under the [[ expression ]] section.
An additional binary operator, =~, is available, with the same precedence as == and !=. When it is used, the string to the right of the operator is considered an extended regular expression and matched accordingly (as in regex(3)).
Long story short, =~ is an operator, just like == and !=. It has nothing to do with the actual regex in the string to its right.
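Like == and !=, the =~ operator simply yields an exit status that if, && and || can act on. The sketch below records that status for the pattern from the question; match_status is a hypothetical helper name. According to the bash manual, the status is 0 for a match, 1 for no match, and 2 when the regular expression is syntactically incorrect.

```shell
#!/bin/bash
# Sketch: =~ yields an exit status like any other [[ ]] test.
# match_status is a hypothetical helper introduced for this example.
match_status() {   # usage: match_status STRING PATTERN
    local pat=$2 status=0
    [[ $1 =~ $pat ]] || status=$?
    printf '%s\n' "$status"
}

match_status -5   '^-?[0-9]+$'   # match: prints 0
match_status five '^-?[0-9]+$'   # no match: prints 1
match_status x    '('            # invalid regex: the bash manual documents status 2
```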
=~ in real life...? – George Vasiliou Jan 27 '17 at 06:44
man [[ expression ]] and man [[ return nothing. help [[ returns useful information, since [[ is an internal bash command, but does not say whether =~ uses basic or extended regex syntax. ⋯ The text you quoted is from the bash man page. I realize you said “read the bash man pages” but at first, I thought you meant read the man pages within bash. At any rate, man bash returns a huge file, which is 4139 lines (72 pages) long. It can be searched by pressing /▒▒▒, which takes a regex, the flavor of which (like that of =~) is not specified. – Alex Quinn Jul 05 '19 at 20:10
grep -E understands only on GNU systems and only when using an unquoted variable as the pattern [[ $var = $pattern ]] (see [[ 'a b' =~ a\sb ]] vs p='a\sb'; [[ 'a b' =~ $p ]]). Also beware that shell quoting affects the meaning of RE operators and that some characters need to be quoted for the shell tokenising that may affect the RE processing. [[ '\' =~ [\/] ]] returns false. ksh93 has even worse issues. See zsh (or bash 3.1) for a saner approach where shell and RE quoting are clearly separate. The [ builtin of zsh and yash also have a =~ operator. – Stéphane Chazelas Jan 27 '17 at 10:06
[[ "This is a fine mess." =~ T.........fin*es* ]]; [[ "This is a fine mess." =~ T.........fin\*es\* ]]. Or that a quoted * also match? [[ "This is a fine mess." =~ "T.........fin*es*" ]]. – Jan 30 '17 at 00:28
[[ a =~ .* ]] or [[ a =~ '.*' ]] or [[ a =~ \.\* ]], the same .* RE is passed to the =~ operator. OTOH, in bash, [[ '\' =~ [)] ]] returns an error, would you know without trying it whether [[ '\' =~ [\)] ]] matches? How about [[ '\' =~ [\/] ]] (it does in ksh93). How about c='a-z'; [[ a =~ ["$c"] ]] (compare with the = operator)? See also: [[ '\' =~ [^]"."] ]] which returns false... Note that you can do shopt -s compat31 in bash to get the zsh behaviour. – Stéphane Chazelas Jan 30 '17 at 07:41
zsh/bash -o compat31's behaviour for [[ a =~ '.*' ]] is also consistent with [ a '=~' '.*' ] (for [ implementations that support =~) or expr a : '.*'. OTOH, it's not consistent with [[ a = '*' ]] vs [[ a = * ]] (but then, globs are part of the shell language, while REs are not). – Stéphane Chazelas Jan 30 '17 at 08:00
pat="..."; if [[ "$string" =~ $pat ]]; then .... (@StéphaneChazelas's topmost comment suggested it, I'm just emphasizing it.) – dubiousjim Sep 30 '17 at 22:43