To verify that $ch is any one of the ASCII digits 0, 1, 2, 3, 4, or 5, use:
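A minimal sketch of such a check, assuming it is done with case and an explicit list of the six digits rather than a range. The helper name is_digit_0_to_5 is hypothetical and only there for illustration:

```shell
# Succeeds (exit status 0) only when $1 is exactly one of the ASCII
# digits 0 to 5. is_digit_0_to_5 is a hypothetical helper name.
is_digit_0_to_5() {
  case $1 in
    ([012345]) return 0 ;;  # explicit list, not the [0-5] range
    (*)        return 1 ;;
  esac
}

# Try a few sample values, including a two-digit and an empty one.
for ch in 3 7 33 ''; do
  if is_digit_0_to_5 "$ch"; then
    printf '[%s] valid\n' "$ch"
  else
    printf '[%s] invalid\n' "$ch"
  fi
done
```

In bash, [[ $ch = [012345] ]] expresses the same test; the case form has the advantage of also working in plain sh.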
Do not use ranges such as [0-5]
for input validation as (depending on system and locale) that tends to include many other characters that happen to sort between 0 and 5 beside 012345 such as ٠١٢٣٤۰۱۲۳۴߀߁߂߃߄०१२३४০১২৩৪੦੧੨੩੪૦૧૨૩૪୦୧୨୩୪௦௧௨௩௪౦౧౨౩౪౸౹౺౻౼౽౾೦೧೨೩೪൦൧൨൩൪෦෧෨෩෪๐๑๒๓๔໐໑໒໓໔༠༡༢༣༤༪༫༬༭༳၀၁၂၃၄႐႑႒႓႔፩፪፫፬០១២៣៤៰៱៲៳៴᠐᠑᠒᠓᠔᥆᥇᥈᥉᥊᧐᧑᧒᧓᧔᧚᪀᪁᪂᪃᪄᪐᪑᪒᪓᪔᭐᭑᭒᭓᭔᮰᮱᮲᮳᮴᱀᱁᱂᱃᱄᱐᱑᱒᱓᱔⁰⁴₀₁₂₃₄⅐⅑⅒⅓⅔⅕⅖⅗⅘⅙⅛⅜⅟↉①②③④⑩⑪⑫⑬⑭⑮⑯⑰⑱⑲⑳⑴⑵⑶⑷⑽⑾⑿⒀⒁⒂⒃⒄⒅⒆⒇⒈⒉⒊⒋⒑⒒⒓⒔⒕⒖⒗⒘⒙⒚⒛⓪⓫⓬⓭⓮⓯⓰⓱⓲⓳⓴⓵⓶⓷⓸⓾⓿❶❷❸❹❿➀➁➂➃➉➊➋➌➍➓〇〡〢〣〤㉈㉉㉊㉋㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵㊶㊷㊸㊹㊺㊻㊼㊽㊾㋀㋁㋂㋃㋉㋊㋋㍘㍙㍚㍛㍜㍢㍣㍤㍥㍦㍧㍨㍩㍪㍫㍬㍭㍮㍯㍰㏠㏡㏢㏣㏩㏪㏫㏬㏭㏮㏯㏰㏱㏲㏳㏴㏵㏶㏷㏸㏹㏺㏻㏼㏽㏾꘠꘡꘢꘣꘤꣐꣑꣒꣓꣔꤀꤁꤂꤃꤄꧐꧑꧒꧓꧔꧰꧱꧲꧳꧴꩐꩑꩒꩓꩔꯰꯱꯲꯳꯴01234
You could also use a regex: [[ $ch =~ ^[012345]$ ]], but that has little advantage over using case or [[...]]'s =.
It could be useful to match any decimal integer representation of a number between 0 and 5, including -0, 0004 or +5, which you could do with:
[[ $ch =~ ^(-0+|\+?0*[012345])$ ]]
Which is slightly shorter than the Korn-style:
[[ $ch = @(-+(0)|?(+)*(0)[012345]) ]]
And likely easier to read by people familiar with regexps.
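As a rough illustration of what the regex above accepts and rejects, here is a small bash sketch. The function name is_int_0_to_5 and the sample values are assumptions made for the example:

```shell
#!/usr/bin/env bash
# is_int_0_to_5 is a hypothetical helper name used only for illustration.
is_int_0_to_5() {
  # Keep the regex in a variable so shell quoting cannot interfere with it.
  local re='^(-0+|\+?0*[012345])$'
  [[ $1 =~ $re ]]
}

# Sample inputs: signed zero, leading zeros, explicit plus, out-of-range
# values, a value with a leading blank, and the empty string.
for ch in -0 0004 +5 5 6 -1 15 ' 3' ''; do
  if is_int_0_to_5 "$ch"; then
    printf '%-6s accepted\n' "[$ch]"
  else
    printf '%-6s rejected\n' "[$ch]"
  fi
done
```

Because the pattern is anchored with ^ and $, values with surrounding blanks or extra digits are rejected rather than partially matched.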
Never use the arithmetic operators of the [[...]] construct (as in [[ $ch -ge 0 && $ch -le 5 ]]) nor ((...)) (as in (( ch >= 0 && ch <= 5 ))) for input validation, as those introduce arbitrary command execution vulnerabilities. [ "$ch" -ge 0 ] && [ "$ch" -le 5 ] doesn't have that problem in bash, but would output errors upon incorrect numbers and would allow blanks around the numbers.
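One way to stay clear of that problem is to let a pattern match decide, so that untrusted input never reaches arithmetic evaluation at all. The sketch below assumes bash; to_int_0_to_5 is a hypothetical helper that prints the validated value as a single digit:

```shell
#!/usr/bin/env bash
# to_int_0_to_5 is a hypothetical helper: it succeeds only if $1 is a decimal
# representation of an integer from 0 to 5, and prints that integer as one
# digit. The input is only ever matched as a string, never evaluated.
to_int_0_to_5() {
  local re='^(-0+|\+?0*([012345]))$'
  [[ $1 =~ $re ]] || return 1
  # Group 2 holds the last digit; it is empty for the -0 form, hence :-0.
  printf '%s\n' "${BASH_REMATCH[2]:-0}"
}

# The last sample is the kind of string that is dangerous in (( ... ));
# here it is single-quoted data and is simply rejected.
for ch in 0004 -0 +5 6 'a[$(id)]'; do
  if n=$(to_int_0_to_5 "$ch"); then
    printf '[%s] -> %s\n' "$ch" "$n"
  else
    printf '[%s] rejected\n' "$ch"
  fi
done
```

Once the value has been reduced to a single digit this way, using it in later arithmetic is no longer a concern.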
\b([0-5])\b is a perl regexp (the default at https://regex101.com) that matches any one of the 012345 characters preceded and followed by a word boundary, that is, provided it's neither preceded nor followed by a word character, word characters being alphanumeric ones and underscores. So for instance it would match in 123.5 because there's a 5 in there that is preceded by ., which is not a word character, and not followed by anything.
bash's =~ uses POSIX extended regular expressions, not perl regexps, and the behaviour for \b in POSIX ERE is unspecified.
As https://regex101.com doesn't currently offer POSIX ERE as a choice of regex flavour, you shouldn't use it to validate regexps used in bash's [[ =~ ]] operator.
There are systems in which the extended regular expression matcher used by bash supports \b as an extension over the standard, but in [[ $ch =~ \b[0-5]\b ]], bash treats \b as a quoted b, the same as if you had written [[ $ch =~ 'b'[0-5]'b' ]], and doesn't pass the backslash to the regex engine.
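The effect can be observed with a short bash sketch. The function name literal_b_demo is hypothetical, and the explicit [012345] list is used instead of the range so that the result does not depend on the locale:

```shell
#!/usr/bin/env bash
# literal_b_demo is a hypothetical name. The \b here is written directly in
# the [[ =~ ]] expression, so bash treats each one as a quoted letter b.
literal_b_demo() {
  [[ $1 =~ \b[012345]\b ]]
}

# A lone digit is not expected to match; a digit between two b's is.
for s in 5 b5b 'foo 5 bar'; do
  if literal_b_demo "$s"; then
    printf '[%s] matches\n' "$s"
  else
    printf '[%s] does not match\n' "$s"
  fi
done
```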
You can work around that by using:
regex='\b[012345]\b' # with the [0-5] also fixed to [012345]
[[ $ch =~ $regex ]]
Where the backslash will be passed to the regex matcher¹, but that will only work on systems that support that \b extension.
Doing it with standard ERE syntax would look like:
[[ $ch =~ (^|[^[:alnum:]_])[012345]([^[:alnum:]_]|$) ]]
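To see how that standard ERE behaves on a few inputs, here is a bash sketch. The function name has_isolated_digit_0_to_5 and the sample strings are assumptions for illustration:

```shell
#!/usr/bin/env bash
# has_isolated_digit_0_to_5 is a hypothetical name. It emulates \b with
# standard ERE: the digit must be at the start of the string or after a
# non-word character, and at the end or before a non-word character.
has_isolated_digit_0_to_5() {
  local re='(^|[^[:alnum:]_])[012345]([^[:alnum:]_]|$)'
  [[ $1 =~ $re ]]
}

for s in 5 123.5 'foo 5 bar' 15 a5 5_ 9; do
  if has_isolated_digit_0_to_5 "$s"; then
    printf '[%s] matches\n' "$s"
  else
    printf '[%s] does not match\n' "$s"
  fi
done
```

Like the \b version, this searches for an isolated digit anywhere in the string, so it accepts inputs such as foo 5 bar; it is not a whole-string validation.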
To use perl-style regexps, you could switch to zsh, which has a rematchpcre option to use PCRE (PCRE1 for now, PCRE2 in the next version) in its own =~ operator.
set -o rematchpcre
[[ $ch =~ '\b[0-5]\b' ]]
Would work there (and zsh doesn't have that misfeature of bash whereby shell quoting is treated as regexp escaping which also allows it to use other regexp engines).
zsh also has a glob operator to match ranges of decimal integer numbers, so there [[ $ch = <0-5> ]] would match on 000, 01, 3... And [[ $ch = (-<0-0>|(+|)<0-5>) ]] would do the same as [[ $ch =~ '^(-0+|\+?0*[012345])$' ]] (note the quotes around the regex as a difference with bash 3.2+).
¹ See "bash regexp matching fails in [[ ]]" and "How does storing the regular expression in a shell variable avoid problems with quoting characters that are special to the shell?" for details, and there for the history of builtin regex matching in Korn-like shells.
\b there, even if the bash regex flavor supported it, since that would consider something like foo 5 bar valid input. Instead, you want ^ and $ to make sure you are testing the entire string. – terdon Oct 08 '23 at 12:24