I am using the following pattern matching condition to detect an error when the variable hn is not a valid numeric integer.

! [[ "$hn" =~ ^[[:digit:]]+$ ]] && errcode=13

But I get

bash: /home/flora/int.sh: line 1184: syntax error near unexpected token `!'
bash: /home/flora/int.sh: line 1184: `if ! [[ "$hn" =~ ^[[:digit:]]+$ ]]; then'

Vera
  • In what context? What are you expecting it to do? What input do you give it? What actually happens when you try it? Please edit your question and give us some context so we can understand what you need. – terdon Nov 19 '21 at 12:45
  • I am getting a bash error when I do if ! [[ "$hn" =~ ^[[:digit:]]+$ ]]; then. – Vera Nov 19 '21 at 12:53
  • I want to detect if the user made a mistake and the value of hn was not set to a positive integer. – Vera Nov 19 '21 at 12:54
  • Do not use [[:digit:]] for input validation, the list of characters that it matches varies from system to system and locale to locale. Use [0123456789] (not [0-9] which is often even worse). – Stéphane Chazelas Nov 19 '21 at 13:05
  • We need to know: i) what you are trying to test; ii) what happens when you try it; iii) we need a full example we can run on our machines to test any answers; iv) if you get an error, tell us what the error is, don't just say you get an error! – terdon Nov 19 '21 at 13:09
  • @StéphaneChazelas If [0-9] and [[:digit:]] would match on other characters, would that not mean that the implementation is wrong? Do we really need all these caveats: this does not work, the other does not do what one expects? It seems to me that computer technology today lacks skill. – Vera Nov 19 '21 at 13:53
  • @Aardvark, I personally think [0-9] should only match on 0123456789 as that's what it used to do, and changing it broke users expectations. But there is a case to be made for it to also match on `` for instance as it often does because there's a case that can be made for that character to sort somewhere between 7 and 8 and there's a case for ranges to be based on collation. Same for [[:digit:]] where one might argue there's no reason for that to be limited to Arabic decimal digits (as typically used in some languages like English) and not other decimal digits. – Stéphane Chazelas Nov 19 '21 at 14:01
  • Your error is likely caused by some problem further up the script such as a not properly closed case statement. shellcheck might help – Stéphane Chazelas Nov 19 '21 at 14:05
  • Sure, but then such implementations need to have the ability to distinguish between different decimal systems, because the operation should be equivalent, independently of which system you are using. – Vera Nov 19 '21 at 14:06
  • @StéphaneChazelas Your intuition was correct, the case was not properly closed with ;;. – Vera Nov 19 '21 at 14:10
  • I would normally keep the test itself positive, and invert the following operator: like [[ "$hn" =~ ^[[:digit:]]+$ ]] || ... – Paul_Pedant Nov 19 '21 at 23:21

1 Answer

! [[ "$hn" =~ ^[[:digit:]]+$ ]] && errcode=13

is syntactically valid in bash. It's a cmd1 && cmd2 command list where cmd1 is ! pipeline (which runs the pipeline and negates its exit status) and cmd2 is a scalar variable assignment. The pipeline here is the [[ ... ]] special compound command used to evaluate conditional expressions. Inside it, there are 3 properly delimited tokens. The second one is the =~ operator used for extended regexp matching. The third one is taken literally as the regexp. That regexp matches a subject made up entirely of characters classified as digits in the current locale: the beginning of the subject, then one or more such characters, then the end of the subject.

bash would still consider it valid even in POSIX mode.

In POSIX sh, the behaviour is unspecified, because [[ is reserved for some unspecified behaviour. In POSIX sh implementations that don't have a [[ special keyword, that would still be syntactically valid (except maybe for the unquoted $ which again may make you enter unspecified territory) though you'd likely get an error about a [[ command not being found.

The fact that you get an unexpected token ! error for if ! [[ "$hn" =~ ^[[:digit:]]+$ ]]; then suggests that the code is found in a context where a command is not expected. That could be, for instance, because it appears in the middle of a case statement where a pattern or esac is expected instead: there, if would be accepted as the case pattern, but the next ! token would be unexpected as the parser would want a | or ) token instead.
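To make that scenario concrete, here is a minimal sketch of one way to end up there. The script text and the start) branch are made up for illustration, and this is not necessarily the exact mistake in the question's script. The case statement is never closed with esac, so the parser still wants a pattern when it reaches if. bash -n parses the text without running it:

```bash
#!/usr/bin/env bash
# Hypothetical reproduction: the case statement below is missing its esac,
# so the parser is still expecting a pattern when it reaches "if".
broken_script='
case "$1" in
  start) echo starting ;;
if ! [[ "$hn" =~ ^[0123456789]+$ ]]; then
  errcode=13
fi
'

# bash -n only parses its input; a non-zero exit status means a syntax error.
if printf '%s\n' "$broken_script" | bash -n 2>/dev/null; then
  syntax_ok=yes
else
  syntax_ok=no
fi
echo "syntax_ok=$syntax_ok"
```

Dropping the 2>/dev/null shows the parser's own message, which should point at the if line even though the real mistake is further up.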

Now back to negating a regexp match. Yes, ! [[ "$var" =~ $regexp ]] works as ! can negate any command (actually, pipeline) including that one. You can also use the ! conditional expression operator: [[ ! "$var" =~ $regexp ]].
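As a quick check that the two forms agree, here is a small sketch. The values given to var and regexp are only illustrative:

```bash
#!/usr/bin/env bash
regexp='^[0123456789]+$'   # illustrative pattern
var='12a'                  # illustrative value that should NOT match

# Form 1: negate the whole [[ ... ]] command with the shell's ! keyword.
if ! [[ "$var" =~ $regexp ]]; then outer=nomatch; else outer=match; fi

# Form 2: negate inside the conditional expression itself.
if [[ ! "$var" =~ $regexp ]]; then inner=nomatch; else inner=match; fi

echo "$outer $inner"
```

Note that $regexp is left unquoted on the right-hand side of =~ so that it is treated as a regexp and not as a literal string.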

However, note that while =~ looks like it's inspired by perl's =~ (itself likely inspired by awk's ~), perl's (or awk's) !~ operator is not supported.

As to that particular regexp, as explained in more detail in that other Q&A, things like [0-9] or [[:digit:]] should be avoided in general for input validation if you only intend to accept the 0123456789 Arabic decimal digits.

So:

[[ ! "$hn" =~ ^[0123456789]+$ ]]

Or (using a glob pattern instead of a regexp):

[[ "$hn" != +([0123456789]) ]]

(that one needing shopt -s extglob in older versions of bash, for that +(...) ksh extended glob to be recognised)

would be preferable.
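Put together, a sketch of how either test could set the error code from the question might look like this. check_hn is a hypothetical helper name, and running both tests in the same function is only there to show that they give the same verdict:

```bash
#!/usr/bin/env bash
shopt -s extglob   # only needed by older bash versions for +(...)

# check_hn (hypothetical helper): prints 0 for a string made only of
# the digits 0123456789, and 13 otherwise.
check_hn() {
  local hn=$1 errcode=0
  [[ ! "$hn" =~ ^[0123456789]+$ ]] && errcode=13   # regexp form
  [[ "$hn" != +([0123456789]) ]] && errcode=13     # glob form, same verdict
  echo "$errcode"
}

check_hn 42    # prints 0
check_hn 4x2   # prints 13
check_hn ''    # prints 13
```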

In standard sh syntax:

case "$hn" in
  ('' | *[!0123456789]*) true;;
  (*) false;;
esac
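That case construct only produces an exit status, so one possible way to use it (a sketch; the function name is_not_decimal and the sample value of hn are made up here) is to wrap it in a function and test that:

```sh
# is_not_decimal (hypothetical name): succeeds when its argument is empty
# or contains any character other than 0123456789.
is_not_decimal() {
  case "$1" in
    ('' | *[!0123456789]*) true;;
    (*) false;;
  esac
}

hn='12a'      # illustrative input
errcode=0
if is_not_decimal "$hn"; then
  errcode=13
fi
echo "$errcode"
```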

You may also want to reject numbers with leading 0s (other than 0 itself), which are interpreted as octal in some contexts (and cause errors in some of them for numbers like 019), or numbers larger than what is supported by whatever is going to use them.
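A stricter check along those lines could be sketched as follows. is_plain_decimal is a hypothetical name, and the 9-digit cap is an arbitrary assumption chosen because any 9-digit number fits in a signed 32-bit integer. Adjust it to whatever will consume the value:

```sh
# is_plain_decimal (hypothetical helper): succeeds only for 0, or for a
# string of 0123456789 digits with no leading zero and at most 9 digits.
is_plain_decimal() {
  case "$1" in
    ('' | *[!0123456789]*) return 1;;   # empty, or contains a non-digit
    (0) return 0;;                      # zero itself is fine
    (0*) return 1;;                     # leading zero: may be read as octal
  esac
  # Assumed limit for illustration: 9 digits at most.
  [ "${#1}" -le 9 ]
}

for candidate in 0 19 019 1234567890; do
  if is_plain_decimal "$candidate"; then
    echo "$candidate: accepted"
  else
    echo "$candidate: rejected"
  fi
done
```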