Test if a variable has more than 4 digits in bash script

Question

I want to test if a variable has more than 4 digits, something like this:

#!/bin/bash
if [ $input has more than 4 digits ]; then 
     echo "  * Please only 4 digits" >&2
     echo""
else
   the other option
fi

input='foo 1 bar 2 baz 3' has 3 ASCII decimal digits and 8 hexadecimal digits. Should it be accepted? Do you want to consider only ASCII decimal digits (0123456789), or other kinds of decimal or non-decimal digits, like ¹, ², or decimal digits in other numeral systems (꧳, ꤁...)? — Stéphane Chazelas, Apr 10 '21 at 05:52
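One way to pin down the "digits anywhere in the string" reading raised in this comment is simply to count them. A minimal sketch, assuming only the ASCII digits 0123456789 count; the helper name count_digits is made up for illustration:

```shell
#!/bin/sh
# count_digits (hypothetical helper): print how many ASCII decimal digits
# appear anywhere in the argument.
count_digits() {
    # tr -cd deletes every character that is NOT in the given set
    digits=$(printf '%s' "$1" | tr -cd '0123456789')
    echo "${#digits}"
}

count_digits 'foo 1 bar 2 baz 3'   # prints 3
```

Whether such an input should then be accepted or rejected is still a policy decision for the asker.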

ilkkachu · Accepted Answer · 2021-08-11T12:01:19.547

If you care about the number of digits (and not the numerical value), you could match against a regex in Bash/Ksh/Zsh (see the note on [[:digit:]] below):

#!/bin/bash
input=$1
re='^[[:digit:]]{1,4}$'
if [[ $input =~ $re ]]; then
    echo "'$input' contains 1 to 4 digits (and nothing else)"
else
    echo "'$input' contains something else"
fi

Or e.g. [[ $input =~ ^[[:digit:]]{5,}$ ]] to check for "5 or more digits (and nothing else)", etc.

Or in a pure POSIX shell, where you have to use case for the pattern match:

#!/bin/sh
input=$1
case $input in 
    *[![:digit:]]*) onlydigits=0;; # contains non-digits
    *[[:digit:]]*)  onlydigits=1;; # at least one digit
    *)              onlydigits=0;; # empty
esac
if [ $onlydigits = 0 ]; then
    echo "'$input' is empty or contains something other than digits"
elif [ "${#input}" -le 4 ]; then
    echo "'$input' contains 1 to 4 digits (and nothing else)"
else
    echo "'$input' contains 5 or more digits (but nothing else)"
fi

(You could put all the logic inside the case, but nesting an if there is somewhat ugly, IMO.)
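For comparison, here is a rough sketch of what a case-only version might look like, listing the 1 to 4 digit shapes explicitly instead of nesting an if. The function name classify and the variable d are made up for illustration, and it assumes only ASCII digits should count:

```shell
#!/bin/sh
# classify (hypothetical helper): decide everything inside one case statement.
d='[0123456789]'   # ASCII digits only (an assumption)
classify() {
    case $1 in
        '')                      echo "empty" ;;
        *[!0123456789]*)         echo "contains something other than digits" ;;
        $d|$d$d|$d$d$d|$d$d$d$d) echo "1 to 4 digits (and nothing else)" ;;
        *)                       echo "5 or more digits (but nothing else)" ;;
    esac
}

classify 123
classify 12345
classify 12a
```

The explicit alternation gets unwieldy for larger limits, which is where the length test on "${#input}" is the more practical choice.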

Note that [[:digit:]] should match whatever the current locale's idea of "digits" is. That might or might not be more than the ASCII digits 0123456789. On my system, [[:digit:]] does not match e.g. ⁴ (superscript four, U+2074), but [0-9] does. Matching other "digits" might be a problem, especially if you do arithmetic on the number in the shell. So, if you want to be stricter, use [0123456789] to accept just the ASCII digits.

@LinuxSecurityFreak, what's wrong with that? It should give the locale's idea of what counts as a "digit", right? They didn't exactly say what digits they require, and that's somewhat orthogonal to the issue of looking for 4 or more of some types of letters: it should be easy to change that to [0-9], or [0123456789] or [[:alpha:]] or whatever. — ilkkachu, Aug 11 '21 at 11:54

Stéphane Chazelas · Answer 2 · 2021-04-10T07:26:46.287

This assumes you mean ASCII decimal digits only, not other sorts of decimal or non-decimal digits.

shopt -s extglob # enables a subset of ksh extended globs including *(...),
                 # +(...) and ?(...) but unfortunately not {4}(...)
d='[0123456789]' nd='[^0123456789]'
case $input in
  ( $d$d$d$d+($d)     ) echo made of more than 4 digits;;
  ( *$d*$d*$d*$d*$d*  ) echo contains more than 4 digits;;
  ( ""                ) echo empty;;
  ( *($nd)            ) echo does not contain any digit;;
  ( *$nd*             ) echo no more than 4 digits but also contains non-digits;;
  ( $d?($d)?($d)?($d) ) echo made of 1 to 4 digits;;
  ( *                 ) echo should not be reached;;
esac

Beware that in bash, depending on the system and locale, [0-9] and [[:digit:]] may match a lot more than just 0123456789, so those should not be used for input validation (an answer to a different question covers that in more detail).

Also beware that bash pattern matching works in very surprising ways in multi-byte locales.

You'll find that for instance in a zh_CN.gb18030 Chinese locale, on input='1-©©' it will return "no more than 4 digits but also contains non-digits" as expected, but if you append a single 0x80 byte (input='1-©©'$'\x80'), it will return "contains more than 4 digits".

It's for this kind of reason (and the fact that pattern matching has been known to have bugs in corner cases in many shells) that for input validation, it's better to use positive matching for the things you accept where possible, rather than negative matching for the things you reject¹. Hence the $d?($d)?($d)?($d) above, even though it shouldn't be necessary: in theory at least, anything else should have been matched by earlier patterns.

¹ As an exception to that, one may need to consider the Bourne and Korn shells' misfeature whereby case $input in [x]) echo yes; esac matches on x but also on [x]!
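The positive-matching advice can be sketched as a small validator that accepts only what it explicitly recognises and rejects everything else by default. This is a minimal sketch, not the answer's own code: the function name is_1_to_4_digits is made up, and it uses plain POSIX patterns rather than extglob:

```shell
#!/bin/sh
d='[0123456789]'   # ASCII decimal digits only
# is_1_to_4_digits (hypothetical helper): succeed only for 1 to 4 ASCII digits.
is_1_to_4_digits() {
    case $1 in
        ($d|$d$d|$d$d$d|$d$d$d$d) return 0 ;;  # the only accepted shapes
        (*)                        return 1 ;;  # anything else, including empty
    esac
}

for candidate in 7 1234 12345 12a ''; do
    if is_1_to_4_digits "$candidate"; then
        echo "accept: '$candidate'"
    else
        echo "reject: '$candidate'"
    fi
done
```

Because the default branch rejects, an unexpected pattern-matching quirk is more likely to turn a valid input away than to let an invalid one through.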

glenn jackman · Answer 3 · 2021-04-05T18:52:46.670

I'd do

#!/usr/bin/env bash
die () { echo "$*" >&2; exit 1; }
input=$1
[[ $input == +([[:digit:]]) ]] || die "only digits please"
(( input <= 9999 ))            || die "no more than 4 digits please"
echo "ok: $input"

edited Apr 05 '21 at 18:52

answered Apr 05 '21 at 17:40

but then, the input could be 00001, or even 012345... – ilkkachu Apr 05 '21 at 17:48
I think the OP can decide if that's a problem. – glenn jackman Apr 05 '21 at 17:51

A bug: this script is vulnerable to invalid octal numbers: 0789 will emit errors – glenn jackman Apr 09 '21 at 15:52
To fix: would need to use 10#$input in the arithmetic expression. Simpler to check string length indeed. – glenn jackman Apr 09 '21 at 16:01
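The octal problem raised in these comments can be avoided by checking the string length and stripping leading zeros before doing any arithmetic. A minimal sketch in portable sh (in bash, 10#$input as mentioned above is the alternative); the function name check_input and the variable num are made up for illustration:

```shell
#!/bin/sh
# check_input (hypothetical helper): accept 1 to 4 ASCII digits, then make
# the value safe for shell arithmetic.
check_input() {
    case $1 in
        ''|*[!0123456789]*) echo "only digits please" >&2; return 1 ;;
    esac
    if [ "${#1}" -gt 4 ]; then
        echo "no more than 4 digits please" >&2
        return 1
    fi
    # Strip leading zeros so that e.g. 0789 is not read as (invalid) octal
    num=${1#"${1%%[!0]*}"}
    num=${num:-0}          # an all-zero input such as 0000 becomes 0
    echo "ok: $1 (value $((num + 0)))"
}

check_input 0789   # prints: ok: 0789 (value 789)
```

Note that with a length check, 00001 is rejected as having five digits, which may or may not be what is wanted.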

score 1 · Answer 4 · answered Apr 05 '21 at 15:51

If you want to examine the number of characters in a variable, you can do this...

var="foo bar"
echo "var contains ${#var} characters"
Result:
var contains 7 characters

Pourko

To be precise, note that in bash (and most other shells), ${#var} includes the number of characters but also the number of bytes not forming valid characters. For instance, in an en_US.UTF-8 locale, input=$'\xc3\xa9\x80' bash -c 'echo "${#input}"' outputs 2 (1 é character made of two bytes, plus one 0x80 byte). – Stéphane Chazelas Apr 10 '21 at 09:49
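If the byte count is what matters, it can be measured separately from ${#var}. A small sketch, assuming a POSIX wc is available; the helper name byte_length is made up for illustration:

```shell
#!/bin/sh
# byte_length (hypothetical helper): number of bytes in the argument,
# regardless of how the locale groups them into characters.
byte_length() {
    # $(( ... )) normalises the padding some wc implementations add
    echo "$(( $(printf '%s' "$1" | wc -c) ))"
}

var="foo bar"
echo "var contains ${#var} characters and $(byte_length "$var") bytes"
```

For pure ASCII input the two numbers agree; they can differ once multi-byte characters or invalid byte sequences are involved.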

score -1 · Answer 5 · edited Aug 11 '21 at 11:27

This is another way:

#!/bin/bash
if test -z "$1"
then
    echo no digit supplied
elif grep -qE '[[:digit:]]{5}' <<< "$1"
then
    echo too many digits supplied
else
    echo number of digits ok
fi

edited Aug 11 '21 at 11:27 by ilkkachu
answered Aug 11 '21 at 10:42 by user486407

this would say "number of digits ok" for an input like abc too... But then yeah, we don't know what the asker wanted to do with inputs like that. – ilkkachu Aug 11 '21 at 11:29
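If inputs like abc should be rejected as well, the grep approach can be anchored to the whole line. A hedged sketch, assuming 1 to 4 ASCII digits and nothing else is the goal; the function name has_1_to_4_digits is made up for illustration:

```shell
#!/bin/sh
# has_1_to_4_digits (hypothetical helper): succeed only if the argument is a
# single line consisting of 1 to 4 ASCII digits.
has_1_to_4_digits() {
    # grep works line by line, so reject multi-line input up front
    lines=$(( $(printf '%s\n' "$1" | wc -l) ))
    [ "$lines" -eq 1 ] || return 1
    # -x: the whole line must match; -E: extended regex, needed for {1,4}
    printf '%s\n' "$1" | grep -qxE '[0123456789]{1,4}'
}

if has_1_to_4_digits "$1"; then
    echo "number of digits ok"
else
    echo "please supply 1 to 4 digits" >&2
fi
```

This also covers the empty case, since an empty line does not match {1,4}.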
