The regex below works when I test it on regex testing sites, but the script below does not accept any input: everything I enter is rejected as invalid.

    #!/bin/bash


    domainRegex="(?=^.{4,253}$)(^(?:[a-zA-Z0-9](?:(?:[a-zA-Z0-9\-]){0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$)"


    while [ "$domain" = "" ]
    do
            echo "Please provide domain:"
            read domain
    done

    until [[ $domain =~ $domainRegex ]]
    do
            echo "Enter valid domain:"
            read domain
    done
Jens

1 Answer

You're using features from Perl-compatible regular expressions (PCRE). Namely, (?=...) and (?:...) aren't part of the POSIX extended regular expressions (ERE) that Bash's =~ operator uses.

But it seems to me you're only using the former, (?=^.{4,253}$), to check the length of the string. If that's correct, it's easy to replace it with a direct test against the string length:

    if [ "${#domain}" -lt 4 ] || [ "${#domain}" -gt 253 ]; then
        echo "Domain name is too short or too long"
    fi

Then, (?:...) is easy: it's equivalent to (...), except that it doesn't capture. The extra captures don't affect what the regex as a whole matches, so we can just drop the ?: from each opening parenthesis.

Also note that (at least in ERE) the backslash in [a-zA-Z0-9\-] matches a literal backslash, because a backslash is not an escape character inside a bracket expression. A dash can be matched by just putting it as the first or last character in the bracket expression (in both PCRE and ERE): [a-zA-Z0-9-].
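To see the backslash point in practice, here is a small sketch that tries both bracket expressions against two inputs. The `matches` helper and the variable names are made up for illustration; they are not from the question.

```shell
#!/bin/bash
# Illustrative sketch: "matches" is a hypothetical helper, not part of the question.

# The same character class written two ways.
with_backslash='^[a-zA-Z0-9\-]+$'   # in ERE this class also contains a literal backslash
dash_last='^[a-zA-Z0-9-]+$'         # dash placed last, no backslash in the class

# Succeeds if string $1 matches ERE $2 (the pattern must stay unquoted after =~).
matches() { [[ $1 =~ $2 ]]; }

for input in 'foo-bar' 'foo\bar'; do
    for re in "$with_backslash" "$dash_last"; do
        if matches "$input" "$re"; then
            printf '%s matches %s\n' "$input" "$re"
        else
            printf '%s does not match %s\n' "$input" "$re"
        fi
    done
done
```

On a typical glibc system, this should report that foo-bar matches both patterns, while foo\bar matches only the one containing the backslash.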

With those modifications, we get:

    ^([a-zA-Z0-9](([a-zA-Z0-9-]){0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$
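Putting the pieces together, a minimal sketch of the validation could look like the following. The function name `is_valid_domain` is an assumption made for illustration, and the sketch has only been reasoned through, not tested against a wide range of domain names.

```shell
#!/bin/bash
# Sketch only: is_valid_domain is a hypothetical helper name, not from the original post.

# ERE version of the pattern: no lookahead, no non-capturing groups,
# and the dash placed last in the bracket expression.
domainRegex='^([a-zA-Z0-9](([a-zA-Z0-9-]){0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$'

is_valid_domain() {
    local domain=$1

    # Replaces the (?=^.{4,253}$) lookahead with a plain length test.
    if [ "${#domain}" -lt 4 ] || [ "${#domain}" -gt 253 ]; then
        return 1
    fi

    # The pattern variable must stay unquoted on the right-hand side of =~,
    # otherwise Bash treats it as a literal string rather than a regex.
    [[ $domain =~ $domainRegex ]]
}
```

With that in place, the second loop in the question could use `until is_valid_domain "$domain"` instead of testing the regex directly.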

See also: Why does my regular expression work in X but not in Y?

ilkkachu
  • I didn't test that resulting regex too closely, though – ilkkachu Oct 24 '19 at 16:20
  • Generally, you can't use ranges in EREs for input validation unless you switch to the C locale. [a-z] matches thousands of characters depending on the locale or system and may even match sequences of several characters. For instance, in Hungarian locales on GNU systems, [[ dDZS =~ ^[a-z]$ ]] returns true, as dDZS is a collating element that sorts in between a and z. – Stéphane Chazelas Feb 24 '23 at 17:17
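One way to follow the advice in the last comment is to do the match with the locale forced to C. The sketch below assumes a hypothetical helper named `match_in_c_locale`; it runs in a subshell so that the locale change does not leak into the rest of the script.

```shell
#!/bin/bash
# Sketch only: match_in_c_locale is a hypothetical helper name.

# Succeeds if string $1 matches ERE $2 when ranges such as [a-z]
# are interpreted in the C locale. The ( ... ) body runs in a subshell,
# so the LC_ALL assignment is discarded when the function returns.
match_in_c_locale() (
    LC_ALL=C
    [[ $1 =~ $2 ]]
)

lowercase='^[a-z]+$'

if match_in_c_locale 'example' "$lowercase"; then
    echo "example: only ASCII lowercase letters"
fi
```

In the C locale, [a-z] covers only the 26 ASCII lowercase letters, so accented letters and multi-character collating elements should no longer slip through.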