I can never work out whether I'm supposed to be using globbing or regex with bash. My book on bash shell scripting is confusing precisely because it doesn't clear this topic up, and I never end up getting my understanding right. Let me give an example. It states the following: "The . (dot) character means 'any single character.' Thus, a.c matches all of abc, aac, aqc, and so on."

OK, great. I'm thinking he's wrong because this is regex, but the first thing I do is test it anyway:

$ touch abc aac aqc
$ ls
aac  abc  aqc
$ ls a.c
ls: cannot access 'a.c': No such file or directory

I then go and google globbing, and come across this post called "globbing tutorial", and I'm thinking, right this is the one.

https://linuxhint.com/bash_globbing_tutorial/

I'm almost immediately thinking it's all wrong because half his "globbing" is done via grep, which uses BRE which isn't globbing. For example he states:

"$ is used to define the ending character"

This is wrong, because that's the regex meaning, and it's not globbing. So I test it:

$ ls
aac  abc  aqc
$ ls c$
ls: cannot access 'c$': No such file or directory

So the number 1 hit on Google is wrong as well. It's as if no book or online post clarifies this topic, so I need some help pinning down the difference between regex and globbing with some certainty.

1 Answer

The only place where bash uses regexps is with the =~ operator of its [[ ... ]] construct, and it's POSIX extended regexps in that case:

if [[ axb =~ ^a.b$ ]]; then
  echo 'axb matches the ^a.b$ ERE'
fi
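
As a minimal sketch of that operator (assuming bash 3.2 or newer, where quoting any part of the right-hand side of =~ makes that part match literally), it is often easiest to keep the ERE in a variable and expand it unquoted. The helper name matches_ere is made up for this illustration:

```shell
#!/usr/bin/env bash
# matches_ere STRING ERE: hypothetical helper, succeeds if STRING matches ERE.
matches_ere() {
  local string=$1 ere=$2
  # $ere is left unquoted on purpose so that it is treated as a regexp.
  [[ $string =~ $ere ]]
}

matches_ere axb '^a.b$' && echo 'axb matches ^a.b$'
matches_ere a.b '^a.b$' && echo 'a.b matches ^a.b$'
matches_ere ab  '^a.b$' || echo 'ab does not match ^a.b$'

# With the dot quoted, it is a literal dot rather than "any character".
[[ axb =~ ^a"."b$ ]] || echo 'axb does not match once the dot is quoted'
```

The same quoting rule is why the pattern in the helper is expanded as $ere and not "$ere".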

Everywhere else:

  • case axb in (a?b) echo 'axb matches the a?b glob pattern'; esac
  • [[ axb = a?b ]] && echo 'axb matches the a?b glob pattern'
  • printf '%s\n' a?b: actual globbing aka filename generation aka pathname expansion
  • printf '%s\n' "${var#a?b}" "${var%a?b}" "${var##a?b}" "${var%%a?b}" "${var/a?b/x}"
  • compgen -G 'a?b' (same with complete).
  • help 'r??d'

That's shell wildcards aka glob patterns aka filename / fnmatch patterns.
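
To tie this back to the files in the question, here is a rough sketch showing that the glob spelling of "a, any single character, c" is a?c rather than a.c. It works in a scratch directory created with mktemp; that directory and the glob_match helper are assumptions made for the example, not something bash provides:

```shell
#!/usr/bin/env bash
# glob_match STRING PATTERN: hypothetical helper, succeeds if STRING
# matches the shell glob PATTERN (case does glob matching, not regexps).
glob_match() {
  case $1 in
    ($2) return 0 ;;
    (*)  return 1 ;;
  esac
}

tmpdir=$(mktemp -d) || exit
cd "$tmpdir" || exit
touch abc aac aqc

# Pathname expansion: ? stands for any one character.
printf '%s\n' a?c

# a.c contains no wildcard, so it can only name a file called exactly a.c.
[ -e a.c ] || echo 'no file is literally named a.c'

glob_match aqc 'a?c' && echo 'aqc matches the a?c glob'
glob_match aqc 'a.c' || echo 'aqc does not match the a.c glob'

# The same pattern language is used in parameter expansion.
var=aqc-rest
echo "${var#a?c}"    # strips the shortest leading match of a?c

cd / && rm -rf "$tmpdir"
```

In a glob the dot has no special meaning, which is why ls a.c in the question looked for a file named exactly a.c.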

Run info bash pattern to learn about those in bash specifically. info -n conditional bash will get you to the Conditional construct section inside which you'll find the description of [[ ... ]] and its =~ operator.

Other tools like grep, find, vim, perl, firefox can use either or both in varying contexts. Their documentation will tell you. Also beware, there are many flavours of both types of patterns. As a rule of thumb, glob patterns are typically used for matching filenames (like in shell globs or find's -name/-path) and regexp for arbitrary text matching.
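
As a rough illustration of that rule of thumb, assuming a POSIX find and grep are available: find's -name takes a glob that has to match the whole file name, while grep takes a regexp (a BRE by default) that can match anywhere in a line unless it is anchored. The scratch directory and variable names below are made up for the example:

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d) || exit
touch "$tmpdir/abc" "$tmpdir/aac" "$tmpdir/aqc" "$tmpdir/xyz"

# Glob: quoted so that find, not the shell, interprets it.
glob_names=$(find "$tmpdir" -type f -name 'a?c' -exec basename {} \; | sort)

# Regexp: the same names selected as lines of text; ^ and $ anchor the match.
regex_names=$(ls "$tmpdir" | grep '^a.c$')

# To find -name, $ is an ordinary character, so this finds nothing here.
dollar_glob=$(find "$tmpdir" -type f -name 'c$')

# To grep, $ anchors at the end of the line: names ending in c.
dollar_regex=$(ls "$tmpdir" | grep 'c$')

printf 'find -name a?c:\n%s\n' "$glob_names"
printf 'grep ^a.c$:\n%s\n' "$regex_names"

rm -rf "$tmpdir"
```

Both selections list the same three names here, but only because each pattern was written in the syntax its tool expects.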

ksh93 is a shell that can use regexps (basic, extended, perl-like or augmented) in its globs:

$ printf '%s\n' ~(E:^a.b$)
a=b
axb

In zsh, you can use regexps (extended or pcre) in its globs via the e glob qualifier:

$ printf '%s\n' *(e['[[ $REPLY =~ "^a.b$" ]]'])
a=b
axb
$ zmodload zsh/pcre
$ printf '%s\n' *(e['[[ $REPLY -pcre-match "^a.b\z" ]]'])
a=b
axb

(where \z is the PCRE equivalent of ERE $, since in PCRE $ matches at the end of the subject but also before a newline at the end of the subject).

In zsh, if you set the rematchpcre option (set -o rematchpcre), [[ =~ ]] uses PCRE instead of ERE.