22

The following command will tar all "dot" files and folders:

tar -zcvf dotfiles.tar.gz .??*

I am familiar with regular expressions, but I don't understand how to interpret .??*. I executed ls .??* and tree .??* and looked at the files which were listed. Why, for example, does this regular expression include all files within folders whose names start with a dot?

cjm
SabreWolfy
  • 1
    see also http://unix.stackexchange.com/questions/1168/how-to-glob-every-hidden-file-except-current-and-parent-directory – Lesmana Jul 23 '12 at 20:37

3 Answers

32

Globs are not regular expressions. In general, the shell will try to interpret any unquoted word on the command line that contains *, ? or [ as a glob. Shells are not required to support regular expressions at all, although in practice many of the more modern ones do; bash, for example, has the =~ regex match operator inside the [[ construct.
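To make the difference concrete, here is a small sketch that reads the same pattern text once as a glob and once as an extended regular expression. The helper names glob_match and regex_match are made up for this illustration, and the sketch assumes a POSIX-style sh with grep -E available.

```shell
# Hypothetical helpers: apply the same pattern text in two different ways.
glob_match() {   # $1 = string, $2 = pattern read as a shell glob
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

regex_match() {  # $1 = string, $2 = pattern read as an extended regex
    printf '%s\n' "$1" | grep -Eq "^($2)\$"
}

# As a glob, .* means "a literal dot, then anything".
glob_match .bashrc '.*' && echo "glob  .* matches .bashrc"
glob_match visible '.*' && echo "glob  .* matches visible"   # prints nothing

# As a regex, .* means "any character, repeated", so it matches everything.
regex_match visible '.*' && echo "regex .* matches visible"
```

The case statement uses the same pattern syntax as file name globbing, which is why it is a convenient way to test a glob against a string without touching the file system.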

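The .??* is a glob. It matches any file name that begins with a literal dot ., followed by any two (not necessarily the same) characters, ??, followed by the regular expression equivalent of [^/]*, i.e. 0 or more characters that are not /. The glob itself only matches names in the current directory; the files inside dot folders show up because ls, tree and tar each process the contents of any directory named on their command line.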
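As a rough illustration of the description above, the following sketch builds a scratch directory and prints what the shell substitutes for .??*. Every file name in it is invented for the demo, and it assumes a POSIX-style sh with mktemp and tar available.

```shell
# Scratch directory with made-up names, so nothing real is touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .a .ab .abc .bashrc visible
mkdir .config
touch .config/settings

# The shell expands the glob before the command runs; echo just prints the result.
# .a is too short (only one character after the dot); "visible" has no leading dot.
matched=$(echo .??*)
echo "$matched"

# tar only receives the name .config, then descends into it by itself,
# so .config/settings appears in the archive listing.
listed=$(tar -cf - .??* | tar -tf -)
echo "$listed"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

Note that .config/settings is never matched by the glob; it ends up in the archive only because tar recurses into the directory it was handed.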

For the full details of shell pathname expansion (the full name for "globbing"), see the POSIX spec.

dhag
jw013
  • 10
    Additional point: this is an attempt to write a glob which matches all of the dotfiles in a directory except the special entries . and .., which one normally does not want to do anything with. It's not quite right; it doesn't pick up anything named '.X' where X is some character other than dot. I don't think it's possible to write a single glob that matches every dotfile except . and .., but you can do it with two: tar zcvf dotfiles.tar.gz .[!.] .??* for instance. – zwol Jul 22 '12 at 21:45
  • @Zack: Thanks for the clarification. I posted a comment about that, but then deleted it. ls .? returned the same as ls .., which meant there were no other entries in the folder matching the pattern .?. I would have done .[^.] for all .? files other than ... – SabreWolfy Jul 23 '12 at 17:27
  • 2
    @SabreWolfy If you read the POSIX spec carefully, that's actually an important difference between globs and regex: in bracket expressions, [^abc] in regex syntax means the same as [!abc] in glob syntax (i.e. ^ is replaced with ! for globs). Using [^abc] style syntax in a glob is not very portable because POSIX does not specify what it means, so some shells interpret it using regex-like semantics while others treat ^ as just a literal character. – jw013 Jul 23 '12 at 17:31
  • 1
    @jw013: Thanks for the details. I must remember that glob != regexp :) – SabreWolfy Jul 23 '12 at 17:34
  • 1
    It just occurred to me that .[!.] may be left as a literal on the tar command line in the common case where there are no files that match that pattern. Some shells let you control that behavior, e.g. with bash, shopt -s nullglob will make it vanish from the command line if it doesn't match anything, but that's not a universal feature. – zwol Jul 23 '12 at 19:03
  • @Zack With bash you also get the dotglob and GLOBIGNORE features. With GLOBIGNORE=., a simple .* should do. – jw013 Jul 23 '12 at 19:27
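The points raised in these comments can be tried out with a short sketch. It assumes a POSIX-style sh whose default is to leave an unmatched glob in place (as bash and dash do), and the file names .x and .profile are invented for the demo. The guard loop at the end is one possible portable workaround, not something proposed in the thread.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch .x .profile

# .??* alone misses the two-character name .x
only_long=$(echo .??*)
echo "$only_long"

# Adding .[!.] picks it up; . and .. are still excluded.
both=$(echo .[!.] .??*)
echo "$both"

# With no two-character dotfile present, .[!.] is passed through unchanged.
rm .x
leftover=$(echo .[!.] .??*)
echo "$leftover"

# Portable guard: keep only the words that name something that exists.
set --
for f in .[!.] .??*; do
    if [ -e "$f" ]; then
        set -- "$@" "$f"
    fi
done
kept="$*"
echo "$kept"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

After the loop, "$@" holds only existing names and could be handed to tar in place of the raw globs.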
9

The .??* wildcard (not a regular expression, though it looks like one) matches filenames that start with a period (.), followed by any two characters (??), and then any number (zero or more) of other characters (*).

Maybe this page on Wildcards in Filenames will be helpful.

Levon
-1

To add to the other answers, a single ? matches filenames that are exactly one character long, ?? matches filenames that are exactly two characters long, and so on.

[root@mercy testdir_2]# ls
ion  it  r
[root@mercy testdir_2]# ls ?
r
[root@mercy testdir_2]# ls ??
it
[root@mercy testdir_2]# ls ???
ion
[root@mercy testdir_2]#
Sreeraj