
I was trying to list some hidden files in my home directory and ran into some very odd behavior from grep when combining it with ls.

  1. I executed ls -a on my home directory and got all the files including hidden files as expected.
  2. I wanted to list all the hidden files whose names start with '.xau', so I executed ls -a |grep -i .xau* and it also worked as expected.
  3. Then I executed ls -a |grep -i .x* in the same directory but it didn't list anything at all.
  4. I then mistakenly typed ls -a |grep -i .*x (note that this time the wildcard character * and the character 'x' have switched places), and interestingly it behaved the way I intended in step 3. I tried the same thing with ls -a .*x and ls -a .*X, but both gave a "No such file or directory" error.


I have added the actual text output here.

Some of you may ask why I don't just use ls -a .x*. The reason is that grep prints its output with the appropriate colors. Could anyone please explain this behavior to me?

Rui F Ribeiro

2 Answers


You are suffering from premature glob expansion.

.xau* doesn't expand because it doesn't match anything in the current directory (globs are case sensitive), so the shell passes it to grep unchanged. However, .x* does match some files, so it gets expanded by the shell before grep ever sees it.

When grep receives multiple arguments, it assumes the first is the pattern and the remainder are files to search for that pattern.

So, in the command ls -a | grep -i .x*, the output of ls is ignored, and the file ".xsession-errors.old" is searched for the pattern ".xsession-errors". Not surprisingly, nothing is found.
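
One way to see this for yourself is the rough sketch below, run in a scratch directory. The file names are made up to mirror the question, and show_args is a hypothetical helper that just prints whatever arguments the shell hands over. It assumes a POSIX-style shell with default globbing options, where an unmatched glob is left as-is (zsh, or bash with nullglob or failglob set, behaves differently).

```shell
#!/bin/sh
# Scratch directory with hypothetical file names mirroring the question.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch .Xauthority .xsession-errors .xsession-errors.old

# Hypothetical helper: print each argument it receives, one per line.
show_args() { printf '<%s>\n' "$@"; }

# Nothing matches .xau* (globs are case sensitive), so it stays literal.
unmatched=$(show_args -i .xau*)

# .x* matches two files, so the shell replaces it with both names.
expanded=$(show_args -i .x*)

printf '%s\n' "$unmatched" "$expanded"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

The second call receives three arguments: -i, .xsession-errors and .xsession-errors.old. Passed to grep, that means "search the file .xsession-errors.old for the pattern .xsession-errors", and standard input is never read.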

To prevent this, put your special characters within single or double quotes. For example:

ls -a | grep -i '.x*'

You are also suffering from regex vs. glob confusion.

You seem to be looking for files that start with the literal string ".x" and are followed by anything, but regular expressions don't work the same way as file globs. The * in a regex means "the preceding character zero or more times," not "any sequence of characters" as it does in file globs. So what you probably want is:

ls -a | grep -i '^\.x'

This searches for files whose names start with the literal characters ".x" or ".X". Since you are specifying only one letter, you could just as easily use a character class rather than -i:

ls -a | grep '^\.[xX]'
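
To check that the two forms agree without touching your real home directory, here is a small sketch that feeds grep a hypothetical list of names on standard input, the same way ls -a would:

```shell
#!/bin/sh
# Hypothetical file names, one per line, standing in for `ls -a` output.
names='.
..
.Xauthority
.bashrc
.xsession-errors
.xsession-errors.old
notes.xml'

# Quoted and anchored: a literal dot at the start of the line, then x or X.
with_flag=$(printf '%s\n' "$names" | grep -i '^\.x')
with_class=$(printf '%s\n' "$names" | grep '^\.[xX]')

printf '%s\n' "$with_flag"
```

Both variables end up holding the same three names (.Xauthority, .xsession-errors and .xsession-errors.old). notes.xml is left out because the ".x" in it is not at the start of the line.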

The point is that regular expressions are very different from file globs.

If you just try ls -a | grep -i '.x*', as has been suggested, you will be very surprised to see that EVERY file will be shown! (The same output as ls -a directly, except placed on separate lines as in ls -a -1.)

How come?

Well, in regex (but not in shell globs), a period (.) means "any single character," and an asterisk (*) means "zero or more of the preceding character." So the regex .x* means "any character, followed by zero or more instances of the character 'x'."

Of course, you are not allowed to have null file names, so every file name contains "a character followed by at least zero 'x's." :)
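
A quick way to convince yourself is to count matches, sketched here on a hypothetical list of names rather than real ls output:

```shell
#!/bin/sh
# Hypothetical file names standing in for `ls -a` output.
names='.
..
.Xauthority
.bashrc
notes.txt'

# Number of lines fed to grep (tr strips the padding some wc versions add).
total=$(printf '%s\n' "$names" | wc -l | tr -d ' ')

# The quoted but unanchored regex matches every non-empty line.
loose=$(printf '%s\n' "$names" | grep -c '.x*')

# The anchored regex only matches names that really start with .x or .X.
anchored=$(printf '%s\n' "$names" | grep -c -i '^\.x')

printf 'lines: %s, loose: %s, anchored: %s\n' "$total" "$loose" "$anchored"
```

Here the loose count equals the total number of lines, while the anchored pattern matches only .Xauthority.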


Summary:

To get the results you want, you need to understand two things:

  1. Unquoted special glob characters (including *, ?, [] and some others) will get expanded by the shell before the command you are running ever sees them, and
  2. Regular expressions are different from (and more powerful than) file globs.
Wildcard

The problem is that you did not enclose the * in quotes, so the shell expands it before running the command. See:

$ touch .xau1 .xau2
# this won't work
$ ls -a | grep -i .xau*
# but this does
$ ls -a | grep -i ".xau*"
.xau1
.xau2
# this one too
$ ls -a | grep -i '.xau*'
.xau1
.xau2
# check what shell sees
$ ls -a | grep -i .xau* # don't press Enter here; press C-x * instead and the line becomes:
$ ls -a | grep -i .xau1 .xau2