I have the following string:

-----BEGIN 1_1 CERTIFICATE-----

The following pattern does not match the hyphens:

grep -- "[A-Z\-\_]" file

When I remove \_ the hyphens are matched:

grep -- "[A-Z\-]" file

If I remove the backslash, I get grep: Invalid range end. It also doesn't matter whether I use -E/egrep or not; I get the same result.

What is the reason for that behavior?

manifestor
  • Can you explain why you are using that pattern? Do you understand what it is searching for? Are you only trying to match hyphens? If so, you can simply use: grep '-'. – jesse_b May 10 '18 at 17:37
  • I got it from a PHP framework. It checks the characters in the variables of an HTTP request and, if they match, lets the request pass. I modified it a little here, of course. – manifestor May 10 '18 at 17:40
  • It will actually match any uppercase character and any underscore character. The brackets tell it to match any one of the characters inside. – jesse_b May 10 '18 at 17:43
  • Yes, I understand, but why is "[A-Z\-\_]" not matching hyphens? – manifestor May 10 '18 at 17:47
  • Not sure but if you put it at the end it will: grep '[A-Z_-]' – jesse_b May 10 '18 at 17:50
  • Yes, I realized that already, take a look at my question once again :) That's what I want to know: why does it match the hyphen only if you put \- at the end, and why is [A-Z\-\_] not working? – manifestor May 10 '18 at 17:54
  • I understand your question, that's why I commented my workaround instead of posting it as an answer. Also the escape characters are unnecessary. – jesse_b May 10 '18 at 17:58
  • I appreciate your help, and I did not intend to be impolite. Sorry if it sounded like that :) Removing the escape chars leads to grep: Invalid range end on my system. – manifestor May 10 '18 at 18:05
  • It will do that when the hyphen is in the middle because it thinks you're trying to create a range. (For example, A-Z is a range specifying any uppercase letter from A to Z.) – jesse_b May 10 '18 at 18:16

1 Answer

To match a literal hyphen with a bracket expression [...], the hyphen needs to be the first or last character within it:

grep '[A-Z_-]' ...

If you put the hyphen anywhere else, it will be taken as specifying a range.

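Also, \ is literal in [...] (if the expression as a whole is quoted in the shell), so [\-] matches a backslash or a hyphen, and [\-_] probably matches a \, ], ^ or _ (these are the characters in the range from \ to _ in the ASCII table). By the same rule, the \-\ in your [A-Z\-\_] is a range from \ to \, so that expression matches an uppercase letter, a backslash or an underscore, but not a hyphen.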
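The parsing rules above can be checked with a short script. This is only a sketch, assuming a POSIX-style grep and the C locale (forced here so that ranges follow ASCII order); matches and report are hypothetical helper names introduced for the illustration, not part of grep.

```shell
#!/bin/sh
# Assumption: the C locale, so that bracket ranges follow ASCII code order.
LC_ALL=C
export LC_ALL

# matches PATTERN STRING (hypothetical helper): succeeds if grep finds
# PATTERN somewhere in STRING.
matches() {
    printf '%s\n' "$2" | grep -q -- "$1"
}

# report PATTERN STRING (hypothetical helper): print whether grep matched.
report() {
    if matches "$1" "$2"; then
        printf '%s matches %s\n' "$1" "$2"
    else
        printf '%s does not match %s\n' "$1" "$2"
    fi
}

report '[A-Z\-\_]' '-'   # \-\ is a range from \ to \, so the set has no hyphen
report '[A-Z\-\_]' '\'   # the backslash itself is in the set
report '[A-Z\-]'   '-'   # the hyphen is last here, so it is literal
report '[\-_]'     ']'   # ] lies between \ and _ in ASCII
report '[A-Z_-]'   '-'   # the form suggested above
```

Under those assumptions, only the first call should report no match. In other locales the range from \ to _ may cover different characters, which is why the locale is pinned.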

Kusalananda