I guess it's best to start with an example:

> echo "[20-20:10]Something" | sed -r -e 's/^\[[0-9\:\-]+(.*)$/\1/' 
]Something
> echo "[20-20:10]Something" | sed -r -e 's/^\[[0-9\-\:]+(.*)$/\1/' 
-20:10]Something

The only difference is that I swapped the : and - characters in the regex's character class. So: does the order of characters matter in sed's character classes? It doesn't seem to matter in other regex systems, such as https://regex101.com/.

I cannot find anything about this behaviour on Google, but I would like to know more, because I want to be sure to know what my scripts do.

Tom

2 Answers

There are a few rules. The important one in this case is that - is the range operator, so you can write a-f rather than abcdef inside a class. To include a - as a literal character, it is simplest to make it the last character in the class, but it can also be the first character or an endpoint of a range.

If you want to negate a set of characters, the first character must be ^. To include ^ as a literal, put it anywhere except first.

As ] ends a class there is a special case that allows it to be the first (or second if the first character is ^ to negate the class), so []abc] is a set of 4 characters, a b c or ].
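The three placement rules above can be checked with a few one-liners. This is a minimal sketch, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the variable names and the sample input a]b-c are only illustrative.

```shell
# Literal '-' placed last in the class: digits, ':' and '-' are all consumed.
dash_last=$(printf '%s\n' '[20-20:10]Something' | sed -E 's/^\[[0-9:-]+(.*)$/\1/')
printf '%s\n' "$dash_last"      # ]Something

# Literal ']' placed first in the class: ']', 'a', 'b' and 'c' are replaced,
# while '-' is left alone because it is not in the class.
bracket_first=$(printf '%s\n' 'a]b-c' | sed 's/[]abc]/X/g')
printf '%s\n' "$bracket_first"  # XXX-X

# '^' placed first negates the class: everything except digits is deleted.
negated=$(printf '%s\n' '[20-20:10]' | sed 's/[^0-9]//g')
printf '%s\n' "$negated"        # 202010
```

Note that none of the characters needs a backslash here; position inside the brackets is what makes them literal.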

icarus

Yes, it matters: [0-9\:\-] matches any single character from the set of digits, backslash, colon, or dash, while [0-9\-\:] does not match a dash. In the second expression, the dash signifies a range from the backslash character to the backslash character (backslashes are literal in character classes), so the expression is equivalent to [0-9\:] (or, for that matter, [\0-9:]).

The dash does not signify a range of characters if it's first (possibly after ^) or last in a character class.
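A quick way to see the range in action, as a sketch assuming a sed that accepts -E (the question used GNU sed's equivalent -r): the first command reruns the question's second pattern, and the second feeds it an input containing a backslash to show that the backslash really is a member of the class. The input [2\0]Something is made up for illustration.

```shell
# '\-\' is a range from backslash to backslash, so '-' is not in the class
# and the match stops right after "20".
escaped=$(printf '%s\n' '[20-20:10]Something' | sed -E 's/^\[[0-9\-\:]+(.*)$/\1/')
printf '%s\n' "$escaped"        # -20:10]Something

# The backslash itself is matched by the same class: "2", "\" and "0" are
# all consumed before the closing bracket.
backslash=$(printf '%s\n' '[2\0]Something' | sed -E 's/^\[[0-9\-\:]+(.*)$/\1/')
printf '%s\n' "$backslash"      # ]Something
```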

Also note that sed deals with POSIX regular expressions, which I don't think the site that you link to explicitly supports (see Why does my regular expression work in X but not in Y?).

Kusalananda
  • I see, thanks a lot. That means my problem/error was trying to escape the colon and dash which is allowed for some reason in https://regex101.com . – Tom Feb 25 '20 at 13:09