Can you help with this regex in my sourceslist.nanorc?

Regex:

cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/

Error:

Bad regex "cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/": Invalid range end

Thank you.

AdminBee
bugyt

2 Answers

The problem is likely to be the placement of the - sign in your character list.

You have already used the fact that ranges of characters can be expressed by [start-end], as in [a-z] being shorthand for [abcdefghijkl...xyz] (although see the caveat below). That means that the - is a special character, and if it occurs between two "regular" characters, it is interpreted as indicating yet another range encompassing these two characters and every one in between.

Of course, this only works if the character after the - comes "later" in the sort order than the character preceding it. That is not the case here, which is the reason for your error message (you will see that it goes away if you write (-_ instead, although that will not solve your problem, because it still defines a range rather than a literal -).

Since you obviously want to match the literal -, and depending on how regular expressions are interpreted in the .nanorc, you either

  • have to escape it (i.e. \-), or
  • place it first or last in the character list (i.e. [-etc] or [etc-]) which would be standard in POSIX and GNU regular expressions and therefore the most likely solution on a Linux system.

See e.g. here for further reference.
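
The range behaviour described above can be tried out in isolation. This is a minimal sketch using Python's `re` module, which is an assumption made for illustration only: it is not the regex engine nano uses, and its error wording and escaping rules may differ. The helper name `compiles` is hypothetical.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# '_' (0x5F) comes after '(' (0x28), so "_-(" is a backwards range.
backwards_ok = compiles(r"[_-(]")

# Swapping the endpoints compiles, but it is still a range:
# it covers every character from '(' up to '_', including 'A'.
ranged = re.compile(r"[(-_]")

# With the hyphen last, the set is just the three literals '_', '(' and '-'.
literal = re.compile(r"[_(-]")

print(backwards_ok)                  # False
print(bool(ranged.fullmatch("A")))   # True
print(bool(literal.fullmatch("A")))  # False
print(bool(literal.fullmatch("-")))  # True
```

Placing the hyphen last avoids relying on how a particular engine treats a backslash inside a character list.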

Caveat: The statement above, "[a-z] being shorthand for [abcdefghijkl...xyz]", is not unconditionally true! How the range is interpreted depends on the locale settings, specifically the collation order.

  • In the "C" locale, the order follows the ASCII code values, i.e. ABC...XYZ...abc...xyz. Here, [a-z] actually means "all lowercase letters".
  • In most other locales, upper- and lowercase characters are grouped together, i.e. the order is aAbBcC...xXyYzZ. Here, [a-z] would mean "all lowercase letters and all uppercase letters except Z".
  • The treatment of non-ASCII characters like "umlauts" is yet another issue.

See here and here for further discussions on the subject.
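
To make the two orderings concrete, here is a small sketch in Python that simulates them with plain lists. It does not query any real locale: the `interleaved` list is a hand-built stand-in for the aAbB...zZ order mentioned above, and `chars_in_range` is a hypothetical helper, not part of any regex library.

```python
import string

def chars_in_range(order, start, end):
    """Return the characters from start to end (inclusive) in the given order."""
    i, j = order.index(start), order.index(end)
    return order[i:j + 1]

# "C"-locale-like order: plain ASCII code values.
c_order = [chr(code) for code in range(128)]
c_range = chars_in_range(c_order, "a", "z")

# Simulated interleaved order: a A b B ... z Z (an assumption for illustration).
interleaved = [ch
               for pair in zip(string.ascii_lowercase, string.ascii_uppercase)
               for ch in pair]
mixed_range = chars_in_range(interleaved, "a", "z")

print("".join(c_range))    # abcdefghijklmnopqrstuvwxyz
print(len(mixed_range))    # 51
print("Y" in mixed_range)  # True
print("Z" in mixed_range)  # False
```

In the simulated interleaved order, the range ends at z, so Z, which sorts after z, is the only letter left out.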

AdminBee

Pasting the pattern into https://regexr.com/ makes it easier to debug.

You did not escape the - inside the character list, where an unescaped - is used to specify a range.

Old:

cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/

Correction:

cdrom:\[[a-zA-Z0-9\._\-\(\) ]+\]/
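
As a quick check, the patterns can be run against a sample line. The sketch below uses Python's `re` module, which accepts `\-` as a literal hyphen; whether nano's regex engine treats the backslash the same way is an assumption not verified here. The `SAMPLE` line is made up for illustration, `HYPHEN_LAST` is an alternative spelling that puts the hyphen at the end of the list, and `try_search` is a hypothetical helper.

```python
import re

# Illustrative sources.list-style line; the disc label is invented for this example.
SAMPLE = "deb cdrom:[Ubuntu 20.04 LTS _Focal Fossa_ - Release amd64 (20200423)]/ focal main"

ORIGINAL = r"cdrom:\[[a-zA-Z0-9\._-\(\) ]+\]/"     # pattern from the question
ESCAPED = r"cdrom:\[[a-zA-Z0-9\._\-\(\) ]+\]/"     # hyphen escaped
HYPHEN_LAST = r"cdrom:\[[a-zA-Z0-9._() -]+\]/"     # hyphen placed last, no escapes

def try_search(pattern, text):
    """Return the matched text, None if nothing matches, or an error string."""
    try:
        match = re.search(pattern, text)
    except re.error as exc:
        return f"error: {exc.msg}"
    return match.group(0) if match else None

for name, pattern in [("original", ORIGINAL),
                      ("escaped", ESCAPED),
                      ("hyphen last", HYPHEN_LAST)]:
    print(name, "->", try_search(pattern, SAMPLE))
```

In Python, the original pattern is rejected with a range error, while the other two match the whole `cdrom:[...]/` part of the sample line.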