Is \r not working for grep in Extended Regular Expression mode? For example:

$ printf 'abcd\r\n' | grep -Ec 'd\r$'
0

$ printf 'abcd\r\n' | grep -c 'd.$'
1

$ printf 'abcd\r\n' | grep -Pc 'd\r$'
1

I thought \r was part of Extended Regular Expressions, as described in https://valelab4.ucsf.edu/svn/3rdpartypublic/boost/libs/regex/doc/html/boost_regex/syntax/basic_extended.html. No?
Or is it indeed a limitation of grep?

xpt

1 Answer

No, \r is not part of either standard basic or extended regular expressions (except in awk), though some grep implementations support it as an extension, such as the grep from ast-open, which supports it in all its regexp flavours (with -E, -X, -P and with the default BRE).
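The awk exception mentioned above can be checked with a small sketch like the one below. It assumes a POSIX-style awk (where \r is recognised as an escape inside a regex literal); the variable name awk_count is only illustrative.

```shell
# awk's ERE tokens accept the \r escape, unlike standard grep -E.
# n+0 forces a numeric result so "0" is printed when nothing matches.
awk_count=$(printf 'abcd\r\n' | awk '/d\r$/ { n++ } END { print n + 0 }')
printf '%s\n' "lines ending in d<CR>: $awk_count"
```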

It is part of perl regular expressions though, as well as PCRE ones, so it should be supported by grep implementations that provide a -P option for those.
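Since -P is not available in every grep, a script may want to probe for it first. The following is only a sketch of such a probe; the variable name pcre_cr is hypothetical, and it relies on grep exiting with a non-zero status both when nothing matches and when the option is rejected.

```shell
# Probe whether this grep accepts -P and understands \r in that mode.
# grep -q exits 0 on a match, 1 on no match, and >1 on error
# (for instance when -P is not supported by this build).
if printf 'abcd\r\n' | grep -Pq 'd\r$' 2>/dev/null; then
  pcre_cr=yes
else
  pcre_cr=no
fi
printf '%s\n' "grep -P handles the CR escape: $pcre_cr"
```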

Most shells these days support the $'...' form of quotes from ksh93, where \r is expanded to a carriage return. So with those, you can do:

grep $'d\r$'
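For shells that lack the $'...' quotes, one possible alternative (a sketch, not the only way) is to put a literal carriage return into a variable and interpolate it into the pattern. The variable names cr, bre_count and ere_count are illustrative.

```shell
# Command substitution strips only trailing newlines, so the CR survives.
cr=$(printf '\r')

# The pattern now contains a literal CR byte rather than the \r escape,
# so it works with both the default BRE and with -E.
bre_count=$(printf 'abcd\r\n' | grep -c "d${cr}\$")
ere_count=$(printf 'abcd\r\n' | grep -Ec "d${cr}\$")
printf '%s\n' "BRE: $bre_count, ERE: $ere_count"
```

Because the shell, not grep, produces the CR here, this does not depend on any regexp extension.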

PCRE allows specifying the type of line delimiter with directives such as (*LF), (*CRLF) or (*CR), but that can't be used with grep -P, even in implementations where perl-like matching is done using PCRE: grep works on the contents of one (LF-delimited) line at a time, so the LF is never present in the string that the regexp is matched against.

It can, however, be used in pcregrep's multiline mode:

$ printf '%s\r\n' foo abcd bar | pcregrep -M '(*CRLF)d$' | sed -n l
abcd\r$

(sed -n l to reveal the CR as \r).

With GNU grep, you could use it with the -z flag that makes it work on NUL-delimited records instead of lines:

$ printf '%s\r\n' foo abcd bar | grep -oPz '(*CRLF)(?m).*d$' | tr '\0' '\n' | sed -n l
abcd$

(also enabling the multiline flag for $ to match at the end of each line in addition to at the end of the record, and transliterating the NULs to LF on output for display).