8

I have a very long line, which I'll summarize as:

(foo),(bar,baz(word,right),(end)

I want to print only:

      (bar,baz(word,right

To match the second parenthesis, I exclude the word that follows the third one:

$ grep -oP "\\(.*(?!word).*right"

but Bash interprets the exclamation mark:

-bash: !word: event not found

Protecting the exclamation mark with single quotes fails with "grep: missing )":

$ grep -oP '\\(.*(?!word).*right'
$ grep -oP '\\((?!word)(.*right)'

Protecting the exclamation mark with a backslash fails with "grep: unrecognized character after (? or (?-".

Any idea?

Note: -P is for Perl regex and -o is to print only the matching part of a line

5 Answers

11

The rules are different for single quotes versus double quotes.

For the reason you show, double quotes can't be used reliably in bash, because there's no sane way to escape an exclamation mark.

$ grep -oP "\\(.*(?!word).*right"
bash: !word: event not found

$ grep -oP "\\(.*(?\!word).*right"
grep: unrecognized character after (? or (?-

The second error occurs because bash passes \! rather than ! through to grep, as this shows:

$ printf '%s' "\!"
\!

When you tried single quotes, the double backslash didn't mean an escaped backslash; it meant two literal backslashes:

$ printf '%s' '\\(.*(?!word).*right'
\\(.*(?!word).*right

Inside single quotes, everything is literal, and there are no escapes, so the way to write the regular expression you're trying is:

$ grep -oP '\(.*(?!word).*right'
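As a rough check of the above, here is a small sketch that prints what grep actually receives for each quoting style and then runs the single-quoted pattern against the sample line from the question. The variable names (line, single, doubled, matched) are just placeholders for illustration, and it assumes a grep built with PCRE support. Note that this only cures the quoting error: because \( matches the first parenthesis and .* is greedy, the match still starts at (foo).

```shell
line='(foo),(bar,baz(word,right),(end)'

# Single quotes: every character is literal, so one backslash is enough.
single='\(.*(?!word).*right'
# Doubling the backslash inside single quotes sends two backslashes to grep.
doubled='\\(.*(?!word).*right'

printf 'single:  %s\n' "$single"
printf 'doubled: %s\n' "$doubled"

# -o prints only the matched part; -P needs a grep built with PCRE support.
matched=$(printf '%s\n' "$line" | grep -oP "$single")
printf 'match:   %s\n' "$matched"
```

The last line should print (foo),(bar,baz(word,right, which is why the other answers change the pattern itself rather than just the quoting.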
Mikel
3

If you want to match from the second opening parenthesis up to (but not including) the next closing parenthesis:

grep -Po '\(.*?\K\([^)]*'

Or portably with sed:

sed -n 's/^[^(]*([^(]*\(([^)]*\).*/\1/p'

To match from the rightmost ( that is not followed by word up to the last occurrence of right after it:

grep -Po '.*\K\((?!word).*right'
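To see how the three commands behave, here is a small sketch that runs each of them on the sample line from the question. The variable names are only for illustration, and the two grep calls assume a grep built with PCRE support (the sed call does not need it).

```shell
line='(foo),(bar,baz(word,right),(end)'

# Lazy .*? stops at the second "(", and \K drops everything matched before it.
by_k=$(printf '%s\n' "$line" | grep -Po '\(.*?\K\([^)]*')

# BRE version: the \( ... \) group captures from the second "(" up to the next ")".
by_sed=$(printf '%s\n' "$line" | sed -n 's/^[^(]*([^(]*\(([^)]*\).*/\1/p')

# Greedy .* backtracks to the rightmost "(" that is not followed by "word".
by_lookahead=$(printf '%s\n' "$line" | grep -Po '.*\K\((?!word).*right')

printf '%s\n' "$by_k" "$by_sed" "$by_lookahead"
```

On this input all three should print (bar,baz(word,right. They can diverge on other lines, since the first two stop at the next closing parenthesis while the third is driven by the position of right.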
2

You can do it with a simple awk command:

$ echo '(foo),(bar,baz(word,right),(end)' | awk -F'),' '{print $2}'
(bar,baz(word,right
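A field separator longer than one character is treated by awk as a regular expression, so the ) in it is being read as regex syntax. As a hedged variant, the sketch below writes the parenthesis as a bracket expression to make the literal intent explicit; the variable names are placeholders.

```shell
line='(foo),(bar,baz(word,right),(end)'

# [)] matches a literal ")", so the separator regex is ")" followed by ",".
second=$(printf '%s\n' "$line" | awk -F'[)],' '{ print $2 }')

printf '%s\n' "$second"
```

This splits the line into (foo, (bar,baz(word,right and (end), and prints the second field. Like the original, it relies on the wanted part being the second ),-separated field.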
cuonglm
0

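Instead, anchor on the surrounding punctuation: a lookbehind for the comma before the opening parenthesis and a lookahead for the ), that follows the match: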

grep -oP '(?<=,)\(.*(?=\),)'

Example

$ echo '(foo),(bar,baz(word,right),(end)' | grep -oP '(?<=,)\(.*(?=\),)'
(bar,baz(word,right

In PCRE, a lookbehind has to match a string of bounded length, so it cannot contain unbounded quantifiers such as .*; a lookahead has no such restriction.

slm
0

If for some reason you cannot use single quotes as suggested in Mikel's answer, you can temporarily turn off history expansion using set +H (turn it back on with set -H), as suggested by glenn jackman in comments.
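A minimal sketch of that approach, using the double-quoted command from the question. The BASH_VERSION guard is my own addition so the snippet does not trip over shells that lack the H option, and it assumes a grep built with PCRE support.

```shell
# History expansion is a bash feature, so only toggle it when running under bash.
if [ -n "${BASH_VERSION:-}" ]; then set +H; fi

line='(foo),(bar,baz(word,right),(end)'

# With history expansion off, "!" inside double quotes reaches grep unchanged,
# and "\\(" is reduced to "\(" by the double quotes.
matched=$(printf '%s\n' "$line" | grep -oP "\\(.*(?!word).*right")
printf '%s\n' "$matched"

# Turn history expansion back on for the rest of the session.
if [ -n "${BASH_VERSION:-}" ]; then set -H; fi
```

History expansion only happens in interactive shells by default, so in a script the set +H line is harmless but not needed. The pattern here is the question's original one, so the match starts at the first parenthesis.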