Note: As comments by Stéphane Chazelas suggest, this answer is somewhat invalidated by the existence of RegEx implementations that do allow an AND-Operator. The reasoning below is still correct in that such an operator only makes sense if you ensure that the imposed conditions are mutually compatible.
I think the answer is that there cannot be an "AND" equivalent of the | operator in RegExes, because in the end, regular expressions perform matching on the character level of the input string (albeit sometimes implicitly via repetition operators), and are thereby directly tied to a particular position in the string (see e.g. this Q&A for a similar discussion).
The point is that if you have an expression of the form (I'm explicitly using awk syntax here because of your question title)
$0 ~ /something(A|B)somethingelse/
this requires the string to have either A or B at the specific position immediately after something and before somethingelse to match. The position requirement can be more dynamic if you have patterns with repetition operators, such as
$0 ~ /[a-f]+(A|B)[0-9]+/
but still, the point is that the occurrence of either A or B is tied specifically to the position after the pattern consisting of only lowercase a ... f (1) and before the pattern consisting of only digits 0 ... 9.
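To make the position argument concrete, here is a small sketch that feeds a few strings to the expression above. The helper name matches_alt is made up for illustration, and LC_ALL=C is set only so that [a-f] behaves as described in footnote (1).

```shell
# Hypothetical helper: report whether a string matches the alternation pattern.
# LC_ALL=C is an assumption so that [a-f] covers exactly the lowercase a ... f.
matches_alt() {
    printf '%s\n' "$1" |
        LC_ALL=C awk '{ print (($0 ~ /[a-f]+(A|B)[0-9]+/) ? "match" : "no match") }'
}

matches_alt 'abcA123'   # A sits between the letters and the digits
matches_alt 'fB7'       # B at that same position works equally well
matches_alt 'abc123A'   # A is present, but not at the required position
```

The third string contains an A, yet it is rejected because the A does not occur between the run of letters and the run of digits.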
There cannot be a corresponding "AND" condition
$0 ~ /something(A&B)somethingelse/
because that would mean that the input string would have to contain A as well as B at the very same position, which obviously wouldn't work.
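As a side note on what happens if you write that expression anyway: in POSIX extended regular expressions, which awk uses, & is an ordinary character, so the pattern simply looks for the literal text A&B. The sketch below checks this; the helper name literal_amp is made up for illustration.

```shell
# Hypothetical helper: test a string against the "AND" pattern as awk reads it.
# In awk's extended regular expressions, & has no special meaning inside a pattern.
literal_amp() {
    printf '%s\n' "$1" |
        awk '{ print (($0 ~ /something(A&B)somethingelse/) ? "match" : "no match") }'
}

literal_amp 'somethingA&Bsomethingelse'   # contains the literal text A&B
literal_amp 'somethingABsomethingelse'    # has A and B, but no literal &
literal_amp 'somethingAsomethingelse'     # only A
```

Only the first string matches, so awk treats the & as plain text here and not as an operator.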
The only use case where an "AND" operator is useful is therefore in describing general properties of the string, where each of the desired properties can be expressed by a single RegEx, e.g. "the string must contain at least one A and at least one B regardless of their exact absolute and relative position". That, however, would again leave us with the && operator for combining multiple expressions, which you said you are not interested in, or with the various alternative formulations of this workaround in @terdon's answer.
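For completeness, a minimal sketch of that "general property" check in awk, once with && and once as a single expression that spells out both possible orders. The helper names has_both and has_both_single are made up for illustration; this is one way to write it, not the only one.

```shell
# Hypothetical helper: combine two independent expressions with &&.
has_both() {
    printf '%s\n' "$1" |
        awk '{ print ((/A/ && /B/) ? "both" : "not both") }'
}

# Hypothetical helper: the same property as one RegEx, by listing both orders.
has_both_single() {
    printf '%s\n' "$1" |
        awk '{ print (($0 ~ /A.*B|B.*A/) ? "both" : "not both") }'
}

has_both 'xxBxxAxx'          # B before A
has_both_single 'xxBxxAxx'   # same result with the single expression
has_both 'xxAxx'             # only A present
```

Note that the single-expression form grows quickly: with three required substrings you would already have to list six orderings, which is why && tends to be the more readable choice.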
(1) in C collating order, at least
A&B is A.*B|B.*A - that is A followed by B or B followed by A, which is exactly the same as your A&B, which is A followed by B or B followed by A. – slebetman Feb 04 '23 at 09:56