Let's look at what (rx "do[^:]") expands to:
ELISP> (rx "do[^:]")
"do\\[\\^:]"
This is...not what you want. What does it match?
ELISP> (string-match (rx "do[^:]") "do[^:]t")
0 (#o0, #x0, ?\C-@)
What's going on here? From the rx documentation:
(rx &rest REGEXPS)
Translate regular expressions REGEXPS in sexp form to a regexp string.
REGEXPS is a non-empty sequence of forms of the sort listed below.
The following are valid subforms of regular expressions in sexp
notation.
STRING
matches string STRING literally.
So if we call rx with the argument "do[^:]", Emacs will look for the string "do[^:]" literally. That's not what you want.
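One way to convince yourself of this is to compare the rx output with regexp-quote, which escapes a string so that it matches literally. This is a small sketch rather than anything from the rx documentation; the comments show the results I would expect in IELM:

```elisp
;; rx treats a bare string as literal text, so its output should agree
;; with `regexp-quote' applied to the same string.
(string= (rx "do[^:]") (regexp-quote "do[^:]"))   ; expected: t

;; The literal pattern does not match an ordinary word starting with "do".
(string-match (rx "do[^:]") "dot")                ; expected: nil
```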
Later in the documentation:
‘(any SET ...)’
matches any character in SET .... SET may be a character or string.
SET may also be the name of a character class: ‘digit’,
‘control’, ‘hex-digit’, ‘blank’, ‘graph’, ‘print’, ‘alnum’,
‘alpha’, ‘ascii’, ‘nonascii’, ‘lower’, ‘punct’, ‘space’, ‘upper’,
‘word’, or one of their synonyms.
‘(not (any SET ...))’
matches any character not in SET ...
To search for any character that is not a colon, we can use:
ELISP> (rx "do" (not (any ":")))
"do[^:]"
And this seems like it works:
ELISP> (string-match (rx "do" (not (any ":"))) "dot")
0 (#o0, #x0, ?\C-@)
But there's a slight issue:
ELISP> (string-match (rx "do" (not (any ":"))) "do")
nil
Let's look back at the generated regex:
ELISP> (rx "do" (not (any ":")))
"do[^:]"
This says: a letter d, a letter o, and then any character that is not a colon. That is, the match always consumes three characters, so the two-character string "do" can never match. Let's handle this.
Since strings where "do" is followed by another character are already handled correctly, the only case left to worry about is "do" at the very end of the string. We can detect this by using string-end.
‘string-end’, ‘eos’, ‘eot’
matches the empty string, but only at the end of the
string being matched against.
We'll look for "do", followed by either the end of the string or a non-colon character:
ELISP> (rx "do" (or string-end (not (any ":"))))
"do\\(?:\\'\\|[^:]\\)"
ELISP> (string-match (rx "do" (or string-end (not (any ":")))) "dots")
0 (#o0, #x0, ?\C-@)
ELISP> (string-match (rx "do" (or string-end (not (any ":")))) "do")
0 (#o0, #x0, ?\C-@)
Our non-matching case:
ELISP> (string-match (rx "do" (or string-end (not (any ":")))) "do:")
nil
This works. The only thing that might trip you up is that string-match searches the whole string, so it will find a later "do" that satisfies the regex even if do: appears earlier:
ELISP> (string-match (rx "do" (or string-end (not (any ":")))) "do: A thing I do.")
14 (#o16, #xe, ?\C-n)
This can be handled differently, depending on what you're looking for.
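For example, if only a "do" at the very beginning of the string should count (an assumption about your use case, not something stated in the question), one option is to anchor the pattern with string-start. The variable name below is made up for illustration:

```elisp
;; Assumption: only a "do" at the very start of the string should count.
;; `my/do-not-colon-at-start' is a hypothetical name.
(defvar my/do-not-colon-at-start
  (rx string-start "do" (or string-end (not (any ":"))))
  "Match \"do\" at the start of a string when no colon follows it.")

(string-match my/do-not-colon-at-start "do: A thing I do.")  ; expected: nil
(string-match my/do-not-colon-at-start "dots")               ; expected: 0
(string-match my/do-not-colon-at-start "do")                 ; expected: 0
```

With the anchor in place, the later "do." in the first string is no longer considered, because the match is only attempted at position 0.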