I am looking for something like thing-at-point for a regexp. For example, I want a function called regexp-at-point-p that returns the matched text if the text around point matches the regexp, looking back no further than the beginning of the current line. This seems tricky. I can't easily use re-search-backward: it won't return a full match because it only matches text up to the current point, and it doesn't always match a complicated regexp.

Here are the two regexps I tested for a #hashtag:

simple regex: #[a-zA-Z]+

complicated regex: \(?:^\|[[:blank:]]\|[](),.:;[{}]\)#\{1\}\(?1:[^#[:digit:]]\(?:[^ #+[:punct:][:space:]]+[-_]?\)+\)\(?:[[:blank:]]\|$\|[[:punct:]]\)

With the point inside the #hashtag, I would expect regexp-at-point-p to return "#hashtag" for both of these regexps.

My current solution is this. It does not match multiline expressions before the current line, which is fine for me, but it works on both regexps above.

(defun regexp-at-point-p (regexp)
  "Return match if the text around point matches REGEXP."
  (save-excursion
    (let ((p (point))
          (lbp (line-beginning-position))
          (match nil))
      ;; Step back one character at a time until REGEXP matches at point
      ;; or the beginning of the line is reached.
      (while (and (not (setq match (looking-at regexp)))
                  (> (point) lbp))
        (backward-char))
      ;; Accept the match only if it spans the original point.
      (when (and match
                 (>= p (match-beginning 0))
                 (<= p (match-end 0)))
        (match-string 0)))))

It works, but it doesn't seem like the right way to do this. I tried an alternate version that would go to the beginning of the line and use re-search-forward, but it had a similar form.

This version seems like it should work (and it does for the simple regexp, but not for the complicated one). A similar approach with looking-back has the same limitation.

(defun regexp-at-point-p (regexp)
  "Return match if the text around point matches REGEXP."
  (save-excursion
    (let ((p (point))
          (lbp (line-beginning-position)))
      (when (and (re-search-backward regexp lbp t 1)
                 ;; re-search-backward can give partial matches up to point.
                 ;; This sets match data to the whole pattern.
                 (looking-at regexp)
                 (>= p (match-beginning 0))
                 (<= p (match-end 0)))
        (match-string 0)))))

Is there a more canonical way to do this? Or is this just not a good way to check if point is on a regexp?
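To make the expected behavior concrete, here is a minimal sketch of the stepping-back idea that can be evaluated in a temporary buffer. The name `my/regexp-around-point` is hypothetical and not part of the code above. As an assumption of this sketch, it keeps stepping back past matches that end before point instead of stopping at the first match, and it only checks the simple regexp.

```elisp
;; Hypothetical helper for illustration only (not from the original post).
(defun my/regexp-around-point (regexp)
  "Return the text matching REGEXP around point on the current line, or nil."
  (save-excursion
    (let ((p (point))
          (lbp (line-beginning-position))
          (found nil)
          (done nil))
      ;; Try each start position from point back to the line beginning.
      (while (not (or found done))
        (cond
         ;; A match starting here counts only if it reaches the original point.
         ((and (looking-at regexp) (<= p (match-end 0)))
          (setq found (match-string-no-properties 0)))
         ;; Otherwise step back, but never past the line beginning.
         ((> (point) lbp) (backward-char))
         (t (setq done t))))
      found)))

;; Usage: point is placed inside "#hashtag" (between "a" and "s").
(with-temp-buffer
  (insert "some #hashtag here")
  (goto-char 9)
  (my/regexp-around-point "#[a-zA-Z]+"))   ; => "#hashtag"
```

With point at position 3 (inside "some") the same call returns nil, because no start position on the line yields a match that reaches point.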

John Kitchin
  • The question is unclear (underspecified), IMO. What does `if the text around the point matches the regexp` mean? Define *"the text around the point"*. Starting where, ending where? How far back & forward? Do you skip over leading/trailing whitespace? – Drew Feb 07 '20 at 17:18
  • Maybe `looking-at` and `looking-back` would help, depending on what you're really trying to do? – Drew Feb 07 '20 at 17:18
  • Your version with `re-search-backward` starting at point cannot work in general since the full match must be **before point**. – Tobias Feb 07 '20 at 17:33
  • I added some clarification that the bounds are the current line. `looking-back` fails for me with the complicated regexp I described. `looking-at` is what I use, but this only works looking forward, so I have to move the point back a char at a time until it either matches, or hits the beginning of the line (my artificially chosen boundary). – John Kitchin Feb 07 '20 at 17:34
  • IMHO your method with `backward-char` is the only viable solution. The alternative would be to go to a user-defined boundary and `XXX-search-regexp` in the opposite direction until point is within the match. That method can always fail, since a match near point that does not include point can cover the start of a match that does include point, and the covered match would not be found by the regexp search. (Try the following experiment: insert the sequence `1212121212` in `*scratch*` and search for `1212`. The regexp search only finds 2 matches. It could actually find 4 overlapping matches.) – Tobias Feb 07 '20 at 18:06
  • Note that `looking-back` differs in [that respect](https://emacs.stackexchange.com/questions/55364/is-point-in-a-regexp#comment86610_55364) significantly from your problem. – Tobias Feb 07 '20 at 18:09
  • One particularly difficult thing to tackle is that one does not know how long the match for a regexp can become without analyzing the regexp. So one does not know how far one needs to step backwards to get all potential matches. That makes the user-defined boundary mandatory, especially since one must always handle the case without a match. – Tobias Feb 10 '20 at 06:14
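
The `1212121212` experiment from the comments can be reproduced with a short sketch. Both function names are hypothetical. The first counts the matches a repeated `re-search-forward` reports, and the second counts every position where `looking-at` succeeds, which is what stepping one character at a time effectively checks.

```elisp
;; Hypothetical helpers for illustration only.
(defun my/count-search-matches (regexp string)
  "Count matches of REGEXP in STRING found by repeated `re-search-forward'.
Assumes REGEXP never matches the empty string."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((n 0))
      ;; Each search resumes at the end of the previous match,
      ;; so overlapping matches are skipped.
      (while (re-search-forward regexp nil t)
        (setq n (1+ n)))
      n)))

(defun my/count-start-positions (regexp string)
  "Count positions in STRING where `looking-at' matches REGEXP."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((n 0))
      ;; Test every position, one character at a time.
      (while (not (eobp))
        (when (looking-at regexp)
          (setq n (1+ n)))
        (forward-char))
      n)))

(my/count-search-matches "1212" "1212121212")    ; => 2
(my/count-start-positions "1212" "1212121212")   ; => 4
```

The difference between the two counts is the set of overlapping matches that a search from a fixed boundary never reports.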