
Let's say I have a language which looks like this:

anchor (item something item some thing item some item thing) item item

and that I'm playing with font-lock a bit. I would like to highlight "anchor" as being a function and every "item" in the parens as well.

I tried the following:

(setq test-font-lock-keywords
      '(("\\<anchor\\>"
         (0 font-lock-function-name-face)
         ("\\<item\\>" nil nil (0 font-lock-constant-face)))))

(define-derived-mode test-mode fundamental-mode "test mode"
  "Test mode."
  ;; `define-derived-mode' already makes the command interactive and
  ;; resets buffer-local variables, so neither is repeated here.
  (setq font-lock-defaults '(test-font-lock-keywords)))

My problem is that this highlights every "item" that comes after "anchor" up to the end of the line. The trailing "item item" in my example should not be highlighted. There may be a way to achieve this with a post-form, but I'm not sure that's what it is meant for, or how to do it.

So I tried with the following, much less sexy regular expression:

(setq regexp "\\(\\<anchor\\>\\)[\t ]*(\\(?:\\(item\\)\\|[^)]\\)*)")

(setq test-font-lock-keywords
      `((,regexp (1 font-lock-function-name-face) (2 font-lock-constant-face))))

But it only highlights the last "item" found before the closing paren. How can I make this regexp capture every "item", represented here by group 2?

1 Answer


There is a little-known secret hidden in the pre-match form. This is from the built-in help of the font-lock-keywords variable:

if PRE-MATCH-FORM returns a position greater than the position after PRE-MATCH-FORM is evaluated, that position is used as the limit of the search.

Below, the pre-match form sets the search limit to the end of the parenthesized list that follows "anchor":

'(("\\_<anchor *("
   (0 font-lock-function-name-face)
   ("\\_<item\\_>"
    ;; Pre-match form -- limit the sub-search to the end of the argument list.
    (save-excursion
      (goto-char (match-end 0))
      (backward-char)
      (ignore-errors
        (forward-sexp))
      (point))
    ;; Post-match form
    (goto-char (match-end 0))
    (0 font-lock-constant-face))))

Note: If you set font-lock-multiline to a non-nil value, this works even when the anchor construct spans multiple lines.
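
For completeness, here is a minimal sketch of how the rule above might be wired into the mode from the question. The names test-mode and test-font-lock-keywords come from the question; the helper test--arglist-end is hypothetical and only wraps the pre-match form shown earlier. The sketch assumes the parenthesized list is balanced. If it is not, forward-sexp fails, the returned position is not past point, and font-lock falls back to its default limit (the end of the line).

```elisp
(defun test--arglist-end ()
  "Return the position after the list whose opening paren ended the last match.
Hypothetical helper: assumes the previous match ended right after the paren."
  (save-excursion
    (goto-char (match-end 0))
    (backward-char)                  ; move back onto the opening paren
    (ignore-errors (forward-sexp))   ; jump past the matching closing paren
    (point)))

(defvar test-font-lock-keywords
  '(("\\_<anchor *("
     (0 font-lock-function-name-face)
     ("\\_<item\\_>"
      (test--arglist-end)            ; pre-match form: its value is the search limit
      (goto-char (match-end 0))      ; post-match form, as in the rule above
      (0 font-lock-constant-face))))
  "Font-lock keywords for `test-mode'.")

(define-derived-mode test-mode fundamental-mode "Test"
  "Major mode illustrating an anchored font-lock matcher."
  (setq font-lock-defaults '(test-font-lock-keywords))
  ;; Let the anchored search continue past the end of the line.
  (setq-local font-lock-multiline t))
```

To try it, evaluate the forms, open a scratch buffer containing the example line from the question, and run M-x test-mode. Only the "item" words inside the parentheses should pick up font-lock-constant-face.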

Lindydancer
  • Is that a secret? I think that is the purpose of the PRE-MATCH form. `The forms pre-form and post-form can be used to initialize before, and cleanup after, anchored-matcher is used. Typically, pre-form is used to move point to some position relative to the match of matcher, before starting with anchored-matcher. post-form might be used to move back, before resuming with matcher.` from: https://www.gnu.org/software/emacs/manual/html_node/elisp/Search_002dbased-Fontification.html . ps: I love font-lock-studio. Thanks for the effort and contribution. Feels like sorcery. – Cheeso Sep 20 '18 at 15:28
  • The pre- and post-match forms are often used for the things you quoted (moving the point around, cleanup, etc.). The "secret" I was referring to is that the *value* the pre-match form evaluates to, when an integer, is used by font-lock as the search limit for the inner anchored regexp. -- Thanks for your kind words regarding font-lock-studio. I've been writing font-lock rules for a long time and had the idea in my head for quite some time. Once, when I got stuck with one of my font-lock packages, I put it on hold, wrote font-lock-studio, and used that to finish the original package. – Lindydancer Sep 21 '18 at 19:50