Parsing parentheses: smie vs syntax table

Question

I am maintaining a mode for a programming language with... let's say "annoying" syntax constructs.

For example, the angle brackets (no idea if that is the correct word) < and > are parentheses. However, -> is also a valid token: it does not count as a closing parenthesis and, to top it all, should only occur inside a pair of < and >.

If I understand correctly, there is no way to get Emacs to recognise this using only syntax tables. Is that correct?

The mode already uses smie for what cannot be done with syntax tables, and smie is also the canonical way to define multi-character parens. So I guess that I will have to get smie to replace the syntax table for this pair of parentheses.

Hence the question: what needs to be set up in smie in order to replace the syntax-table features for a particular paren pair?

The question is motivated by the manual: it is advised to let syntax tables do their job whenever possible. Hence, I'm afraid that with a naive set-up, I will run into problems later, or worse, that it will cause problems behind the scenes.

What I can think of (aka the "naive set-up" mentioned above):

- removing the paren category for the characters < and > in the syntax table;
- adding ("<" whatever ">") in the BNF grammar;
- making the lexer recognize the token < (resp. >) as "<" (resp. ">") instead of "".
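
For concreteness, the first two steps of that naive set-up might look roughly like the sketch below. All names (`my-lang-naive-syntax-table`, `my-lang-smie-grammar`, the `exp` and `id` nonterminals) are hypothetical, the grammar is reduced to the bare minimum, and the lexer step is left out. It only illustrates what is being asked about, not a recommended configuration.

```elisp
(require 'smie)

;; Hypothetical names throughout; this only sketches the first two steps.
(defvar my-lang-naive-syntax-table
  (let ((st (make-syntax-table)))
    ;; Step 1: demote < and > from parens to plain punctuation.
    (modify-syntax-entry ?< "." st)
    (modify-syntax-entry ?> "." st)
    st)
  "Syntax table in which < and > are no longer parens (illustration only).")

(defconst my-lang-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    ;; Step 2: "<" ... ">" becomes an opener/closer pair in the BNF,
    ;; and "->" is an ordinary infix operator.
    '((id)
      (exp (id)
           ("<" exp ">")
           (exp "->" exp)))
    ;; Resolve the "->" vs "->" precedence conflict.
    '((assoc "->"))))
  "Toy smie grammar treating < and > as parens (illustration only).")
```
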

Is it enough?

And, in case it actually is, bonus question: is there any good reason to leave the handling of sexps to the syntax table instead of smie?

score 7 · Accepted Answer · answered Dec 05 '14 at 13:50

Speed is a good reason to let the paren-matching be performed by syntax tables where possible. In your case, the "parens" are not multi-char, so you can definitely use syntax tables for them. To avoid treating -> as a paren closer, you can set up a syntax-propertize-function which changes the syntax of each > that appears in ->. Something like:

(setq-local syntax-propertize-function
            (syntax-propertize-rules
             ("-\\(>\\)"
              (1 (let ((ppss (save-excursion (syntax-ppss (match-beginning 0)))))
                   (and (nth 1 ppss) ;; Inside at least one level of parens.
                        (eq ?< (char-after (nth 1 ppss))) ;; Last level is <...>
                        (string-to-syntax ".")))))))
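
To show where such a rule would live, here is a minimal sketch that wires it into a major mode. The mode name `my-lang-mode`, its syntax table, and the sample text are assumptions made for illustration; the syntax table gives < and > paren syntax, and the propertize rule is the one above.

```elisp
;; Sketch only: `my-lang-mode' and its syntax table are hypothetical names.
(defvar my-lang-mode-syntax-table
  (let ((st (make-syntax-table)))
    ;; Let the syntax table treat < and > as a matching paren pair.
    (modify-syntax-entry ?< "(>" st)
    (modify-syntax-entry ?> ")<" st)
    st)
  "Syntax table for the hypothetical `my-lang-mode'.")

(define-derived-mode my-lang-mode prog-mode "MyLang"
  "Hypothetical major mode used to illustrate the rule above."
  ;; Make the scanning commands honor `syntax-table' text properties.
  (setq-local parse-sexp-lookup-properties t)
  (setq-local syntax-propertize-function
              (syntax-propertize-rules
               ("-\\(>\\)"
                (1 (let ((ppss (save-excursion
                                 (syntax-ppss (match-beginning 0)))))
                     (and (nth 1 ppss)   ; inside at least one paren level
                          (eq ?< (char-after (nth 1 ppss))) ; innermost is <
                          (string-to-syntax "."))))))))

;; Quick check: does a sexp motion skip over the whole <...> group?
(with-temp-buffer
  (my-lang-mode)
  (insert "<a -> b>")
  (syntax-propertize (point-max))   ; apply the rule to the whole buffer
  (goto-char (point-min))
  (forward-sexp 1)
  (eobp))
```

If the rule applies as intended, the > of -> gets punctuation syntax, so the motion should only stop after the final >, at the end of the buffer.
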

Stefan

Aah, good ol' XY problem; good thing I added the background before the question, then. Thanks! I'm not sure I completely understand your code, though: is it absolutely necessary to check that we are inside a pair of `<` and `>`? I mentioned it in the question because it causes this token to *always* confuse the parser, but where would be the harm in "defusing" every `>` that follows a `-`? – T. Verron Dec 05 '14 at 13:59
There'd be no harm to Emacs. I just wrote the code this way to show how to do it, if you need it. But if `->` won't appear inside `(...)`, or if when it does you don't want it to close the `(`, then you don't need to be so picky. Then you can just use `(syntax-propertize-rules ("-\\(>\\)" (1 ".")))`. – Stefan Dec 05 '14 at 14:05
It shouldn't appear outside `<...>` (not in syntactically correct code), and if it did I wouldn't want it to close anything (it is an operator which sadly happens to contain a parenthesis). Actually, now that you mention it, I guess that with your code, if `->` appears outside a `<...>` construct, instead of closing the `(`, the parser will complain about a mismatched parenthesis, won't it? – T. Verron Dec 05 '14 at 14:13
Yes and no: the paren-mismatch is only checked in some rare circumstances (e.g. when you type it), but not when you do `C-M-f`. – Stefan Dec 05 '14 at 19:11
