I'm trying to write a mode in SMIE, to figure out how it works and to create some documentation.
build.ninja (the file format of Ninja, a build system used by Meson and others) is a perfect candidate due to its very simple syntax. So despite there already being a ninja-mode, I decided to create one based on SMIE and possibly get it included in upstream Emacs.
Syntax showcase (simplified: there are a few more keywords, and rule only allows special variables):
rule my_rule_title
  local_var_rule = some text
  command = cc -c $in -o $out
global_var = some text
build path/obj.o: my_rule_title path/obj.c
  local_var_build = some text
Basically, rule and build accept a few parameters and have a body. The body only allows variable assignments and is characterized by a non-zero indentation level. So you can see local_var_rule is inside a rule region, but global_var is outside it.
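To make the "non-zero indentation" criterion concrete, here is a minimal sketch of a predicate that checks it for the current line. The name my-ninja-line-in-body-p is made up for illustration, and this is independent of SMIE: it only tests whether the line starts with whitespace followed by some text.

```elisp
(defun my-ninja-line-in-body-p ()
  "Return non-nil if the current line is indented and not blank.
Hypothetical helper: an indented line is taken to belong to the body
of the preceding rule or build."
  (save-excursion
    (beginning-of-line)
    ;; At least one space or tab, followed by a non-whitespace character.
    (looking-at-p "[ \t]+[^ \t\n]")))
```

With point on the local_var_rule line of the showcase above this returns non-nil, and on the global_var line it returns nil.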
I have spent some time studying other SMIE-based modes, reading documentation, and writing code. At this point I've monkey-typed something that works, but not really properly, and I think the main reason is that I don't know whether my grammar is correct (unlikely). My current grammar is attached at the bottom.
So, here are questions I didn't find answers to:
1. Does a grammar have to cover the complete buffer or only the interesting parts?
   To give an example: the build.ninja example above has rule and build paragraphs. Obviously that means I have to write at least two SMIE rules: one to cover possible appearances of rule and another for build. But once that's done, do I also write a rule that connects the two on the level of an entire buffer, i.e. to say "the buffer is expected to be composed of rules and builds"? Or is having just the two enough?
2. How do I define what symbols an identifier contains? For example, a build title may contain slashes and escaped spaces, but variable and rule names are not allowed to have them.
3. How do I define newline as a separator? E.g. a build ends with a newline, and then follows a region of assignments. I tried using a "\n", but I'm not sure whether SMIE interprets the backslash, nor whether a \n will work with other newline types.
   - sub-question: defining that a line is allowed to continue on the next one if the previous line ended with a $ (i.e. escapes the newline). I guess if "\n" works, then I just have to create a separate rule for "$\n". But I decided to ask that explicitly in case the answer to 3 is more complicated than that.
4. How do I define a non-zero space token, that is, to define that the variable assignment belongs to the previous build or rule?
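To make questions 2 and 3 more concrete, here is a rough sketch of a hand-written lexer function of the kind smie-setup accepts through its :forward-token keyword. Everything here is an assumption for illustration: the name my-ninja-forward-token, the choice to return an unescaped newline as the token "\n", and the regexp for what counts as a word. It is not a claim about how SMIE expects this to be done.

```elisp
(require 'smie)

(defun my-ninja-forward-token ()
  "Hypothetical lexer sketch: move past the next token and return it.
An unescaped newline is returned as \"\\n\".  A word may contain
slashes and `$'-escaped characters such as an escaped space."
  (skip-chars-forward " \t")
  (cond
   ;; `$' right before a newline is a line continuation: skip and retry.
   ((looking-at "\\$\n")
    (goto-char (match-end 0))
    (my-ninja-forward-token))
   ;; A plain newline becomes an explicit separator token.
   ((eq (char-after) ?\n)
    (forward-char 1)
    "\n")
   ;; `:' and `=' are single-character tokens.
   ((looking-at "[:=]")
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   ;; A word: a run of `$'-escaped characters or of characters other
   ;; than whitespace, `:', `=' and `$'.
   ((looking-at "\\(?:\\$.\\|[^ \t\n:=$]\\)+")
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   ;; Anything else (e.g. end of buffer) goes to SMIE's default lexer.
   (t (smie-default-forward-token))))
```

A matching function for the :backward-token keyword would be needed as well; it is not shown here. Whether a lexer like this is the right place to distinguish a build path from a rule name is exactly what question 2 is asking.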
My last attempt is the grammar below. I had some other variants that worked incorrectly, but they were incomplete as well. For this post I created a more complete version, but it does not compile for me because it doesn't like the text definition: it throws Adjacent non-terminals: id text.
(defvar test-mode-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      (path) ;; TODO: define how it's different from `id'
      (statements (statement)
                  (statement "\n" statements))
      (statement (top_decls) (variable))
      (text (id text)
            (text "\n"))
      (variable (id "=" text))
      (build_title (path build_title)
                   (path ":"))
      (top_decls
       ("rule" id)
       ("build" build_title ":" text))))))
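For comparison, here is a reduced variant that should get past the Adjacent non-terminals error. The error comes from the (id text) alternative, where two non-terminals sit side by side with no token between them. This sketch sidesteps that by assuming the lexer hands over everything after "=" or ":" as a single id, which the default lexer does not do. The name my-ninja-smie-grammar is made up, and this is only meant to show a shape that smie-bnf->prec2 accepts, not a grammar that indents build.ninja correctly.

```elisp
(require 'smie)

(defvar my-ninja-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      ;; Statements are separated by a "\n" token, which only exists if
      ;; the lexer actually returns newlines as tokens (an assumption).
      (statements (statement)
                  (statement "\n" statements))
      ;; Every non-terminal is separated from the next by a token, so
      ;; no two non-terminals are adjacent.
      (statement ("rule" id)
                 ("build" id ":" id)
                 (id "=" id)))))
  "Hypothetical reduced grammar; values and paths are treated as one `id'.")
```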