I'm trying to write a major mode using SMIE, both to figure out how it works and to produce some documentation. `build.ninja` (the input format of Ninja, a build system used by Meson and others) is a perfect candidate because of its very simple syntax. So although a `ninja-mode` already exists, I decided to write one based on SMIE and possibly submit it to upstream Emacs.
Syntax showcase (leaving aside that there are a few more keywords and that `rule` only allows special variables):

    rule my_rule_title
      local_var_rule = some text
      command = cc -c $in -o $out
    global_var = some text
    build path/obj.o: my_rule_title path/obj.c
      local_var_build = some text

Basically, `rule` and `build` accept a few parameters and have a body. The body may contain only variable assignments and is distinguished by a non-zero indentation level. So `local_var_rule` is inside a `rule` region, but `global_var` is outside it.
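
The indentation rule above can be sketched in Emacs Lisp as a small predicate. This is only an illustration under the assumption that "inside a body" means "non-blank line with non-zero indentation"; `test-mode-line-in-body-p` is a hypothetical helper name, not part of SMIE or of any existing mode.

```elisp
(defun test-mode-line-in-body-p ()
  "Return non-nil if the current line is non-blank and indented.
Hypothetical helper: such a line would belong to the body of the
preceding `rule' or `build' statement."
  (save-excursion
    ;; Move to the first non-whitespace character of the line.
    (back-to-indentation)
    (and (> (current-column) 0)
         ;; A whitespace-only line is not treated as part of a body.
         (not (eolp)))))
```

With point on the `local_var_rule` line of the showcase this would return non-nil, and on the `global_var` line it would return nil.
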
I have spent some time studying other SMIE-based modes, reading documentation, and writing code. At this point I've monkey-typed something that sort of works, but not properly, and I think the main reason is that I don't know whether my grammar is correct (unlikely). My current grammar is attached at the bottom.
So, here are the questions I couldn't find answers to:

1. Does a grammar have to cover the complete buffer or only the interesting parts?

   To give an example: the `build.ninja` example above has `rule` and `build` paragraphs. Obviously that means I have to write at least two SMIE rules: one to cover the possible appearance of `rule` and another for `build`. But once that's done, do I also write a rule that connects the two at the level of the entire buffer, i.e. to say "the buffer is expected to be composed of `rule`s and `build`s"? Or is having just the two enough?

2. How do I define which characters an identifier may contain? For example, a `build` title may contain slashes and escaped spaces, but variable and `rule` names are not allowed to have them.

3. How do I define a newline as a separator? E.g. a `build` line ends with a newline, and then a region of assignments follows. I tried using `"\n"`, but I'm not sure whether SMIE interprets the backslash, or whether `\n` will work with other newline types.

   - sub-question: how do I define that a line may continue on the next one if the previous line ended with a `$` (i.e. it escapes the newline)? I guess if `"\n"` works, then I just have to create a separate rule for `"$\n"`. But I decided to ask explicitly in case the answer to 3 is more complicated than that.

4. How do I define a non-empty whitespace token, i.e. how do I express that a variable assignment belongs to the previous `build` or `rule`?

My latest attempt is the grammar below. I had other variants that worked incorrectly, and they were incomplete as well. For this post I wrote a more complete version, but it does not compile: it rejects the `text` definition with the error `Adjacent non-terminals: id text`.

    (defvar test-mode-smie-grammar
      (smie-prec2->grammar
       (smie-bnf->prec2
        '((id)
          (path) ;; TODO: define how it's different from `id'
          (statements (statement)
                      (statement "\n" statements))
          (statement (top_decls) (variable))
          (text (id text)
                (text "\n"))
          (variable (id "=" text))
          (build_title (path build_title)
                       (path ":"))
          (top_decls
           ("rule" id)
           ("build" build_title ":" text)
           )
          ))))
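
For reference, the error can be reproduced in isolation with a two-rule grammar. The sketch below is not a fix for the grammar above. It only shows that `smie-bnf->prec2` rejects a right-hand side in which two non-terminals sit side by side, and accepts the same rule once a terminal token separates them. The `","` separator is a made-up token chosen purely for illustration; it is not part of the `build.ninja` syntax.

```elisp
(require 'smie)

;; Rejected: `id' and `text' are both non-terminals and are adjacent,
;; so an error is signalled; here it is caught and shown as a message.
(condition-case err
    (smie-bnf->prec2 '((id)
                       (text (id text))))
  (error (message "%s" (error-message-string err))))

;; Accepted: a terminal (the hypothetical "," token) now separates
;; the two non-terminals.
(smie-bnf->prec2 '((id)
                   (text (id "," text))))
```
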