2

TIL that trying to compile anything with pcase matching in it may lead to an explosion of code generated during expansion. Even the most innocent looking patterns may generate hundreds of branches and thousands of lines of code. TBH I would never have noticed had it not been for a small macro I wrote yesterday. The exact code is unimportant but the gist of it is that it had a pcase with 4 clauses doing some matching on lists and what not. That macro was the last in the project I've been working on, so naturally I attempted to byte-compile. Upon reaching the macro the compiler would just sit there. Out of curiosity I let it - only to be disappointed with Error: Bytecode overflow some 10min later. O..K..

M-x pp-macroexpand-all-last-sexp ... 160K lines of expanded code later I was enlightened. If you ever looked at what pcase list pattern expands into, you wouldn't be surprised. There isn't any magic there, the rewriting rule is straightforward. Problem is it doesn't scale if you perform full macro-expansion, and IIUC that is what byte-compile does. Let's use a tiny contrived example just to get the feel for "scale":

(pcase e
  (`(,a ,(or (or `(0) `[0]) 42)) a))

Full expansion of this is 70 lines of pretty-printed code but you can imagine how a bunch of nested patterns and several clauses create all possible permutations and baloon into thousands. My first thought was that it's the (or .. ..) pattern that's not cleverly handled and replacing it with explicit clauses may help fight branching:

(pcase e
  (`(,a (0)) a)
  (`(,a [0]) a)
  (`(,a 42) a))

Well, macroexpand-all and you get the exact same 70 loc, so, yeah, they're precisely equivalent. I bet pcase actually re-writes one into the other.

It isn't a problem at runtime i.e. if you just eval and run the interpreter, because in 99.9% of cases the match fails at the head, so all those branches never get generated by the expander when you just interpret and match. That's why I never noticed any issues say with performance.

I guess I have several questions after this long-winded diatribe:

  1. How do you byte-compile code that uses pcase?
  2. Maybe you don't compile, but then how do you get reasonable performance?
  3. Maybe you just don't use pcase ever? Is this why so much Elisp is poisoned with caddaar and friends? I really want my pattern matching.
  4. Is there anything obvious I've overlooked?

Didn't get any relevant hits searching emacs-devel archives.

zeRusski
  • 335
  • 1
  • 8
  • Personally, I don't use `pcase` at all. It's ugly and doesn't buy me much. On the rare occasion I need pattern matching, I use `-let` from dash.el. – wasamasa Dec 15 '18 at 20:28
  • You ask multiple questions here. As such it seems too broad for SE and risks being closed as such. This is not a discussion forum. Perhaps try Reddit or emacs-devel@gnu.org or help-gnu-emacs@gnu.org? Or even `M-x report-emacs-bug` if you have a reproducible case that you think is particularly relevant. – Drew Dec 15 '18 at 21:34
  • You could try to rewrite your question as something along the lines of "how do I make `pcase` perform better and `byte-compile` without errors for non-trivial cases?" –  Dec 15 '18 at 22:12
  • 1
    I'm glad the 70-lines output is the same: pcase doesn't rewrite one into the other, but it should indeed generate the same code. BTW, if you don't byte-compile, the macro will still expand the code in exactly the same way before executing it. Also, the fact that the code is very large doesn't mean it will run slowly. And yes, the size can explode in "pathological" cases. If you show your problem code maybe we can help you avoid those pathological behaviors. – Stefan Dec 16 '18 at 01:15
  • @wasamasa ugly as they may be, they are also quite powerful. But hey, this is Lisp so we're in luck. There https://github.com/vkz/multi I fixed them. Could use a code review from an accomplished Emacs hacker. I just got carried away there without any real Elisp knowledge to cash that check. Consider helping out – zeRusski May 08 '19 at 14:01

0 Answers0