3

There is a disassembler for functions, but is there something that will disassemble a bytecode file?

Thoughts on how to accomplish this?

Drew
rocky
    I'd figure out how exactly the .elc format works. It appears to be designed so that you can just `read` the file; if that's indeed the case, you could hopefully do a tree mapping operation and disassemble every bytecode object you come across. – wasamasa Oct 04 '17 at 19:09

2 Answers

4

With cl-print.el (built in as of Emacs 26), this is actually pretty easy to do almost perfectly:

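(require 'cl-lib)    ; cl-loop
(require 'cl-print)  ; cl-prin1, cl-print-compiled
(require 'disass)    ; disassemble-1

(defun disassemble-file (filename)
  "Disassemble every form of the byte-compiled file FILENAME.
The result is shown in the buffer \"*file disassembly*\"."
  (let ((inbuf (find-file-noselect filename)))
    ;; `read' starts at point in INBUF, so rewind it first.
    (with-current-buffer inbuf
      (goto-char (point-min)))
    (with-current-buffer (get-buffer-create "*file disassembly*")
      (erase-buffer)
      (condition-case ()
          ;; Make `cl-prin1' disassemble any byte-code object it prints.
          (cl-loop with cl-print-compiled = 'disassemble
                   for expr = (read inbuf)
                   do (pcase expr
                        ;; Top-level (byte-code "..." [...] N) forms are
                        ;; lists, not function objects, so handle them here.
                        (`(byte-code ,(pred stringp) ,(pred vectorp) ,(pred natnump))
                         (princ "TOP-LEVEL byte code:\n" (current-buffer))
                         (disassemble-1 expr 0))
                        (_ (cl-prin1 expr (current-buffer))))
                   do (terpri (current-buffer)))
        ;; `read' signals `end-of-file' once INBUF is exhausted.
        (end-of-file nil))
      (goto-char (point-min))
      (display-buffer (current-buffer)))))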

The only thing this misses is the bytecode of closures that the disassembled code itself constructs at run time, for example through make-byte-code:

7       constant  make-byte-code
8       constant  0
9       constant  "\301\300!\205    \0\302\300!\207"
10      constant  vconcat
11      constant  vector
12      stack-ref 5
13      call      1
14      constant  [buffer-name kill-buffer]

Perhaps disassemble could be improved to handle this case better.
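
One way to inspect such a closure by hand is to rebuild a function object from the constant bytecode string and disassemble that. The sketch below is only an illustration: `my-disassemble-closure-template` and the `<captured>` placeholder are hypothetical names, and it assumes the captured variables come first in the constants vector, as the vconcat call in the listing above suggests.

```elisp
(require 'disass)  ; provides `disassemble-1'

(defun my-disassemble-closure-template (bytestr constants depth nvars)
  "Return disassembly text for a closure template (hypothetical helper).
BYTESTR is the unibyte bytecode string, CONSTANTS the literal part of
the constants vector, and DEPTH the stack depth.  NVARS captured
variables are assumed to precede CONSTANTS; each one is shown as the
placeholder symbol `<captured>'."
  (let* ((captured (make-vector nvars '<captured>))
         ;; Arglist 0 mirrors the `constant 0' pushed in the listing.
         (fn (make-byte-code 0 bytestr (vconcat captured constants) depth)))
    (with-temp-buffer
      ;; `disassemble-1' inserts its output into the current buffer.
      (disassemble-1 fn 0)
      (buffer-string))))
```

For the listing above, the arguments would be the string constant at offset 9, `[buffer-name kill-buffer]`, whatever stack depth the code pushes afterwards (not visible in the excerpt), and 1 for NVARS, since vector is called with one argument.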

npostavs
2

Byte-code files contain readable forms, just like .el files. The biggest difference is how docstrings are stored, which is explained in the Elisp manual. If you want to dump the disassembly of every byte-code object, you'd have to write code that walks across the expressions in the .elc file and writes them somewhere.
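
As a rough sketch of the walking step, the function below collects every byte-code function object nested inside a form that has already been read. The name `my-collect-bytecode` is hypothetical, and the code assumes the forms contain no circular structure.

```elisp
(defun my-collect-bytecode (form)
  "Return a list of every byte-code function object inside FORM.
Hypothetical helper: descends into lists, vectors, and the constants
vector of each byte-code object it finds."
  (cond
   ((byte-code-function-p form)
    ;; Keep the object itself, then search its constants vector
    ;; (slot 2) for nested functions.
    (cons form (my-collect-bytecode (aref form 2))))
   ((consp form)
    (let ((acc '()))
      ;; Loop over the list spine so long lists do not deepen the
      ;; recursion; only the elements are visited recursively.
      (while (consp form)
        (setq acc (nconc acc (my-collect-bytecode (pop form)))))
      ;; FORM is now nil, or the tail of a dotted list.
      (nconc acc (my-collect-bytecode form))))
   ((vectorp form)
    (mapcan #'my-collect-bytecode (append form nil)))
   (t nil)))
```

Each object in the returned list can then be passed to disassemble.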

  • Yes, I got that from wasamasa's useful comment. I was (and still am) looking for something a little more detailed, such as code like https://github.com/rocky/elisp-decompile/blob/master/elisp/dedis.el#L45-L58 which does the trivial part, but walking the expressions still remains. It also looks like some constants have a value that needs more information filled out. – rocky Oct 05 '17 at 20:44
  • `read` can take a marker as an argument and advances it as it reads, which should make things easier. –  Oct 05 '17 at 22:21
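
A minimal sketch of the marker approach from the comment above, assuming the buffer already holds the .elc text; `my-read-all-forms` is a hypothetical name. Reading through a marker leaves the buffer's point untouched.

```elisp
(defun my-read-all-forms (buffer)
  "Return a list of all forms readable from BUFFER, in order.
Hypothetical helper: reads through a marker, so point in BUFFER
does not move."
  (let ((pos (with-current-buffer buffer
               (copy-marker (point-min))))
        (forms '()))
    (condition-case nil
        ;; Each call to `read' advances POS past the form it returns.
        (while t
          (push (read pos) forms))
      ;; `read' signals `end-of-file' when nothing is left.
      (end-of-file nil))
    ;; Detach the marker so it no longer slows down buffer edits.
    (set-marker pos nil)
    (nreverse forms)))
```

For example, `(my-read-all-forms (find-file-noselect "foo.elc"))` would return the top-level forms of a file, where foo.elc is a placeholder name.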