
Short version: Can I make Emacs show \ff or \xff instead of \377?

Long version: Suppose you open a file that is not entirely text and contains some binary data (say a PostScript or PDF file). For example, suppose you open the GNU Emacs Reference Card (PDF).

Screenshot of Emacs (Aquamacs) viewing refcard.pdf

Then, for bytes outside the ASCII printable range (32–126),

  • Emacs shows the "high" bytes (bytes with value 128 to 255) as octal escape sequences: 128 is shown as \200, 129 is shown as \201, …, 255 is shown as \377.
  • Emacs shows the bytes 0 to 31 (other than byte 9 which is shown as a tab not ^I, and byte 10 which is shown as a newline not ^J) as a caret followed by the character that is 64 ahead: byte 0 is shown as ^@, byte 1 is shown as ^A, …, byte 26 is shown as ^Z, byte 27 is shown as ^[, …, byte 31 is shown as ^_. Also, Emacs shows byte 127 as ^?.
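To make the mapping above concrete, here is a rough sketch that reproduces the two notations in Emacs Lisp. The function name `my-default-byte-notation` is made up for illustration; it only mimics the notation described above and is not the code Emacs itself uses for display.

```elisp
;; Hypothetical helper: mimics the default notation described above.
(defun my-default-byte-notation (byte)
  "Return a string resembling how Emacs shows BYTE (0-255) by default."
  (cond ((memq byte '(9 10)) (string byte))        ; tab and newline appear as themselves
        ((< byte 32) (format "^%c" (+ byte 64)))   ; caret plus the character 64 ahead
        ((= byte 127) "^?")                        ; DEL
        ((> byte 127) (format "\\%03o" byte))      ; three-digit octal escape
        (t (string byte))))                        ; printable ASCII, 32 to 126

(my-default-byte-notation 0)    ; caret notation: ^@
(my-default-byte-notation 27)   ; caret notation: ^[
(my-default-byte-notation 255)  ; octal escape: \377
```

What I am after is the same thing with `"\\%02x"` (or `"\\x%02x"`) in place of `"\\%03o"` for the high bytes.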

I know that the reason Emacs shows octal is historical: at some point a few decades ago, octal was more commonly in use. (For example, man ascii starts with octal first, and TeX supports octal escape sequences.) But as octal is less useful than hexadecimal these days (e.g. to compare with the output of hexdump or Python byte-string representations), I'd like to see hexadecimal escape sequences. How can I change this?

(Note: the octal escape sequences are shown highlighted instead of looking like regular text, and of course it's not possible to step "into" the escape character (i.e. hitting C-f at the point before \343 takes you to the point after \343); I'd like to retain this.)

ShreevatsaR

4 Answers


edit: With Emacs 26.1 or later, it's a (setq display-raw-bytes-as-hex t) away.

No, you can't. The display of unprintables above the printable ASCII range is hardcoded in xdisp.c:

if (CHAR_BYTE8_P (c))
  /* Display \200 instead of \17777600.  */
  c = CHAR_TO_BYTE8 (c);
len = sprintf (str, "%03o", c + 0u);

I sent a patch fixing this to debbugs.
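For the Emacs 26.1+ route mentioned in the edit at the top, a minimal init-file sketch could look like the following. The `boundp` guard is my own addition, on the assumption that the same init file may also be loaded by older Emacs versions, where the variable does not exist.

```elisp
;; Sketch for an init file. The guard is an assumption, not a requirement:
;; on Emacs 26.1 or later the plain `setq' from the edit above is enough.
(when (boundp 'display-raw-bytes-as-hex)
  (setq display-raw-bytes-as-hex t))
```

With this set, raw bytes are shown as hexadecimal escapes such as \x80 rather than \200.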

wasamasa
  • "No, you can't" is wrong, see [Gilles' suggestion](https://emacs.stackexchange.com/questions/33117/showing-bytes-as-hexadecimal-escapes-rather-than-octal-escapes#comment51136_33117), but +1 anyway for giving a patch to fix this properly. – npostavs May 28 '17 at 21:37
  • Huh, just when I thought you can't hack your way around this one, someone else proves me wrong. Thanks! – wasamasa May 29 '17 at 06:13
  • Oh nice, wonderful! Getting a patch into Emacs is not entirely impossible it appears. :-) Thanks for your work… look forward to this being released in Emacs 26. – ShreevatsaR Jun 08 '17 at 22:17
  • Works great in Emacs 26! Thanks!!! (You may want to edit your answer now.) – Michael Hoffman May 14 '19 at 12:08

I figured it out thanks to the answer by Gilles and the 2010/2011 thread on gnu.emacs.help called “How switch from escaped octal character code to escaped HEX?” (Google Groups, Nabble).

The details of how Emacs displays characters are in the section Display > Text Display (“How Text Is Displayed”) of the Emacs manual (C-h r), and section Display > Character Display of the Emacs Lisp Reference Manual. The thing to do is to change the display table for the characters 128 to 255 (and whatever other characters one wants displayed as hexadecimal escapes).

I had to make two minor changes from the answer by Gilles:

  1. Instead of something like

    (aset standard-display-table 128 [?\\ ?8 ?0])
    

    I had to use something like

    (aset standard-display-table (unibyte-char-to-multibyte 128) [?\\ ?8 ?0])
    
  2. Setting standard-display-table isn't always enough, because some modes (like global-whitespace-mode) may mess it up. And then it appears you need to set buffer-display-table instead.

So instead I made an interactive function that I can invoke when I want the display to change in a specific buffer.

(defun use-hex-not-octal ()
  "Use hexadecimal escape sequences instead of octal in the current buffer."
  (interactive)
  (require 'cl-lib)
  (unless buffer-display-table
    (setq buffer-display-table (make-display-table)))
  ;; Bytes to escape: DEL and the high bytes, plus the control characters
  ;; other than tab (9) and newline (10).  Bound with `let' so that no
  ;; global variable is created as a side effect.
  (let ((unprintable (append (number-sequence 127 255)
                             (number-sequence 0 8)
                             (number-sequence 11 31))))
    (cl-loop
     for x in unprintable
     do (aset buffer-display-table (unibyte-char-to-multibyte x)
              (cl-map 'vector
                      (lambda (c) (make-glyph-code c 'escape-glyph))
                      (format "\\%02x" x))))))

With this, if I open refcard.pdf and run M-x use-hex-not-octal, I get the following, for the same region as in the question:

refcard.pdf with M-x use-hex-not-octal
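If you want to switch back in the same buffer, something along these lines should work. This is a sketch rather than something I rely on daily: the name `use-octal-again` is made up, and it assumes that setting a display-table entry to nil restores the default display for that character, so it would also clear entries that another package had set for the same bytes.

```elisp
;; Hypothetical counterpart to `use-hex-not-octal'.
(defun use-octal-again ()
  "Reset the escape entries set by `use-hex-not-octal' in this buffer."
  (interactive)
  (when buffer-display-table
    ;; Same byte ranges as in `use-hex-not-octal'.
    (dolist (x (append (number-sequence 127 255)
                       (number-sequence 0 8)
                       (number-sequence 11 31)))
      ;; A nil entry means "display this character the default way".
      (aset buffer-display-table (unibyte-char-to-multibyte x) nil))))
```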

ShreevatsaR

You can do it with display tables. This may be a little clumsy and I haven't investigated how this might interfere with packages that use display tables for their own purposes, but the basic use case works.

(require 'cl-lib)
(setq standard-display-table (make-display-table))
(cl-loop
 for x from 128 to 255
 do (aset standard-display-table x
      (cl-map 'vector
          (lambda (c) (make-glyph-code c 'escape-glyph))
          (format "\\%02x" x))))
  • Thanks, this was helpful so I'm accepting this. I had to make some minor changes which are in [my answer](https://emacs.stackexchange.com/questions/33117/showing-bytes-as-hexadecimal-escapes-rather-than-octal-escapes#33129); please take a look and let me know if I should correct anything. – ShreevatsaR May 29 '17 at 03:48

Emacs' hexl mode should do what you want - it's a major mode which provides support for viewing and editing binary files. Use M-x hexl-find-file instead of C-x C-f to visit the file to get started. More details can be found in the Emacs info manual, or at https://www.gnu.org/software/emacs/manual/html_node/emacs/Editing-Binary-Files.html.

stevoooo
  • No I don't want hexl mode: postscript files are mostly text with only occasional binary data, and it's not convenient to switch to hexl-mode and lose much text editing functionality. Let me add a screenshot to the question to clarify. – ShreevatsaR May 28 '17 at 20:27
  • Ah, I know what you mean, but don't know of any easy way that you can change that. I suspect display tables might be involved somewhere... – stevoooo May 28 '17 at 20:46
  • Thanks for your suggestion though. I didn't downvote btw! – ShreevatsaR May 28 '17 at 22:03