6

How can I open a file with a BOM (byte order mark) and ensure that the BOM is gone when I save the file?

Dan
Dan King

3 Answers

5

Check the file's encoding with C-h C RET (describe-coding-system). If there is a byte order mark in the file, you may see something like this in the *Help* buffer:

Coding system for saving this buffer:
  U -- utf-8-with-signature-unix

Instead of -unix, it might say -dos or -mac, and possibly it might start with some variant of utf-16 instead of utf-8.

If you don't want the byte order mark, switch to the corresponding encoding without the -with-signature part, for example utf-8-unix in place of utf-8-with-signature-unix. Use C-x C-m f (set-buffer-file-coding-system) to do this, then save the file.

(Disclaimer: I don't know how long the -with-signature UTF encodings have been part of Emacs. If you run an old Emacs, this may not work for you.)
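
To confirm from outside Emacs that a saved file no longer starts with a byte order mark, a small check like the following may help. This is only a sketch in Python, not anything Emacs provides; the function name `detect_bom` and the labels it returns are made up for illustration, and it only looks for the UTF-8 and UTF-16 marks.

```python
import codecs

# Hypothetical lookup table: known BOM byte sequences and a label for each.
# UTF-32 is deliberately not handled; its little-endian BOM begins with the
# UTF-16 LE one and would be reported as "utf-16-le" here.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_bom(path):
    """Return a label for the BOM at the start of the file, or None."""
    with open(path, "rb") as f:
        head = f.read(4)  # the BOMs checked here are at most 3 bytes long
    for bom, label in _BOMS:
        if head.startswith(bom):
            return label
    return None
```

Calling `detect_bom` on the file before and after changing the coding system and saving should go from a label such as `"utf-8"` to `None`.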

Harald Hanche-Olsen
  • 2
    I got so irritated about the apparent need for `find-file-literally` (see the other answer by me), I decided to dig harder, and found this. Now I am happier. Hope it will be useful to others. See also [this answer](https://stackoverflow.com/a/17864979/1842907). – Harald Hanche-Olsen Oct 17 '17 at 16:33
  • 2
    Sanity check: if emacs tells you that `RET runs the command newline` when you check the file encoding, be sure you used a *capital* C in your `C-h C RET`. – Honore Doktorr May 17 '18 at 15:50
4

Run M-x set-buffer-file-coding-system, hit TAB to get a completion list, select the encoding you want (one without a BOM), then save the file.

Willy Lee
2

A comment by SabreWolfy really should be provided as an answer. As explained in a blog post by Bojan Nikolic, the BOM (which is really the Unicode character U+FEFF ZERO WIDTH NO-BREAK SPACE placed at the beginning of a file) is sometimes used to indicate the file encoding (and the byte order, in the case of UTF-16). When Emacs opens a file with a BOM, it does not show the BOM as part of the buffer, so you can't delete it with ordinary editing commands.

The remedy is to visit the file with M-x find-file-literally. If the encoding was UTF-8, the BOM will show up at the beginning of the file as three binary characters: \357\273\277 (they may look like twelve characters, but there really are only three). Delete those three characters, then save the file.
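
The three-versus-twelve point can be checked outside Emacs. The sketch below uses Python (my choice for illustration; nothing about the Emacs procedure depends on it) to show that the octal escapes \357\273\277 are the three bytes EF BB BF, which is how UTF-8 encodes U+FEFF. The helper `strip_utf8_bom` is a hypothetical name, shown only to make concrete what deleting those three bytes amounts to.

```python
import codecs

# The octal escapes as displayed by find-file-literally:
# three bytes, even though they are drawn with twelve characters.
bom_bytes = b"\357\273\277"

assert len(bom_bytes) == 3
assert bom_bytes == b"\xef\xbb\xbf" == codecs.BOM_UTF8
# Encoding U+FEFF as UTF-8 yields exactly these bytes.
assert "\ufeff".encode("utf-8") == bom_bytes

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```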

Harald Hanche-Olsen
  • Community wiki because I really can't claim any credit for this answer. I don't know if any non-Unicode encodings use BOMs too? If they do, feel free to edit the answer to reflect that. – Harald Hanche-Olsen Oct 17 '17 at 14:02