I am wondering whether current versions of Emacs have any built-in mechanism for guessing whether a file (that my elisp code is about to load, or perhaps has just loaded, into a buffer) is 'binary' rather than 'text'.

This is necessarily a fuzzy question, because there's no definitive dividing line between the two. A typical heuristic for declaring a file to be 'binary' is to check whether the first N bytes of the file contain any ASCII control characters (other than TAB, C-j, C-m, C-l). A less algorithmic heuristic is that if you interactively load a file into a buffer and then immediately realize that you need to put it in hexl-mode to edit it, it's binary.
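
For concreteness, here is a rough sketch of the control-character heuristic I have in mind, written in Python rather than elisp just to pin down the logic. The names `has_control_bytes` and `looks_binary` and the 1024-byte default for N are placeholders of mine, not anything taken from Emacs or from an existing library.

```python
# Hypothetical sketch of the "control characters in the first N bytes" heuristic.
# Function names and the 1024-byte default are assumptions for illustration.

# Control bytes that still count as text: TAB, C-j (LF), C-l (FF), C-m (CR).
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0C, 0x0D}


def has_control_bytes(data: bytes) -> bool:
    """Return True if data contains an ASCII control byte outside the allowed set."""
    return any(
        (b < 0x20 or b == 0x7F) and b not in ALLOWED_CONTROLS
        for b in data
    )


def looks_binary(path: str, nbytes: int = 1024) -> bool:
    """Apply the heuristic to the first nbytes bytes of the file at path."""
    with open(path, "rb") as f:
        return has_control_bytes(f.read(nbytes))
```

One known weakness of this rule: UTF-16 text made up of ASCII characters contains NUL bytes, so it would be classified as binary.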

Most of the files that Emacs can automatically apply a major mode to are text, but not all (the most prominent exception being various image formats). However, files that Emacs does not apply a major mode to still have a good chance of being text.

I'm asking this question because I am writing a batch script that processes a whole bunch of files, and it should not make any changes to files that are binary. Weeding them out by hand would be substantially more work, and more error-prone, than persuading Emacs to figure it out itself.

The best idea I have so far is to port https://github.com/audreyfeldroy/binaryornot to elisp, but if there's something built in already, that would be a lot of unnecessary work.

zwol
  • If you are on Linux, is there anything wrong with the `file` command-line utility? On my Fedora 33 system, `file -e soft ` does a pretty good job. – NickD Oct 17 '21 at 22:22
  • Why do you need to port anything to elisp? Emacs is perfectly capable of running an external program and interpreting the results. – NickD Oct 17 '21 at 22:24
  • Are you using a version control system? For example, in subversion, there are properties that control if a file is binary or not. – Lindydancer Oct 18 '21 at 10:46
