You can write a script that calls file, and use a case statement to check for the cases you are interested in. For example:
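#!/bin/sh
# Print "text" if file(1) describes "$1" as a script or as text, else "binary".
case $(file "$1") in
# "text" must be a separate word; "text," covers e.g. "ASCII text, with CRLF line terminators"
(*script*|*\ text|*\ text\ *|*\ text,*)
    echo text
    ;;
(*)
    echo binary
    ;;
esac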
though of course there may be many special cases which are of interest. Just running strings on a copy of libmagic, I see about 200 cases, e.g.,
Konqueror cookie text
Korn shell script text executable
LaTeX 2e document text
LaTeX document text
Linux Software Map entry text
Linux Software Map entry text (new format)
Linux kernel symbol map text
Lisp/Scheme program text
Lua script text executable
LyX document text
M3U playlist text
M4 macro processor script text
Some use the string "text" as part of a different type, e.g.,
SoftQuad troff Context intermediate
SoftQuad troff Context intermediate for AT&T 495 laser printer
SoftQuad troff Context intermediate for HP LaserJet
Likewise, script could be part of a word, but I see no problems in this case. But a script should check for "text" as a word, not a substring.
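One way to check for whole words is to replace punctuation with spaces before matching. This is only a minimal sketch: describes_text is a hypothetical helper name, and it works on the description string rather than on a file, so it can be tried against the strings listed above.

```shell
#!/bin/sh
# Sketch: decide whether a file(1) description mentions "text" or
# "script" as a whole word. describes_text is a hypothetical helper,
# not something provided by file(1) or libmagic.
describes_text() {
    # Turn every non-alphanumeric character into a space, so that
    # "text," and "text (new format)" still contain the word "text".
    words=$(printf '%s\n' "$1" | sed 's/[^[:alnum:]]/ /g')
    # Pad with spaces so the word may also sit at either end.
    case " $words " in
        *" text "*|*" script "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With this, "Linux Software Map entry text (new format)" is accepted while "SoftQuad troff Context intermediate" is not. A caller could pass it the output of file, for instance describes_text "$(file -b "$1")", assuming your file supports -b to omit the filename.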
As a reminder, file output does not use a precise description which would always have "script" or "text". Special cases are something to consider. A follow-up comment said that the --mime-type option works for .svg files while this approach would not. However, in a test I see these results for svg-files:
$ ls -l *.svg
-r--r--r-- 1 tom users 6679 Jul 26 2012 pumpkin_48x48.svg
-r--r--r-- 1 tom users 17372 Jul 30 2012 sink_48x48.svg
-r--r--r-- 1 tom users 5929 Jul 25 2012 vile_48x48.svg
-r--r--r-- 1 tom users 3553 Jul 28 2012 vile-mini.svg
$ file *.svg
pumpkin_48x48.svg: SVG Scalable Vector Graphics image
sink_48x48.svg: SVG Scalable Vector Graphics image
vile-mini.svg: SVG Scalable Vector Graphics image
vile_48x48.svg: SVG Scalable Vector Graphics image
$ file --mime-type *.svg
pumpkin_48x48.svg: image/svg+xml
sink_48x48.svg: image/svg+xml
vile-mini.svg: image/svg+xml
vile_48x48.svg: image/svg+xml
which I selected after seeing a thousand files show only 6 with "text" in the mime-type output. Arguably, matching the "xml" on the end of the mime-type output could be more useful, say, than matching "SVG", but using a script to do that takes you back to the suggestion made here.
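The same case-statement idea can be applied to the mime-type output. The following is a sketch only: mime_is_text is a hypothetical helper name, and the choice of which types count as "text" (anything under text/, plus types ending in xml) is my assumption, which you would tune for your own files.

```shell
#!/bin/sh
# Sketch: treat a MIME type string as "text" if it is text/* or an
# XML-based type such as image/svg+xml. mime_is_text is a hypothetical
# helper; the list of accepted types is an assumption to be tuned.
mime_is_text() {
    case $1 in
        text/*)      return 0 ;;   # text/plain, text/x-shellscript, ...
        */xml|*+xml) return 0 ;;   # application/xml, image/svg+xml, ...
        *)           return 1 ;;
    esac
}
```

It could be called as mime_is_text "$(file -b --mime-type "$1")" && echo text, again assuming a file which supports -b.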
The output of file requires some tuning in either scenario, and is not 100% reliable (it is confused by several of my Perl scripts, calling them "data").
There is more than one implementation of file. The one most commonly used does its work in libmagic, which can be used from different programs (perhaps not directly from zsh, though python can).
According to File test comparison table for shell, Perl, Ruby, and Python, Perl has a -T file test which it can use to provide this information. But the table lists no comparable feature for zsh.
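Perl's -T test can still be reached from a shell script by running perl. This is a minimal sketch, assuming perl is installed; is_text_perl is a hypothetical helper name. Note that -T is a heuristic based on the first block of the file, so it may not agree with file.

```shell
#!/bin/sh
# Sketch: use Perl's -T file test from the shell.
# is_text_perl is a hypothetical helper; it assumes perl is in PATH.
# Returns 0 if Perl guesses the file is text, nonzero otherwise
# (including when the file does not exist).
is_text_perl() {
    perl -e 'exit((-T $ARGV[0]) ? 0 : 1)' -- "$1"
}
```

For example: if is_text_perl "$1"; then echo text; else echo binary; fi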
Further reading:
file is a standard utility and can run through the file magic for determining file types to the best of its abilities. It can tell most text formats and does a pretty decent job on binary formats. If all you're trying to do is find out if a file is text or not, that's the command you're interested in. – Bratchley Apr 10 '16 at 16:37
file will print, e.g. shell script, for some files I would like classified as "text". Is there a way to get file to print just text or binary? – kjo Apr 10 '16 at 16:48
cut commands. – Bratchley Apr 10 '16 at 17:18
file output to cut is the solution - sure, there's a missing space which makes it fail and that has made most people there address the Y instead of the X but Stéphane's comments and answer show the proper way to determine whether the file is text or not. – don_crissti Apr 10 '16 at 21:15
# more vi yields ******** vi: Not a text file ******** – AbraCadaver Apr 11 '16 at 15:06