I have this:
find . -type f -exec file {} + | grep ASCII
This is the output:
I want to know if it is possible to show the size and the path of every file that file identifies as ASCII.
This uses cut to extract the filename from the output of file | grep ASCII, and then pipes it into xargs stat -c ... to display only the filename and size:
find . -type f -exec file {} + | grep ASCII | cut -d: -f1 | xargs -d'\n' -r stat -c '%n %s'
If you want the size before the filename, use '%s %n' in the stat command.
It will cope with filenames that contain any character except : or newline. It assumes a GNU system (for xargs -d and that stat syntax). It could give false positives if ASCII is present in the file path.
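The caveats above all come from parsing the "name: description" lines that file prints. As a minimal sketch that avoids the parsing, assuming GNU stat (for -c) and a file that supports -b (brief mode, which omits the file name), you can run file on each path inside find, so the path itself is never split or matched. The function name ascii_sizes is made up for this illustration.

```shell
# ascii_sizes [DIR]: print "path size" for every regular file under DIR
# whose brief file(1) description mentions ASCII.
# Hypothetical helper name; assumes GNU stat for the -c option.
ascii_sizes() {
  find "${1:-.}" -type f -exec sh -c '
    for f do
      # -b omits the file name, so ASCII in the path cannot match
      case $(file -b -- "$f") in
        *ASCII*) stat -c "%n %s" -- "$f" ;;
      esac
    done' sh {} +
}
```

Called as ascii_sizes . it handles colons, spaces and newlines in names. The *ASCII* pattern still matches descriptions such as "Non-ISO extended-ASCII text", just as grep ASCII does.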
I would use a shell loop instead. If you are using bash, you can make ** recurse into subdirectories by running shopt -s globstar. As explained in man bash:
globstar
    If set, the pattern ** used in a pathname expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a /, only directories and subdirectories match.
So, with that in mind, you could use the following loop:
shopt -s globstar
for file in **; do
    [ -f "$file" ] && file "$file" | grep -q "ASCII" && stat -c '%n %s' "$file"
done
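One difference from find is worth noting: by default bash globs skip names that start with a dot, so the loop above ignores hidden files. Here is a sketch that includes them, assuming bash 4 or later (for globstar) and GNU stat. It starts a child bash with -O, so the options do not leak into the shell you are working in. The function name ascii_sizes_glob is hypothetical, and file -b is used so that ASCII in a path cannot match.

```shell
# ascii_sizes_glob [DIR]: like the loop above, but also visits hidden files.
# Hypothetical helper name; paths are printed relative to DIR.
ascii_sizes_glob() (
  cd -- "${1:-.}" || exit
  # -O sets a shopt option for the child bash only
  bash -O globstar -O dotglob -O nullglob -c '
    for f in **; do
      if [ -f "$f" ] && file -b -- "$f" | grep -q ASCII; then
        stat -c "%n %s" -- "$f"
      fi
    done'
)
```

nullglob makes the loop run zero times in an empty directory instead of iterating over a literal **.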
shopt -s globstar first. That's what I explain in the first paragraph. – terdon Oct 19 '15 at 15:33
With zsh:
isascii() [[ $(file -b --mime-encoding - < ${1-$REPLY}) = us-ascii ]]
zmodload zsh/stat
zstat -n +size -- **/*(D.L+1+isascii)
Broken down:
**/*: recursive globbing, a feature introduced by zsh in the early 90s and later copied by some other shells like ksh93, fish, bash, yash and tcsh.
(...): glob qualifier, another 90s feature but still unique to zsh as of today. It lets you further specify which files are included in the glob based on file metadata, or change the expanded value. Here:
    D: include Dot (hidden) files.
    .: include only regular files.
    L+1: only bother considering files that are more than 1 byte large (as otherwise file won't tell you anything about them).
    +isascii: call the isascii command for each matching file to decide whether to include the file.
isascii is defined as a function that calls file on $REPLY (that's how file names are passed to functions called by glob qualifiers; functions may modify it or return more files in the $reply array). We use ${1-$REPLY} here so we can also use that function on a file given as argument, and we don't modify $REPLY, just return the decision via the exit status.
With -b and --mime-encoding, file (at least the implementation from libmagic) outputs only the guessed encoding. It's a lot more reliable than calling grep ASCII on the output of file the-file, as ASCII may occur there in the file path or other information extracted from the file.
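The encoding test itself is not tied to zsh. As a rough sketch for other shells, assuming the libmagic file (for --mime-encoding), the isascii function can be written in POSIX sh syntax. This is a port for illustration, not the code from the answer above.

```shell
# isascii FILE: succeed if file(1) guesses the content encoding as us-ascii.
# Reading from stdin (-) keeps the file name out of the output entirely.
isascii() {
  [ "$(file -b --mime-encoding - < "$1")" = us-ascii ]
}

# Example use in the current directory (GNU stat assumed):
#   for f in ./*; do
#     if [ -f "$f" ] && isascii "$f"; then stat -c '%n %s' "$f"; fi
#   done
```

Because the result is compared against the exact string us-ascii, a file named ASCII-something or a UTF-8 text file does not pass the test.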
file … - < filename (which I learned from you). :-) – G-Man Says 'Reinstate Monica' Oct 22 '15 at 03:18
Try this:
for file_name in $(find . -type f -exec file {} + | grep ASCII | awk -F ':' '{print $1}'); do ls -lrth "${file_name}"; done
wc(1) gives characters/words/lines for text files. ls(1) gives the size of files with -l. – vonbrand Oct 18 '15 at 23:59