-3

I have this:

find . -type f -exec file {} + | grep ASCII

This is the output:

That is what is printed in the terminal.

I want to know whether it is possible to show the size and path of every file that file identifies as ASCII.

4 Answers

4

This uses cut to extract the filename from the output of file | grep ASCII, and then pipes it into xargs stat -c ... to display only filename and size:

find . -type f -exec file {} + | grep ASCII | cut -d: -f1 | xargs -d'\n' -r stat -c '%n %s'

If you want the size before the filename, use '%s %n' in the stat command.

It will cope with filenames that contain any character except : or newline. It assumes a GNU system (for -d and that stat syntax). It could give false positives if ASCII is present in the file path.
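If file names may contain colons or newlines, one option is a variant that never parses the output of file at all. The following is only a sketch: it assumes GNU stat and a file implementation that supports --mime-encoding, and list_ascii_sizes is a hypothetical helper name, not part of the pipeline above.

```shell
# Hypothetical helper: print "size path" for every regular file under
# the given directory (default: .) whose encoding file reports as us-ascii.
# Assumes GNU stat (-c) and a file(1) that supports --mime-encoding.
list_ascii_sizes() {
    find "${1:-.}" -type f -exec sh -c '
        for f do
            # -b suppresses the file name, so the path is never part of the test
            enc=$(file -b --mime-encoding "$f")
            if [ "$enc" = us-ascii ]; then
                stat -c "%s %n" "$f"
            fi
        done
    ' sh {} +
}
```

Running list_ascii_sizes . prints one line per matching file, and piping the result into sort -n would order the files by size.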

cas
  • @JoséSá If this answer solved your issue, please take a moment and accept it by clicking on the check mark to the left. That will mark the question as answered and is the way thanks are expressed on the Stack Exchange sites. – terdon Oct 19 '15 at 16:41
2

I would use a shell loop instead. If you are using bash, you can make ** recurse into subdirectories by running shopt -s globstar. As explained in man bash:

globstar
    If set, the pattern ** used in a pathname expansion con‐
    text will match all files and zero or  more  directories
    and  subdirectories.  If the pattern is followed by a /,
    only directories and subdirectories match.

So, with that in mind, you could use the following loop:

shopt -s globstar 
for file in **; do
    [ -f "$file" ] && file "$file" | grep -q "ASCII" && stat -c '%n %s' "$file"
done
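Two things to be aware of with the loop above: by default ** does not match hidden files, and the grep also succeeds when ASCII appears in the file name itself. A sketch of a variant that addresses both, assuming bash 4 or later and GNU stat (list_ascii_globstar is a hypothetical name chosen for illustration):

```shell
# Hypothetical helper; the ( ... ) body runs in a subshell so the
# shopt settings and the cd do not leak into the calling shell.
list_ascii_globstar() (
    # globstar: ** recurses; dotglob: include hidden files;
    # nullglob: expand to nothing if the directory is empty
    shopt -s globstar dotglob nullglob
    cd -- "${1:-.}" || exit 1
    for f in **; do
        # The ./ prefix protects against names starting with "-";
        # file -b omits the file name, so only the description is searched.
        [ -f "./$f" ] && file -b "./$f" | grep -q ASCII && stat -c '%n %s' "./$f"
    done
    exit 0
)
```

Called as list_ascii_globstar some/dir, it prints paths relative to that directory, each followed by the size in bytes.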
terdon
2

With zsh:

isascii() [[ $(file -b --mime-encoding - < ${1-$REPLY}) = us-ascii ]]
zmodload zsh/stat
zstat -n +size -- **/*(D.L+1+isascii)

Broken down:

  • **/* recursive globbing, a feature introduced by zsh in the early 90s and later copied by some other shells like ksh93, fish, bash, yash and tcsh.
  • (...), glob qualifier: another 90s feature but still unique to zsh as of today. It lets you further specify which files are included in the glob based on file metadata, or change the expanded value. Here:
    • D: include Dot (hidden) files
    • .: include only regular files
    • L+1: only consider files larger than 1 byte (as otherwise file won't tell you anything about them)
    • +isascii: calls the isascii command for each matching file to decide whether to include the file.
  • isascii is defined as a function that calls file on $REPLY. That is how file names are passed to functions called by glob qualifiers; such functions may modify it or return more files in the $reply array. We use ${1-$REPLY} here so that the function can also be used on a file given as an argument, and we don't modify $REPLY but just return the decision via the exit status.

    With -b and --mime-encoding, file (at least the implementation from libmagic) outputs only the guessed encoding. It's a lot more reliable than calling grep ASCII on the output of file the-file, as ASCII may occur there in the file path or other information extracted from the file.
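To illustrate the difference in a plain POSIX-style shell, here is a small sketch comparing the two tests. The function names looks_ascii_by_grep and is_ascii_by_encoding are made up for this example, and it assumes a file implementation that supports --mime-encoding.

```shell
# Test 1: grep the full output of file, which includes the path.
looks_ascii_by_grep() {
    file "$1" | grep -q ASCII
}

# Test 2: compare only the guessed encoding, with the path suppressed by -b.
is_ascii_by_encoding() {
    [ "$(file -b --mime-encoding "$1")" = us-ascii ]
}

# Usage sketch: a binary file whose name happens to contain "ASCII".
demo_dir=$(mktemp -d)
printf '\000\001\002\377' > "$demo_dir/ASCII-notes.bin"
if looks_ascii_by_grep "$demo_dir/ASCII-notes.bin"; then
    echo "grep test: reported as ASCII (false positive from the path)"
fi
if ! is_ascii_by_encoding "$demo_dir/ASCII-notes.bin"; then
    echo "encoding test: not reported as ASCII"
fi
rm -rf "$demo_dir"
```

The grep test matches here only because the string ASCII is part of the file name, which is the kind of false positive the encoding comparison avoids.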

0

Try this:

for file_name in $(find . -type f -exec file {} + | grep ASCII | awk -F ':' '{print $1}'); do ls -lrth "${file_name}"; done