My current code is like so:

scan.sh:

#!/bin/bash
while IFS= read -r line;
do
    byte = $(stat -c%s "$line");
    echo "$line : $byte";
done< <(ls *.$1)

The output would be like this:

./scan.sh cpp
./scan.sh: line 4: byte: command not found
arraysum.cpp :
./scan.sh: line 4: byte: command not found
countLines.cpp :
./scan.sh: line 4: byte: command not found
createtext.cpp :
./scan.sh: line 4: byte: command not found
multiproc1.cpp :
./scan.sh: line 4: byte: command not found
myWc.cpp :
./scan.sh: line 4: byte: command not found
test.cpp :

Basically, my script takes one argument (a file extension) and looks in the current directory for files with that extension. I want it to print the name of each file followed by its size in bytes, but I can't seem to get that working.

Li Wang

2 Answers

In the syntax of Bourne-like shells like bash, there must not be any space around the = sign in assignments.

byte=value
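
As a minimal sketch of the difference (the temporary file is a made-up stand-in for one of the .cpp files, and stat -c assumes GNU coreutils, as in the question): with spaces, the shell treats byte as a command name and = as its first argument, which is where "byte: command not found" comes from.

```shell
#!/bin/bash
# Throwaway file for the demonstration (hypothetical; not from the question).
tmp=$(mktemp)
printf 'hello' > "$tmp"

# Wrong: with spaces, the shell tries to run a command named "byte",
# passing "=" and the output of stat as its arguments.
#   byte = $(stat -c%s "$tmp")

# Right: no spaces on either side of "=".
byte=$(stat -c%s "$tmp")
echo "$tmp : $byte"

rm -f -- "$tmp"
```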

Here though, parsing the output of ls is a bad idea.

You can just write it:

#! /bin/sh -
stat -c '%n: %s' -- *."$1"

If you do need a loop, in zsh you can just write:

#! /bin/zsh -
for file in *.$1; do
  stat -c '%n: %s' -- $file
done

Or if you have to use bash:

#! /bin/bash -
shopt -s failglob
for file in *."$1"; do
  stat -c '%n: %s' -- "$file"
done

Here's a simple way to do this that should work with the Bourne shell and its descendants (including bash and ksh), if you don't care too much about the exact output format:

$ for file in *; do if [ -f "$file" ] && [ -r "$file" ]; then wc -c "$file"; fi; done
      23 HEAD
     111 config
      73 description

If you also don't care too much about errors and corner cases (in which case, good luck to you):

$ for file in *; do wc -c $file; done

Notes:

  • If you're writing this with bash or ksh, you're probably better off using (( )) or [[ ]] instead of [ ]. (source) Also, in the notes below, consider using $(wc -c <"$file") instead of `wc -c <"$file"`. (source)

  • -f tests to see if what you're looking at is an ordinary file (not a directory, device, pipe, socket, tty, or generally some weird thing that can't be said to have a size in bytes). -r tests that the file is readable, i.e., that wc has a chance of succeeding; if you're looking at huge files or files that you can't read, use stat as per your original version and Stéphane's answer.

  • The quotes ("$file") are necessary if any of the files have spaces or tabs in them (e.g., a file named my stuff.txt).

  • If you do care about the exact format, you should probably use some combination of `wc -c <"$file"` (which will not print the filename) and echo or echo -n (which will print whatever you'd like).

  • If the files of interest are arguments to a script, in that script use "$@" (explanation).
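
The last two notes can be combined into one rough sketch. The function name print_sizes and the "name : size" format are assumptions chosen to match the output the question asks for; they are not part of any standard tool.

```shell
#!/bin/sh
# print_sizes is a hypothetical helper: for each argument that is a
# regular, readable file, print "name : size-in-bytes".
print_sizes() {
    for file in "$@"; do
        if [ -f "$file" ] && [ -r "$file" ]; then
            # Reading from stdin keeps wc from printing the filename.
            size=$(wc -c <"$file")
            # Some wc implementations pad the count with spaces;
            # arithmetic expansion strips the padding.
            printf '%s : %s\n' "$file" "$(($size))"
        fi
    done
}
```

Inside a script you would call it as print_sizes "$@"; at a prompt, something like print_sizes *.cpp. Note that wc still reads every byte of each file, so stat remains the better choice for large files.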

I agree with @stéphane-chazelas that you shouldn't parse the output of ls; but if you do, you don't need process substitution. You can more simply read the output of the command:

ls | while IFS= read -r blah blah blah

or, if you want to recurse through a directory:

find blah -type f -print | while IFS= read -r blah blah

or better still:

find blah -type f -print0 | xargs -0 blah blah

where -print0 and xargs -0 again properly handle filenames with spaces or tabs (and even newlines).
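
For instance, a recursive version might look like the sketch below. The function name list_sizes is made up for illustration, and the code assumes GNU find, xargs and stat (-r, which skips running stat when nothing is found, and stat -c are GNU extensions).

```shell
#!/bin/bash
# list_sizes is a hypothetical helper: print "name: size" for every
# regular file under the directory given as $1 (default: current directory).
list_sizes() {
    # NUL-delimited names survive spaces, tabs and newlines in filenames.
    find "${1:-.}" -type f -print0 |
        xargs -0 -r stat -c '%n: %s' --
}
```

Calling list_sizes with no argument scans the current directory; list_sizes src would scan a directory named src.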

  • The problem with using wc here is that it will read the whole contents of the file, which can quickly get pretty expensive on large files... Using stat is the correct approach here. – filbranden Oct 07 '18 at 15:24
  • "pretty expensive on large files": I agree (and said so). But, what's a large file? To me, it's hundreds of megabytes. For anything smaller, performance optimization is unnecessary. Since @li-wang appears to be counting bytes in .cpp files, I think wc is appropriate. – user10543 Oct 07 '18 at 19:56