The issue is that your code does cd .. before it is done with all the files in a directory. In general, you don't have to cd into directories to get filenames from them, and moving back and forth between directories inside loops is confusing. It can also lead to strange issues if you redirect output to relative pathnames inside the loops. In this script, you also would not know where (in what directory) a file was found, because you always look in the current directory.

We can fix this by not using cd, and also by not using ls, which additionally allows the script to work with filenames that contain spaces and other unusual characters:
#!/bin/sh

find_smaller () {
    dir=$1
    size=$2

    for pathname in "$dir"/*; do
        if [ -f "$pathname" ]; then
            # this is a regular file (or a symbolic link to one), test its size
            filesize=$( wc -c <"$pathname" )
            if [ "$filesize" -lt "$size" ]; then
                printf 'Found %s, size is %d\n' "$pathname" "$filesize"
            fi
        elif [ -d "$pathname" ]; then
            # this is a directory (or a symbolic link to one), recurse
            printf 'Entering %s\n' "$pathname"
            find_smaller "$pathname" "$size"
        fi
    done
}

find_smaller "$@"
In the code above, $pathname will be not only the filename of the current file or directory that we're looking at, but also its path relative to the starting directory.
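To see the function in action, here is a minimal sketch that defines the same function and runs it on a throwaway directory tree (all the file and directory names below are invented for the demo):

```shell
#!/bin/sh
# The recursive walker from the answer, exercised on a scratch tree.
find_smaller () {
    dir=$1
    size=$2
    for pathname in "$dir"/*; do
        if [ -f "$pathname" ]; then
            filesize=$( wc -c <"$pathname" )
            if [ "$filesize" -lt "$size" ]; then
                printf 'Found %s, size is %d\n' "$pathname" "$filesize"
            fi
        elif [ -d "$pathname" ]; then
            printf 'Entering %s\n' "$pathname"
            find_smaller "$pathname" "$size"
        fi
    done
}

top=$(mktemp -d)
mkdir "$top/sub"
printf 'abc' > "$top/tiny"               # 3 bytes, below the limit
head -c 500 /dev/zero > "$top/sub/big"   # 500 bytes, above the limit
find_smaller "$top" 100
```

Only the 3-byte file is reported, with its full relative path; the 500-byte file in the subdirectory is visited but stays silent.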
Note also the quoting of all variable expansions. Without quoting the $pathname variable, for example, the shell would perform filename globbing if the value contained characters like * or ?.
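A quick sketch of what that globbing looks like in practice (the file names here are invented):

```shell
# An unquoted expansion of a value containing a glob character is
# re-expanded by the shell against the current directory.
cd "$(mktemp -d)" || exit 1
touch xa xb                # two files that the pattern x* matches
pathname='x*'
set -- $pathname           # unquoted: the shell globs, giving two words
unquoted_count=$#
set -- "$pathname"         # quoted: one word, the literal string x*
quoted_count=$#
printf 'unquoted: %d word(s), quoted: %d word(s)\n' \
    "$unquoted_count" "$quoted_count"
```

The quoted expansion always yields exactly the stored string; the unquoted one yields whatever the pattern happens to match at that moment.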
An alternative is using bash and its globstar shell option. With this option set, the ** glob pattern matches all the pathnames beneath the given directory. This means we don't have to explicitly walk the directory structure in our script:
#!/bin/bash

dir="$1"
size="$2"

shopt -s globstar

for pathname in "$dir"/**; do
    [ ! -f "$pathname" ] && continue

    filesize=$( wc -c <"$pathname" )
    if [ "$filesize" -lt "$size" ]; then
        printf 'Found %s, size is %d\n' "$pathname" "$filesize"
    fi
done
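The same loop can be exercised on a throwaway tree; this sketch (invented file names, requires bash for globstar) counts how many small files the ** glob finds:

```shell
#!/bin/bash
# Exercise the globstar loop on a scratch tree with files above
# and below a 100-byte limit.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub/deeper"
printf 'tiny' > "$tmp/small.txt"              # 4 bytes
head -c 1000 /dev/zero > "$tmp/sub/big.bin"   # 1000 bytes
printf 'hi' > "$tmp/sub/deeper/also.txt"      # 2 bytes

shopt -s globstar
found=0
for pathname in "$tmp"/**; do
    [ ! -f "$pathname" ] && continue
    filesize=$( wc -c <"$pathname" )
    if [ "$filesize" -lt 100 ]; then
        printf 'Found %s, size is %d\n' "$pathname" "$filesize"
        found=$((found + 1))
    fi
done
```

Both small files are found, including the one two levels down, without any explicit recursion.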
Rather than writing your own directory tree walker, you may instead use find. The following find command does what your code tries to do:
find /home/161161 -type f -size -100c
As a script:
#!/bin/sh
dir=$1
size=$2
find "$dir" -type f -size -"$size"c
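A quick sanity check of that find invocation against a scratch directory (the file names are made up):

```shell
# find with -size -100c lists only regular files strictly smaller
# than 100 bytes.
dir=$(mktemp -d)
printf 'abc' > "$dir/small"            # 3 bytes
head -c 200 /dev/zero > "$dir/large"   # 200 bytes
find "$dir" -type f -size -100c        # lists only .../small
```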
The only slight difference between the explicit directory walker and the find variation is that find (when used as above) will ignore symbolic links, while the shell function above will resolve symbolic links, possibly causing directory loops to be traversed infinitely, or the same data to be counted more than once.
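If you do want find to follow symbolic links the way the shell function does, the standard -L option enables that. A small sketch (directory names invented):

```shell
# Without -L, find treats a symlinked directory as a symlink and
# does not descend; with -L it follows the link.
dir=$(mktemp -d)
mkdir "$dir/real"
printf 'ab' > "$dir/real/tiny"       # 2 bytes
ln -s "$dir/real" "$dir/link"
find "$dir" -type f -size -100c      # finds tiny once, under real/
find -L "$dir" -type f -size -100c   # follows the link: tiny appears twice
```

Note that -L reintroduces exactly the loop and double-counting risks described above, so use it deliberately.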
When using find, the contents of the files will not be read to figure out the file sizes. Instead, an lstat() library call is made to query the filesystem for each file's size. This is many times faster than using wc -c!
On most (but not all) Unices, you may also use the command line utility stat
to get the file size. See the manual for this utility on your system for how to use it (it works differently on Linux and BSD).
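As a sketch of that difference, the size flag is -c %s with GNU stat on Linux and -f %z with BSD stat, so a portable-ish script has to try both (the fallback logic here is my own, not from any standard):

```shell
# Ask the filesystem for the size instead of reading the file's contents.
f=$(mktemp)
printf 'hello' > "$f"   # 5 bytes
if size=$(stat -c %s "$f" 2>/dev/null); then
    : # GNU coreutils stat (Linux)
else
    size=$(stat -f %z "$f")   # BSD/macOS stat
fi
echo "$size"
```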