0

For the consistency check of a backup program, I want to define a function which counts all files in a directory including all files in subdirs, subsubdirs and so on.

The solution I am trying so far is as follows:

countfiles() {
  local cdir=$1
  local files=$(ls -la $cdir | grep -cv '^[dl]')

  local dirstring=$(ls -la $cdir | grep '^d' | egrep -o ' \.?[^[:space:].][^[:space:]]+$')
  local directories=(${dirstring//"\n"/})

  echo ${directories[@]}


  for dir in ${directories[@]}; do
    echo -n "$dir "
    echo -n 'filecount >> '
    local dirfiles=$(countfiles "$cdir/$dir")
    echo -n $dirfiles
    echo ' <<'
    #files=$(($files+$dirfiles))
  done

  echo $files

}

Which gives me the following output:

.config .i3 .scripts
.config filecount >> gtk-3.0 termite gtk-3.0 filecount >> 2 << termite filecount >> 2 << 1 <<
.i3 filecount >> 5 <<
.scripts filecount >> 2 <<
5

The line that updates my $files counter is commented out at the moment, and I may need to make that variable non-local; for now I have declared all variables as local to avoid any interference.

The directory tree is as follows:

/.scripts/backup_dotfiles.sh
/.config/termite/config
/.config/gtk-3.0/settings.ini
/.i3/config
/.i3/i3blocks.conf
/.i3/lockicon.png
/.i3/lockscreen.sh
/.gtkrc-2.0
/.bashrc
/.zshrc
/.i3
/.Xresources

My questions:

  • Why does it always count the files +1 except for the master directory?
  • Why does it count anything in the '.config' directory, as there are no files in there?
  • How can I fix this?
Jeff Schaller
tifrel
  • Oops. I actually looked into this question, but I did not find the recursion which also handles the files in the subdirectories. But yes, it is a duplicate. – tifrel Jul 08 '17 at 18:55

3 Answers

6

You probably want to just use find. Assuming you don't have files with newlines in their names, just something like this would do:

find "$dir" -type f | wc -l

-type f matches regular files, but not directories, pipes, sockets or whatever.

The usual output of find separates the filenames with newlines, so if any of the names contain newlines, the output will be ambiguous. With GNU find, something like this would work:

find "$dir" -type f -printf . | wc -c

That has find print only a dot for each file, and counts the dots.
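
To see why the newline caveat matters, here is a minimal sketch, assuming GNU find and a scratch directory from mktemp. The function names count_by_lines and count_by_dots and the file names are made up for illustration. With one ordinary file and one file whose name contains a newline, the line-based count comes out one too high, while the dot-based count does not.

```shell
#!/bin/bash
# Sketch only: assumes GNU find (for -printf); function names are made up.

# Count regular files by counting output lines (miscounts names with newlines).
count_by_lines() {
  find "$1" -type f | wc -l
}

# Count regular files by printing one dot per file and counting the bytes.
count_by_dots() {
  find "$1" -type f -printf . | wc -c
}

demo=$(mktemp -d)
touch "$demo/plain" "$demo/with"$'\n'"newline"   # second name spans two lines
echo "by lines: $(count_by_lines "$demo")"
echo "by dots:  $(count_by_dots "$demo")"
rm -rf "$demo"
```

Only the second function is safe for arbitrary file names, at the cost of depending on GNU find.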

Other versions of find don't have -printf but we can use the trick of passing the input path with doubled slashes. They are treated like a single slash, but will not naturally appear otherwise in the output since file names cannot contain slashes. Then count the double-slashes in the output:

find "$dir//" -type f | grep -c //
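
As a rough illustration of the double-slash trick, the sketch below wraps it in a function and runs it on a scratch directory. The name count_by_slashes and the file names are made up, and it assumes the directory path itself contains no newline. Each file contributes exactly one output line containing //, because the continuation lines of a name with embedded newlines cannot contain a slash.

```shell
#!/bin/bash
# Sketch only: count_by_slashes is a made-up name, not part of the answer.
# find prints the starting point as given, so every result starts with
# "<dir>//"; grep -c then counts the lines that contain "//".
count_by_slashes() {
  find "$1//" -type f | grep -c //
}

demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/plain" "$demo/sub/with"$'\n'"newline"
echo "regular files: $(count_by_slashes "$demo")"
rm -rf "$demo"
```

Note that grep -c exits with a non-zero status when the count is 0, which matters if the script runs under set -e.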

If we want to do that purely in a shell script, we can have the shell list the filenames itself, with no need for ls, e.g. in Bash:

#!/bin/bash
files=0
shopt -s dotglob                               # make * match dotfiles too
countfiles() {
        local f
        for f in * ; do
                if [ -f "$f" ] ; then          # count regular files
                        files=$((files + 1))
                elif [ -d "$f" ] ; then        # recurse into directories
                        cd "$f" || continue    # skip directories we cannot enter
                        countfiles
                        cd ..
                fi
        done
}
cd "$1" || exit 1                              # do not count the wrong directory
countfiles
echo "$files"
ilkkachu
1

You can get the number of files in a directory and its subdirectories by using:

 find path_to_directory -type f | wc -l
0

Why does it always count the files +1 except for the master directory?

Because ls -la also prints a summary line such as "total 20" at the top of its output. That line does not start with d or l, so grep counts it. I can see that the "master" directory is off by one as well.

Why does it count anything in the '.config' directory, as there are no files in there?

For the same reason: the "total ..." line printed by ls.
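
The effect is easy to reproduce. The following is a small sketch in which ls_count is a made-up name that mirrors the pipeline from the question, run against an empty scratch directory. The only lines ls -la prints there are the "total" line plus the entries for . and .., and only the first of those survives the grep filter.

```shell
#!/bin/bash
# Sketch only: ls_count is a made-up name mirroring the question's pipeline.
ls_count() {
  ls -la "$1" | grep -cv '^[dl]'
}

empty=$(mktemp -d)
ls -la "$empty"                        # "total N", then the "." and ".." entries
echo "counted: $(ls_count "$empty")"   # non-zero although the directory is empty
rm -rf "$empty"
```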

How can I fix this?

Do not use your script :) Really, it is overcomplicated, and find does this job nicely. Your whole script turns into something like this (if you need the number of files per directory):

find "$yourdir" -type d | while IFS= read -r dir ; do
    echo "$dir == $(find "$dir" -maxdepth 1 -type f | wc -l) files"
done

or just this (if you need the total):

find "$yourdir" -type f | wc -l
rush