
I need to write a shell script that traverses a given directory structure in multiple levels, recognizes the video files scattered in the folders and copies all of them to another single folder. The video files are really Macromedia Flash data files (saved in my browser's cache) and they do not have any particular extension like .MP4 or .FLV, just alphanumeric names.

Now, I managed to make a start:

#!/bin/bash
#This script checks if the given folder contains video file and copies them another folder
filePath=$1                 #the argument(The folder Path)
if [[ -z "$filePath" ]]; then  #empty argument check
    echo "Please input the folder to search"
    exit 1
fi
if [ -d "$filePath" ]; then 
 if [ -L "$filePath" ]; then #check if the input folder is a valid folder
  echo "directory $file does not exist"
  exit 1
 fi 
else  #improper input given by the user     
 echo "$filePath is either not a valid directory or it is a file. Please enter a valid directory address" 
 exit 1
fi
find $filePath type -d | \ #the find here list all the files and directories in the given folder
while read i 
do
    if [[ -d "$i" ]]; then
        idDirectory    #Call the function
    elif [[ -f "$i" ]]; then
#check if its the file in the expected format
#do the copy work
    fi
done
exit 0


idDirectory(){ # A recursive function to go multiple levels of the directory tree
    #the function which tells if the path is directory or file
    #If its a directory then it will call itself, till it gets a file
    #need some help here!!!!!
}


I need to write the recursive function that identifies and traverses directories. I tried this, but it was not successful.

  • In the same script I simply used recursion: when the script meets a directory, it calls itself with the new path. – Eddy_Em Feb 13 '14 at 19:14
  • The Hell!!! Why didn't I think of that. I'm a bad programmer – Aditya Cherla Feb 13 '14 at 19:20
  • @Eddy_Em: Out of curiosity, suppose there are hundreds of folders; when the script calls itself it'll make another instance of itself running in parallel. Is it possible there might be some effect on the system, performance-wise? – Aditya Cherla Feb 13 '14 at 19:25
  • There could be only one problem: running out of PIDs. You will see an error message in that case. But I don't think that you have thousands of nested directories somewhere! – Eddy_Em Feb 13 '14 at 19:27
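
Eddy_Em's self-invocation idea from the comments can be sketched as follows (the script path and demo tree are invented for illustration; this is a sketch, not the asker's final script):

```shell
# Write the sketch to a file so the script can re-invoke itself by path ($0).
cat > /tmp/walk_demo.sh <<'EOF'
#!/bin/sh
# Recurse by self-invocation: a new process is started per directory.
dir=${1:-.}
for entry in "$dir"/*; do
    if [ -d "$entry" ]; then
        "$0" "$entry"                  # the script calls itself with the new path
    elif [ -f "$entry" ]; then
        printf 'file: %s\n' "$entry"   # a real script would test/copy here
    fi
done
EOF
chmod +x /tmp/walk_demo.sh

# Invented demo tree: one file at the top, one two levels down.
mkdir -p /tmp/walk_tree/a/b
: > /tmp/walk_tree/top
: > /tmp/walk_tree/a/b/deep
/tmp/walk_demo.sh /tmp/walk_tree       # prints both files, however deep
```

As the comments note, each directory costs one extra process, which only becomes a problem with pathologically deep nesting.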

2 Answers


Some general shell programming principles:

  • Always put double quotes around variable substitutions (unless you know that you need the unquoted behavior). "$foo" means the value of the variable foo, but $foo outside quotes additionally undergoes word splitting and wildcard expansion.
  • The same goes for command substitutions: "$(foo)".
  • while read i; do … strips leading and trailing whitespace and interprets backslash escapes. Use while IFS= read -r i; do … to process lines exactly.
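
Both points can be seen in a short experiment (the values here are made up for illustration):

```shell
# Unquoted expansion word-splits; quoting preserves the value as one word.
v="two  spaces"
set -- $v;   printf '%s\n' "$#"    # → 2 (split into two arguments)
set -- "$v"; printf '%s\n' "$#"    # → 1 (kept as a single argument)

# Plain `read` eats leading whitespace; IFS= read -r keeps the line intact.
printf '  indented\n' | { read line;         printf '[%s]\n' "$line"; }  # → [indented]
printf '  indented\n' | { IFS= read -r line; printf '[%s]\n' "$line"; }  # → [  indented]
```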

find $filePath type -d is not what you wrote in your comment: find treats type as a path to search (and the deprecated -d as an option). Even the corrected find "$filePath" -type d only lists directories; you want -type f to act on files.

Piping find into while read isn't the easiest or most robust way of executing a shell command for each file. Use the -exec action in find to process files robustly without getting into trouble with file names containing special characters such as whitespace. If you need more than one command, invoke sh to process one file at a time:

find … -exec sh -c 'command1 "$0"; command2 "$0"' {} \;

To speed things up a little, you can use a more complex idiom which groups files to reduce the number of successive shell invocations:

find … -exec sh -c 'for x; do command1 "$x"; command2 "$x"; done' _ {} +
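
To see the batched form at work (the throwaway tree below is invented for the demo):

```shell
# Build a tiny disposable tree with two files.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
: > "$demo/one"
: > "$demo/sub/two"

# One sh instance handles the whole batch: `for x` iterates over the files,
# and the `_` merely fills $0 so the matched names land in $1, $2, ...
find "$demo" -type f -exec sh -c '
    for x; do
        printf "bytes=%s file=%s\n" "$(wc -c < "$x")" "$x"
    done' _ {} +

rm -rf "$demo"
```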

All the checks at the beginning of your script are useless or mostly useless. You don't care if the argument is a symbolic link, for example, and your error message in that case is incorrect.

I assume that your script is called with two arguments, the source directory and the destination directory. You can use mkdir -p to create target directories as needed. It helps to run find on the current directory, to avoid having to do file name manipulation to compute target paths.
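
The path manipulation this avoids reduces to a single parameter expansion; with invented paths:

```shell
# ${var%/*} strips the final /component, leaving the containing directory,
# so a relative source path maps directly onto the destination tree.
src="./cache/ab/cd/video123"   # hypothetical path as find . would print it
dest="/media/flv"              # hypothetical destination directory
printf '%s\n' "${src%/*}"      # → ./cache/ab/cd   (subdirectory to mkdir -p)
printf '%s\n' "$dest/$src"     # → /media/flv/./cache/ab/cd/video123
```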

Call file to check the type of a file based on its content. You may want to tweak the file formats that are accepted.

cd "$1" || exit
find . -type f -exec sh -c '
  # $0 = target directory; $1 = source file
  case "$(file --brief --mime "$1")" in
    video/x-flv\;*)
      mkdir -p "$0/${1%/*}"    # create target subdirectory if needed
      cp -p "$1" "$0/$1";;     # copy the file under the same relative path
  esac
' "$2" {} \;
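
A runnable variant of the same copy-under-the-same-relative-path mechanism, with the file(1) MIME test swapped for a name pattern so the demo needs no real FLV data (all names invented):

```shell
# Disposable source and destination trees.
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/a/b"
echo flv-ish > "$src/a/b/clip"   # stand-in for a cached video file
echo other   > "$src/a/readme"   # should be left behind

cd "$src" || exit
find . -type f -exec sh -c '
    case "$1" in
        */clip)                      # demo stand-in for the MIME check
            mkdir -p "$0/${1%/*}"    # recreate the relative subdirectory
            cp -p "$1" "$0/$1";;     # copy under the same relative path
    esac' "$dst" {} \;

ls "$dst/a/b"                        # → clip
```
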

Here is a script with a recursive function, which does what you were looking for:

#!/bin/bash

if [ "$#" -ne 1 ]; then
   echo "usage: $0 <search_dir>"
   exit 1
fi

# recursive function to walk a directory tree
walktree() {
  # $1 is the function parameter
  local files=$(find "$1" -maxdepth 1 -type f)          # find all files in the current dir

  for file in $files; do                                # iterate through all found files
    if [[ "$(file "$file")" == *video/x-flv* ]]; then   # check the file content
      echo "found one"
      # put your copy code here
    fi
  done

  local subdirs=$(find "$1" -maxdepth 1 -type d)        # find all sub-directories

  for dir in $subdirs; do                               # iterate through all sub-directories
    if [ "$dir" != "$1" ]; then                         # omit the current directory
      walktree "$dir"                                   # call the recursive function
    fi
  done
}

walktree "$1"                                           # first call with <search_dir>

However, note that the unquoted $files and $subdirs loops break on file names containing whitespace; I would still recommend just using find . -type f as explained by Gilles.

rda