Possible Duplicate:
How to run a command when a directory's contents are updated?
I'm trying to write a simple ETL process that would look for files in a directory each minute and, if any are found, load them onto a remote system (via a script) and then delete them.
Things that complicate this: the loading may take more than a minute. To get around that, I figured I could move all files into a temporary processing directory, act on them there, and then delete them from there. Also, in my attempt to get better at command line scripting, I'm trying for a more elegant solution. I started out by writing a simple script to accomplish my task, shown below:
#!/bin/bash
# Claim each input file by moving it out of the watched directory,
# process it, then clean up.
for FILE in $(find /home/me/input_files/ -name "*.xml"); do
    BASENAME=$(basename "$FILE")
    mv "$FILE" "/tmp/processing/$BASENAME"
    myscript.sh "/tmp/processing/$BASENAME" other_inputs
    rm "/tmp/processing/$BASENAME"
done
This script moves each file out of the input directory almost immediately (which stops the duplicate-processing problem), cleans up after itself at the end, and lets the file be processed in between.
However, this is Unix/Linux after all. I feel like I should be able to accomplish all this in a single line by piping and moving things around instead of maintaining a bulky script.
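One way to collapse the script into a single pipeline is find's -exec with a + terminator, which hands batches of filenames to an inline shell. This is a sketch under demo paths (/tmp/demo/*, with echo standing in for myscript.sh so it runs anywhere), not the question's real ones:

```shell
# Demo setup so the pipeline below has something to chew on.
mkdir -p /tmp/demo/input /tmp/demo/processing
touch /tmp/demo/input/a.xml /tmp/demo/input/b.xml

# Claim each file with mv, run the loader, then delete -- all in one find.
find /tmp/demo/input -name '*.xml' -exec sh -c '
  for f; do
    base=$(basename "$f")
    mv "$f" "/tmp/demo/processing/$base" &&
      echo "process $base" &&        # stand-in for: myscript.sh ... other_inputs
      rm "/tmp/demo/processing/$base"
  done' sh {} +
```

The `&&` chaining means a file is only deleted if the load step succeeded; a failed file stays in the processing directory for inspection.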
Also, using parallel to process these files concurrently would be a plus.
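For concurrency, one common pattern is to claim all pending files first (each mv is effectively atomic on the same filesystem), then fan the processing out. This sketch uses xargs -P, which takes the same null-delimited input as GNU parallel would (e.g. `parallel -0 'myscript.sh {} other_inputs && rm {}'`); again /tmp/pdemo/* and echo are demo stand-ins:

```shell
# Demo setup.
mkdir -p /tmp/pdemo/input /tmp/pdemo/processing
touch /tmp/pdemo/input/a.xml /tmp/pdemo/input/b.xml /tmp/pdemo/input/c.xml

# Step 1: claim every pending file so a later cron run can't grab them.
find /tmp/pdemo/input -name '*.xml' -print0 |
  xargs -0 -I{} mv {} /tmp/pdemo/processing/

# Step 2: process up to four files at a time, deleting each on success.
find /tmp/pdemo/processing -name '*.xml' -print0 |
  xargs -0 -P4 -I{} sh -c 'echo "process $1" && rm "$1"' _ {}
```

Passing the filename as `$1` to `sh -c` (rather than splicing `{}` into the command string) avoids shell-injection surprises with odd filenames.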
Addendum: some sort of FIFO queue might be the answer to this as well. Or maybe some other sort of directory watcher instead of a cron. I'm open to any suggestion more elegant than my little script. The only issue is that the files in the "input directory" are touched moments before they are actually written to, so some sort of ! -size 0 test would be needed to pick up only real files.
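The empty-file guard can be folded straight into the find predicates; a sketch under a demo directory (an event-based watcher such as inotifywait from inotify-tools, waiting on close_write events, would sidestep half-written files entirely, but that's a separate tool):

```shell
# Demo setup: one just-touched placeholder and one fully written file.
mkdir -p /tmp/sdemo/input
touch /tmp/sdemo/input/empty.xml            # touched, not yet written to
printf '<x/>' > /tmp/sdemo/input/real.xml   # real content

# -size +0c keeps only files larger than 0 bytes; "! -size 0" works too.
# A "-mmin +0" predicate could additionally require the file to be at
# least a minute old as a settle check.
find /tmp/sdemo/input -name '*.xml' -size +0c
```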