
I have a directory full of images, and I am trying to put together a script/command that will, with some probability, copy each image to a destination under a random name (I might want to copy it in place, so the copy must not collide with the existing file). Complicating matters, the file names contain spaces, and I am working with 30 GB of files.

Here is what I have so far. The spaces in the file names are a killer:

#!/bin/bash

for i in $(find pics/ -type f);  do
        v=$(($RANDOM % 2))
        if [ $v -eq 0 ]; then
                cp $i dups/$RANDOM.jpg;
        fi
done

I would eventually like something like:

./rcp.sh source/ destination/

I have looked at shuf, but it doesn't get me past my spaces-in-file-names issue either. Maybe there is a way to take this script and have it also do the shuffle?

3 Answers


The way to handle file names containing spaces is to use the -print0 directive of GNU find together with the -d option of bash's read command. It's also imperative to quote the "$variable".

find pics/ -type f -print0 | while IFS= read -rd "" filename; do
    v=$((RANDOM % 2))
    if (( v == 0 )); then
        cp "$filename" dups/$RANDOM.jpg
    fi
done

The IFS= and -r bits are to ensure that spaces and backslashes are handled properly by the read command.

Inside (( ... )) arithmetic expressions, you can give shell variables without the $.
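To see why the for loop in the question breaks, here is a small sketch that counts how many items each approach sees in a directory. The function names count_split and count_nul are made up for illustration, and it assumes bash plus a find that supports -print0.

```shell
#!/bin/bash

# Counts items the way the question's loop does: the output of find is
# word-split on whitespace, so "a b.jpg" becomes two separate words.
count_split() {
    local n=0 i
    for i in $(find "$1" -type f); do
        n=$((n + 1))
    done
    echo "$n"
}

# Counts items using NUL-delimited names: each file name arrives intact,
# whatever spaces or backslashes it contains.
count_nul() {
    local n=0 f
    while IFS= read -rd '' f; do
        n=$((n + 1))
    done < <(find "$1" -type f -print0)
    echo "$n"
}
```

For a directory pics holding the single file "a b.jpg" (and no whitespace in the directory path itself), count_split pics reports 2 while count_nul pics reports 1.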

glenn jackman

RAND_FILE=$( find pics/ -type f -print0 | shuf -n 1 -z )
# TODO check that RAND_FILE actually got a file, e.g. what
# if pics/ dir is empty, what happens?
cp "$RAND_FILE" ...

Hard linking instead of copying would save space, provided the destination is on the same filesystem and the duplicate file will not be modified.
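A minimal sketch of the empty-directory check and the hard-link idea, assuming bash and GNU find and shuf (for -print0 and -z). The function name pick_and_link and its two arguments are hypothetical. Reading the name with read -rd '' rather than a command substitution avoids having to store the NUL terminator in a variable, and read's exit status doubles as the "did we get a file" check.

```shell
#!/bin/bash

# Hypothetical helper: pick one random file under "$1" and hard-link it to "$2".
pick_and_link() {
    local src=$1 dest=$2 file
    # read returns non-zero when find produced no names at all
    if ! IFS= read -rd '' file < <(find "$src" -type f -print0 | shuf -z -n 1); then
        echo "no files found under $src" >&2
        return 1
    fi
    # ln fails if dest is on another filesystem (or already exists);
    # in that case fall back to a plain copy, which overwrites dest.
    ln -- "$file" "$dest" 2>/dev/null || cp -- "$file" "$dest"
}
```

It could be called as pick_and_link pics/ dups/some-name.jpg. Keep in mind that a hard link shares its data with the original, so editing one changes both.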

thrig

Based on Glenn's input, I have:

#!/bin/bash

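# Create the destination directory if it does not exist yet.
if [ ! -d "$2" ]; then
    mkdir -p "$2"
fi

# Quote $1 and $2 so that directory names with spaces survive too.
find "$1"/ -type f -print0 | while IFS= read -rd "" filename; do
    v=$((RANDOM % 4))
    if (( v == 0 )); then
        cp "$filename" "$2/$(uuidgen).jpg"
    fi
done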

I added uuidgen because $RANDOM didn't supply a large enough number space to avoid collisions. One way to make this script better would be to take the probability as a percentage rather than only as a fraction 1/n (the test $RANDOM % n == 0 succeeds for only about 1 in n values).
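A rough sketch of that percentage idea, assuming bash and that uuidgen is installed. The function names chance and rcp, the optional third argument, and its default of 25 are all made up for illustration.

```shell
#!/bin/bash

# Succeeds with roughly "$1" percent probability.
# RANDOM ranges over 0..32767, which is not a multiple of 100,
# so the result is very slightly biased toward low values.
chance() {
    (( RANDOM % 100 < $1 ))
}

# Hypothetical interface: rcp source/ destination/ [percent]
rcp() {
    local src=$1 dest=$2 pct=${3:-25} filename
    mkdir -p -- "$dest"
    find "$src" -type f -print0 | while IFS= read -rd '' filename; do
        if chance "$pct"; then
            # Each selected file is copied under a fresh UUID-based name.
            cp -- "$filename" "$dest/$(uuidgen).jpg"
        fi
    done
}
```

Ending the script with rcp "$@" would give the ./rcp.sh source/ destination/ interface from the question, with an optional percentage as a third argument. A value of 0 copies nothing and 100 copies every file.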