
I have a need where I must work with a set of files based on time elapsed (as I do not have control over when new files are added to the remote directory).

I must do several operations with this set of files, therefore I am attempting to store find results in a variable (so I can be sure elapsed candidates don't change), then use the same list (without executing find again) for each operation.

This is a stripped down version of what I'm attempting.

FILELIST=$(ssh ${SSHUSER}@${SSHHOST} "cd ${REMOTEPATH}; find -amin +${TIME} -type f,d -print")

if [[ ${#FILELIST} -lt 4 ]]; then
    echo "No files" >> ${LOGFILE}
    exit 1
fi

echo -en "Copying remote files to remote backup\n" >> ${LOGFILE}
ssh ${SSHUSER}@${SSHHOST} "cd ${REMOTEPATH}; cp -Rpv '${FILELIST}' ${REMOTEBACKUPPATH}" >> ${LOGFILE}

This of course gives me cp: missing destination file operand after '' which I understand. I simply don't know how to get the file list formatted in a manner that cp will like it.

I have worked on this a while, read documentation / SE posts, and cannot figure it out. I am hoping once I know what I'm doing wrong with cp I can also apply that knowledge to rsync.

I'm new to the bash scripting arena, if that is not already clear. :)

Thanks in advance.

Spot
  • Is it not possible to use the -exec option of find here? That would get rid of the intermediate step. – AdminBee Nov 18 '20 at 17:11
  • @αғsнιη Yeah, you're right. That is a change I made right before posting here (while I was testing something) and neglected to change it back. I've updated the post. – Spot Nov 18 '20 at 18:38
  • @AdminBee I have several different operations to do with this exact list of files, so I need to execute one find call, then run each operation using that same list. – Spot Nov 18 '20 at 18:39
  • Create a shell script on the remote host that does what you need, and just invoke that via ssh. – Andrew Henle Nov 18 '20 at 20:37

1 Answer

bash is not the best shell to handle that. I'd recommend zsh instead (you're already using zsh syntax by not quoting some of your parameter expansions):

filelist=(
  ${(0)"$(
    ssh $SSHUSER@$SSHHOST "
      cd -- ${(qq)REMOTEPATH} &&
        find . -amin +${TIME} -type f,d -prune -print0"
  )"}
)

if (( $#filelist < 4 )); then
  print No files >> $LOGFILE
else
  print "Copying remote files to remote backup" >> $LOGFILE
  print -rNC1 -- $filelist |
    ssh $SSHUSER@$SSHHOST "
      cd -- ${(qq)REMOTEPATH} &&
        xargs -r0 cp -Rpvt ${(qq)REMOTEBACKUPPATH}"
fi

(assuming $SSHHOST is a GNU system (you were already using some GNUisms yourself like f,d, no file passed to find, cp -v), and the login shell of $SSHUSER there is Bourne-like).

I've added -prune to the find command because you're going to copy those whole directories anyway, so there's no point in looking for more files inside them. Reading the contents of those directories would also update their last access time.
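To make the effect of -prune concrete, here is a small local sketch. It assumes GNU find and GNU touch; the temporary directory, the names old_dir and inner, and the backdate helper are made up for the demonstration. The access times are reset before each run because, depending on mount options, listing a directory may refresh its access time.

```shell
#!/bin/sh
# Sketch only: compares find output with and without -prune.
# Assumes GNU find (-amin, -type f,d) and GNU touch (-d '2 hours ago').

demo=$(mktemp -d)                 # hypothetical scratch directory
mkdir "$demo/old_dir"
echo data > "$demo/old_dir/inner"

# Hypothetical helper: set the access times to two hours in the past.
backdate() {
    touch -a -d '2 hours ago' "$demo/old_dir" "$demo/old_dir/inner"
}

# Without -prune, find descends into old_dir and also lists its contents.
backdate
without_prune=$(cd -- "$demo" && find . -amin +60 -type f,d -print | sort)

# With -prune, a matching directory is reported but not entered.
backdate
with_prune=$(cd -- "$demo" && find . -amin +60 -type f,d -prune -print | sort)

printf 'without -prune:\n%s\n' "$without_prune"
printf 'with -prune:\n%s\n' "$with_prune"
```

Since cp -R copies a directory together with everything in it, the pruned list is enough for the copy.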

In any case, note that reading a file in a directory does not update the access time of that directory. The access time of a directory is updated when its own contents are read, for example when you use ls/find or shell globs in it (also note that these days, access times are not always maintained thoroughly). So it's not clear what you're trying to do exactly.

Above we use:

  • filelist=( values ) to define an array variable (where filelist=value would define a scalar variable instead).
  • ${(0)"$(cmd)"} splits the output of cmd on NULs. A NUL-delimited list is the safe way to pass a list of file paths, as NUL is the only character not allowed in a file path.
  • cd -- ${(qq)REMOTEPATH}, with -- to guard against values of $REMOTEPATH that start with - (not - itself though nor -4/+3 in some shells), and the value of $REMOTEPATH (which is expanded by the local shell here) quoted (using single quotes with qq, the only safe way to do it) for the remote shell (assuming it's Bourne-like, that wouldn't work for some values of $REMOTEPATH if the login shell of $SSHUSER is csh/fish/rc...).
  • note the && which makes sure the next command is only run if cd was successful as otherwise you'd be running it in the wrong directory.
  • -print0: print the file paths NUL delimited (see above).
  • Then $#filelist will contain the number of matching files (which we compare to 4 like in your original code, not sure why).
  • Then we need to send that list back to the remote host. Instead of passing it as arguments in the shell code that is sent there for the remote shell (for which we'd run into all sorts of problems and limitations), we're passing it NUL delimited (with print -N) on ssh stdin, which is inherited by xargs there.
  • xargs reads that list and transforms it to arguments for cp -t target (-t being also a GNU extension). xargs will run as many cps as necessary avoiding the argument size limit.
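If zsh is not an option, the same round trip can be approximated in plain sh by keeping the NUL-delimited list in a file instead of an array. This is a different technique from the zsh code above, not a translation of it, and it is only a sketch. The remote function is a hypothetical stand-in that runs the command locally so the example is self-contained (a real script would call ssh there), sq is a hypothetical quoting helper playing the role of ${(qq)...}, and the temporary directories stand in for the remote paths. GNU find, touch, cp and xargs are assumed.

```shell
#!/bin/sh
# Sketch only. `remote` is a hypothetical local stand-in; in a real script:
#   remote() { ssh "$SSHUSER@$SSHHOST" "$1"; }
remote() { sh -c "$1"; }

# Hypothetical helper: single-quote a string for a Bourne-like remote shell.
sq() { printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"; }

# Demo data (assumption: local temporary directories replace the remote ones).
work=$(mktemp -d)
REMOTEPATH="$work/remote dir"
REMOTEBACKUPPATH="$work/backup dir"
mkdir -p "$REMOTEPATH/old dir" "$REMOTEBACKUPPATH"
echo data > "$REMOTEPATH/old file"
echo data > "$REMOTEPATH/old dir/inner"
echo data > "$REMOTEPATH/new file"
touch -a -d '2 hours ago' "$REMOTEPATH/old file" "$REMOTEPATH/old dir"
TIME=60

# Run find exactly once and keep its NUL-delimited output in a local file.
list=$(mktemp)
remote "cd -- $(sq "$REMOTEPATH") &&
    find . -amin +$TIME -type f,d -prune -print0" > "$list"

# Each path ends with one NUL, so counting NUL bytes counts the paths.
count=$(tr -cd '\000' < "$list" | wc -c)

if [ "$count" -eq 0 ]; then
    echo "No files"
else
    # Every later operation reuses the same list by reading it on stdin.
    remote "cd -- $(sq "$REMOTEPATH") &&
        xargs -r0 cp -Rpt $(sq "$REMOTEBACKUPPATH")" < "$list"
fi
```

Cleanup of the temporary files is left out. In bash 4.4 or later, mapfile -d '' -t filelist < "$list" would load the same file into an array, and printf '%s\0' "${filelist[@]}" would write it back out. For rsync, the --files-from=- and --from0 options read a NUL-delimited list from stdin; note that with --files-from, -a no longer implies -r, so recursion into listed directories has to be requested explicitly.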
  • Ah well this is definitely something I need to lookup a few things to really understand. Thank you! – Spot Nov 18 '20 at 19:25