I'm hoping to clean up about 20TB on my NAS using rsync on Linux, excluding any directory (and all of its contents) that contains a ".protect" file.
I generate really large caches in subfolders like:
cache/simulation_v001/reallybigfiles_*.bgeo
cache/simulation_v002/reallybigfiles_*.bgeo
cache/simulation_v003/reallybigfiles_*.bgeo
If a file like cache/simulation_v002/.protect exists, then I'd like to build an rsync operation that moves all folders to a temporary /recycle location, excluding cache/simulation_v002/ and all its contents.
I've done something like this before with Python, but I'm curious to see if the operation can be simplified with rsync or another method.
rsync alone can't do this - but you could use find to construct an exclude file for rsync, e.g. starting with something like:
find . -name .protect -printf '%h/***\n'
– cas Jul 31 '19 at 03:32
./simulation_v002/*** but this will then still end up including files it shouldn't:
rsync -a -m --remove-source-files --exclude-from='cache/exclude_list.txt' cache/ cache_trash
Is it possible for find to generate simulation_v002/*** instead? – openCivilisation Aug 02 '19 at 13:31
sed -e 's=^\./=='. Don't expect one tool to do everything - it's normal to combine multiple small tools to achieve a desired result, each tool being good at its own job: find to get the list of files, sed to transform it into the required format, rsync to do the copy. – cas Aug 02 '19 at 14:28
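Putting the comments together, a minimal sketch of the find, sed, rsync combination could look like the following. It assumes GNU find (needed for -printf) and rsync are installed. The function names build_exclude_list and move_unprotected and the temp-file handling are illustrative only and not from the thread.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (for -printf) and rsync are available.
# Function names are hypothetical, chosen for illustration.

# Print one rsync exclude pattern per directory that contains a .protect file.
# $1 is the cache root; patterns are relative to it, e.g. simulation_v002/***
build_exclude_list() {
    (
        cd "$1" || exit 1
        # %h is the directory holding the match; sed strips the leading "./"
        find . -name .protect -printf '%h/***\n' | sed -e 's=^\./=='
    )
}

# Move everything that is not protected from $1 into $2.
move_unprotected() {
    src=$1
    dest=$2
    list=$(mktemp) || return 1
    build_exclude_list "$src" > "$list"
    # Same rsync options as in the comment above, with the generated list.
    rsync -a -m --remove-source-files --exclude-from="$list" "$src"/ "$dest"/
    status=$?
    rm -f "$list"
    return "$status"
}
```

With these defined, something like move_unprotected cache /recycle/cache would do the move. It may be worth running build_exclude_list cache on its own first to check the patterns. Note that --remove-source-files only removes files, so the emptied directories stay behind in the source.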