Here is a simple two-step process to do exactly this.
First, use `find` to generate the list of all files that should end up archived, and `sed` to reduce each file name to its archive name. Filter the output through `sort` and `uniq` so that each archive name is listed exactly once. For example:
find . -name '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_*' -printf '%f\n' | sed -e 's|_.*$||g' | sort | uniq
Note that we use the `%f` format above to get the file names only, not the full paths.
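To see what this first step produces, here is a small sketch run against a throwaway directory. The sample file names are made up for illustration, and it assumes GNU `find` (for `-printf`), just like the command above.

```shell
# Throwaway demo directory with hypothetical sample files.
demo=$(mktemp -d)
touch "$demo/2016-03-30_app.log" "$demo/2016-03-30_db.log" \
      "$demo/2016-03-31_app.log" "$demo/notes.txt"

# Same pipeline as above, pointed at the demo directory.
names=$(find "$demo" -name '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_*' -printf '%f\n' \
        | sed -e 's|_.*$||' | sort | uniq)
printf '%s\n' "$names"
# Prints:
# 2016-03-30
# 2016-03-31

rm -rf "$demo"
```

Each date shows up once, no matter how many files share it, and `notes.txt` is ignored because it does not match the pattern.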
Next, we pipe that through a small bash loop that reads each archive name, uses `find` again to find all log files belonging to it, and pipes that list to `tar` to generate the archive.
When running such commands, I like to ensure we are using the C/POSIX locale (no localized error messages or other formatting). That is done by setting the `LANG` and `LC_ALL` environment variables to `C`. So, the entire command sequence I'd use is
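export LANG=C LC_ALL=C
# List the distinct archive names (the date prefixes), one per line.
find . -name '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_*' -printf '%f\n' | sed -e 's|_.*$||g' | sort | uniq | while IFS= read -r NAME ; do
    # Archive every log file that carries this prefix.
    find . -name "${NAME}_*.log" -printf '%p\n' | tar -cJf "${NAME}.tar.xz" -T - --no-unquote
done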
The `-J` parameter in `-cJf` refers to XZ compression (it compresses very well, although it is slower than gzip; you probably do want that); I like to read `-cJf` as "create XZ archive file". The `-T -` means the names of the files for each archive are read from standard input, and `--no-unquote` means those file names are taken as raw, not quoted.
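Before running the loop for real, it can be reassuring to preview which files would land in which archive. The following is only a sketch: `preview` is a hypothetical helper name, it swaps `tar` for `printf` so nothing is created, and it assumes GNU `find` as above.

```shell
# Dry run: show what each archive would contain, without creating anything.
preview() {
    # $1 is the directory to scan; defaults to the current directory.
    dir=${1:-.}
    find "$dir" -name '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_*' -printf '%f\n' |
    sed -e 's|_.*$||' | sort | uniq |
    while IFS= read -r NAME ; do
        printf '%s.tar.xz would contain:\n' "$NAME"
        # Same selection as the real loop, but printed instead of archived.
        find "$dir" -name "${NAME}_*.log" -printf '  %p\n' | sort
    done
}
```

Running `preview .` in the log directory prints one heading per archive followed by the indented paths of its members. If the listing looks right, the real loop should pick up the same files.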
Note that the pattern of the archive names is very suitable for globbing here. (That is, we can supply it to `find -name ...` as is.) If the pattern contained `*`, `?`, `[`, or `]`, we'd need to escape them. Doable, but annoying. The OP has chosen the filename pattern extremely well, in my opinion.
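For completeness, here is a rough sketch of what that escaping could look like if the names did contain glob characters. `escape_glob` is a hypothetical helper, not something the commands above need; it puts a backslash in front of each character that `find -name` treats specially, including the backslash itself.

```shell
# Hypothetical helper: backslash-escape the characters that find -name
# treats specially (*, ?, [, ], and the backslash itself).
escape_glob() {
    printf '%s\n' "$1" | sed -e 's/[][*?\\]/\\&/g'
}

# Example with a made-up name containing glob characters:
escape_glob 'build[1]*'
# Prints: build\[1\]\*
```

Inside the loop, the inner search would then become `find . -name "$(escape_glob "$NAME")_*.log"`, so only the trailing `_*.log` part acts as a wildcard.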
`2016`, followed by a `-`, which is followed by a 2 digit month, followed by a 2 digit day of the month, followed by string_log. Otherwise, you cannot filter out the files that you want. Otherwise your question is too vague – MelBurslan Apr 01 '16 at 13:36