I have a folder with several files. These files are either .xml or .zip files.
The .zip files contain .xml and/or .zip files. Those nested .zip files in turn contain .xml or .zip files, and so on, until we finally reach .xml files.
In other words, I can have several "levels" of zip before finding my .xml files (cf. example below).
My requirement is to detect which root ZIP files contain at least one XML file that is bigger than 100Mb.
When a ZIP file is in that case, it should be moved to another directory (let's say ~/big-files).
Also, if a non-zipped .xml file is bigger than 100Mb, it should be moved to this directory as well.
For example:
foo1.xml
foo2.xml
baz.xml [MORE THAN 100Mb]
one.zip
  +- foo.xml
  +- bar.xml [MORE THAN 100Mb]
  +- foo.xml
two.zip
  +- foo.xml
  +- zip-inside1.zip
  |   +- bar.xml [MORE THAN 100Mb]
  +- foo.xml
three.zip
  +- foo.xml
  +- zip-inside2.zip
  |   +- zip-inside3.zip
  |       +- foo.xml
  |       +- bar.xml [MORE THAN 100Mb]
  +- foo.xml
four.zip
  +- foo.xml
  +- zip-inside1.zip
      +- foo.xml
In this example, baz.xml, one.zip, two.zip and three.zip should be moved to ~/big-files, as each hosts at least one XML file bigger than 100Mb. four.zip should not be moved.
How can I achieve that in a bash shell?
Thanks.
find does not look inside zip files. You will need to write a script to do this in a more powerful language (Python, Ruby, Perl, etc.) – James Youngman Jul 06 '12 at 14:47
-exec sh -c "unzip -l $@ ... | grep" xxx. That's a (small) shell script. – James Youngman Jul 07 '12 at 17:09
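Following the suggestion above to use a more powerful language, here is a minimal Python sketch of the recursive check, not a definitive solution. It assumes that "100Mb" means 100 MiB of uncompressed size, that file types can be told apart by extension, and that nested archives are small enough to be read into memory. The names zip_has_big_xml, move_big_files and LIMIT are made up for this illustration.

```python
import io
import shutil
import zipfile
from pathlib import Path

# Assumption: "100Mb" in the question means 100 MiB, uncompressed.
LIMIT = 100 * 1024 * 1024


def zip_has_big_xml(fileobj, limit=LIMIT):
    """Return True if the archive, or any archive nested inside it,
    holds an .xml entry whose uncompressed size exceeds limit."""
    with zipfile.ZipFile(fileobj) as zf:
        for info in zf.infolist():
            name = info.filename.lower()
            if name.endswith(".xml") and info.file_size > limit:
                return True
            if name.endswith(".zip"):
                # Nested archive: load it into memory and recurse.
                nested = io.BytesIO(zf.read(info))
                if zip_has_big_xml(nested, limit):
                    return True
    return False


def move_big_files(src, dest, limit=LIMIT):
    """Move big .xml files, and root .zip files that contain one,
    from src to dest. Returns the names of the moved files."""
    src, dest = Path(src), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materialises the listing before anything is moved.
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".xml":
            big = path.stat().st_size > limit
        elif suffix == ".zip":
            big = zip_has_big_xml(path, limit)
        else:
            continue
        if big:
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return moved
```

Calling move_big_files(".", Path.home() / "big-files") from the folder in question would then move the matching files. Only the root archive is moved; nested archives are inspected but never extracted to disk. A corrupt archive raises zipfile.BadZipFile, which this sketch does not handle.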