
With this command, I'm trying to recursively unzip the archives in a directory and its sub-directories while recreating their directory structure under the current working directory.

find ../backUp/ -name "*.zip" | 
  xargs -P 5 -I fileName sh -c '
    unzip -o -d "$(dirname "fileName")/$(basename -s .zip "fileName")" "fileName"
  '

But when I run it, all the unzipped folders stay in the original directory. Can we hard-code basename and dirname in the bash environment?

Edit: adding an example:

/backUp/pic1/1.zip
/backUp/pic2/2.zip
/backUp/pic3/3.zip

Goal:
/new/pic1/1-1.png /new/pic1/1-2.png
/new/pic2/2-1.png /new/pic2/2-2.png
/new/pic3/3-1.png
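
For illustration, a plain loop like this (an untested sketch in bash; it assumes the archives live under ../backUp and that /new exists and is writable) would produce that layout, but I'd like to get the same result with find/xargs:

find ../backUp -name '*.zip' -print0 |
  while IFS= read -r -d '' zip; do
    # path of the zip relative to ../backUp, e.g. pic1/1.zip
    rel=${zip#../backUp/}
    # recreate the containing directory under /new and extract into it
    mkdir -p "/new/$(dirname "$rel")"
    unzip -o -d "/new/$(dirname "$rel")" "$zip"
  done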

Maxfield
  • Side note: embedding fileName in shell code after xargs -I fileName is like embedding {} after xargs -I {}. Don't. – Kamil Maciorowski Feb 21 '22 at 05:04
  • Is your question, Can I change "$(dirname "fileName")" to "hard-coded-name"? – ctrl-alt-delor Feb 21 '22 at 09:25
  • So if there's a zip file foo/bar.zip containing a file dir/hello.txt, should this extract the file to foo/bar/dir/hello.txt? – ilkkachu Feb 21 '22 at 09:42
  • Your question is not clear to me. Please edit it and show an example: what .zip files in what directories do you have, what files are in the .zip files, what result do you get, and what do you want to get instead? Or what else is your question about basename and dirname? It would be possible to replace the dirname/basename combination with other commands. – Bodo Feb 21 '22 at 09:54
  • BTW: running 5 unzips in parallel is probably not going to give you the performance improvement you expect. While decompression algorithms are computationally intensive and do benefit from parallel execution, actually reading the .zip files and writing the files they contain to disk will not benefit at all from parallelisation....in fact, it will run slower due to the added contention for disk I/O. This will be the case even on SSDs and NVME drives, but especially so on HDDs. I/O is the bottleneck for a job like this, not CPU power. – cas Feb 21 '22 at 11:38
  • @Bodo, just added an example. Thank you. – Maxfield Feb 21 '22 at 21:49

1 Answer


You need to use find's -execdir option. This will cd, in turn, to each directory containing matching files ("*.zip") and execute the command you specify, passing the matching filenames in that particular directory as arguments to the command.

e.g. something like:

find . -name '*.zip' -execdir sh -c 'for f; do unzip -o "$f"; done' find-sh {} +

Note: since we're using sh -c as the command, we need to pass it a dummy first (zeroth) argument. That argument becomes $0, the name the shell uses when it reports errors (here, I'm using find-sh as the name). The remaining args, {}, are the filenames. These can be accessed in the sh -c script in the usual way, as $1, $2, $3, $@, $*, etc. In this example, I'm iterating over them all with for f; do ... ; done, which is shorthand for for f in "$@"; do ... ; done.
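
To see the dummy-argument mechanism on its own (a standalone illustration, not tied to find):

sh -c 'echo "zeroth arg: $0"; echo "remaining args: $@"' find-sh a.zip b.zip
# zeroth arg: find-sh
# remaining args: a.zip b.zip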

BTW, if you want to see how it works, you can use echo instead of unzip. e.g. the following will print the name of each directory that contains at least one .zip file, followed by a colon and the list of .zip files in that directory. Output will be one line per directory.

find . -name "*.zip" -execdir sh -c 'echo "$PWD:" "$@"' find-sh {} +
cas
  • NOTE: the -execdir option requires either the GNU or BSD version of find. These are standard on Linux distros (except tiny distros using busybox's implementation of find), the various BSD distributions and, IIRC, Macs. This option is not part of the POSIX spec for find and probably won't work on ancient or proprietary versions of unix (unless GNU or BSD find has been installed). – cas Feb 21 '22 at 11:29
  • checkdir error: cannot create [img] Input/output error – Maxfield Feb 23 '22 at 05:47
  • do you have write perm on the dir or the file? does the file already exist but is owned by another uid? is there enough free space and inodes in the fs? – cas Feb 23 '22 at 07:09
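
Those checks can be done with standard commands, run from the directory unzip is extracting into (a quick sketch; the file name is a placeholder from the question's example):

# directory ownership and permissions
ls -ld .
# does the target file already exist, and who owns it?
ls -l 1-1.png
# free space and free inodes on this filesystem
df -h .
df -i .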