12

I have this:

-rw-r--r--  1 user user 36166999908 Jan 29  2022 tmp.archive.part1.zip
-rw-r--r--  1 user user  5579574562 Jan 29  2022 tmp.archive.part2.zip
-rw-r--r--  1 user user  5097536636 Jan 29  2022 tmp.archive.part3.zip
-rw-r--r--  1 user user 10612382236 Dec 29 02:19 tmp.archive.part4.zip 
                          G  M  k    

So these ZIP files are roughly 36 GB, 5 GB, 5 GB, and 10 GB in size. All of them are past the 2^32-byte (4 GB) maximum that I read about somewhere. Supposedly "zip64" allows sizes up to 2^64 bytes, but I don't know which format I have. zip -h says:

Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008). Usage: ...

and file tells me:

file tmp.archive.part1.zip
tmp.archive.part1.zip: Zip archive data, at least v1.0 to extract

so how can that be?

I do notice that zipmerge completely fails to operate with these files.

My problem is, I need to combine these zip files into one (if possible) and do so without actually extracting them (there is not enough space, and there is a file count quota, on the system they are on). I tried a zip2tar python script which someone posted here in answer to another question, but that also fails. These tools don't like the files: they say it's not a zip file, or just crash with a core dump.

If these zip files were created with the zip 3.0 which I showed, then is there perhaps a better zipmerge or something which will not choke on the size?

  • 6
    It would be very, very unlikely that a current version of a compression program couldn't have archives > 4GB. If I decide to archive my video folder then I expect that to work, even if it is 200GB. – gnasher729 Dec 30 '22 at 00:35

3 Answers

28

Because there is more than one type of ‘ZIP’ archive.

The original ZIP format, as implemented by the first version of PKZIP, did indeed have a 4 GiB limit on archive size (as well as a corresponding limit on archive member sizes, both compressed and uncompressed). However, version 4.5 of the format introduced the ZIP64 extensions. These raise the limit to 16 EiB by moving the relevant fields in the file header and archive entries to supplementary fields stored elsewhere in the archive, and they raise the limit on the number of archive members (classic ZIP was limited to 65535 archive members) in a similar manner.
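To make the arithmetic concrete, here is a small sketch that compares the four file sizes from the question against the classic limit. The constant and function names are made up for illustration; the only inputs are the sizes shown in the `ls -l` listing and the field widths described above.

```python
# Limits implied by the field widths described above (all sizes in bytes).
CLASSIC_MAX = 2**32 - 1        # largest value a 32-bit size/offset field can hold
ZIP64_MAX = 2**64 - 1          # largest value a 64-bit ZIP64 field can hold
CLASSIC_MAX_MEMBERS = 65535    # largest value of the 16-bit member count

# File sizes copied from the `ls -l` listing in the question.
parts = {
    "tmp.archive.part1.zip": 36166999908,
    "tmp.archive.part2.zip": 5579574562,
    "tmp.archive.part3.zip": 5097536636,
    "tmp.archive.part4.zip": 10612382236,
}


def exceeds_classic_limit(size_bytes):
    """Return True if a size cannot be represented in a 32-bit ZIP field."""
    return size_bytes > CLASSIC_MAX


for name, size in parts.items():
    gib = size / 2**30
    print(f"{name}: {gib:.1f} GiB, needs ZIP64: {exceeds_classic_limit(size)}")
```

Every one of the four archives is larger than a 32-bit field can describe, so they can only be valid if they were written with the ZIP64 extensions.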

However, unless a tool is actually looking for those extended fields, they get ignored and the tool will just fail to work correctly. This is because a ZIP64 archive is still technically a valid ‘classic’ ZIP archive unless you try to validate the member sizes (and this is an excellent example of why backwards compatibility can sometimes be a bad thing).
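As a rough illustration of what "looking for those extended fields" involves, the sketch below reads the end of an archive and reports two ZIP64 hints: whether a ZIP64 end-of-central-directory locator sits directly in front of the classic end-of-central-directory record, and whether the classic record carries the all-ones placeholder values (0xFFFF or 0xFFFFFFFF) that mean "the real value is in the ZIP64 record". The function name `inspect_tail` and the returned dictionary keys are my own invention. It is deliberately simplified: it takes the last occurrence of the signature, so an archive comment that happens to contain those bytes would confuse it.

```python
import io
import struct

EOCD_SIG = b"PK\x05\x06"            # classic end-of-central-directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"   # ZIP64 end-of-central-directory locator
EOCD_FORMAT = "<4sHHHHIIH"          # fixed 22-byte part of the classic record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)
LOCATOR_SIZE = 20                   # the locator has a fixed size
MAX_COMMENT = 0xFFFF                # the archive comment is at most 65535 bytes


def inspect_tail(fileobj):
    """Report ZIP64 hints found at the end of a seekable binary file object."""
    fileobj.seek(0, io.SEEK_END)
    size = fileobj.tell()
    tail_len = min(size, EOCD_SIZE + MAX_COMMENT + LOCATOR_SIZE)
    fileobj.seek(size - tail_len)
    tail = fileobj.read(tail_len)

    # Simplification: assume the last signature match is the real record.
    pos = tail.rfind(EOCD_SIG)
    if pos < 0 or pos + EOCD_SIZE > len(tail):
        raise ValueError("no end-of-central-directory record found")

    (_sig, _disk, _cd_disk, _entries_on_disk, entries_total,
     cd_size, cd_offset, _comment_len) = struct.unpack_from(EOCD_FORMAT, tail, pos)

    # The locator, if present, directly precedes the classic record.
    locator_start = pos - LOCATOR_SIZE
    has_locator = (locator_start >= 0 and
                   tail[locator_start:locator_start + 4] == ZIP64_LOCATOR_SIG)

    # All-ones values are placeholders for "see the ZIP64 record instead".
    has_placeholder = (entries_total == 0xFFFF or
                       cd_size == 0xFFFFFFFF or
                       cd_offset == 0xFFFFFFFF)

    return {
        "entries_in_classic_record": entries_total,
        "zip64_locator": has_locator,
        "placeholder_fields": has_placeholder,
    }
```

It could be called as `inspect_tail(open("tmp.archive.part1.zip", "rb"))`. A tool that only reads the classic record and never checks for the locator or the placeholder values would take 0xFFFFFFFF at face value, which is the failure mode described above.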


Possibly of note, there are actually a lot of other potential incompatibilities in the ZIP format. In particular, there are multiple incompatible encryption mechanisms that can be used with ZIP archives, and almost a dozen different compression algorithms, with most implementations not supporting all of them (though you have to go out of your way to use something other than ‘Store’, ‘Deflate’, or ‘Deflate64’, and those are supported by pretty much everything).

13

The 4 GB limit was raised with Info-ZIP 3.0, which was the first version to have ZIP64 support. It is currently the latest officially supported version and, as you can see, it is almost 15 years old.

stoney
  • 1,055
3

Interesting question!

A quick web search turns up this interesting document [1],
and indeed there is more than one ZIP revision; I didn't know that.
Honestly, not being an expert, I can only guess that this information has been forgotten or buried for the sake of simplicity: the original ZIP revision was designed for devices and software that are old by today's standards, and thanks to advances in hardware and software there is now rarely any need to care about the different (older) revisions.

I think that, if there are no tools or commands for discovering this rarely needed information, the only (and hardest) way is to dig through the binary structure by hand [2].

Coming back to the question of a tool: I found zipdetails [3], which is (counterintuitively) part of the perl package and could help you, or at least ease your job!
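If inspection confirms that the archives are ZIP64, one possible way to combine them without extracting anything to disk is to stream each member from the source archives into a new archive. The sketch below uses Python's standard zipfile module, which supports ZIP64; the function name `merge_zips` is hypothetical, and I have not tried it on archives of this size. Note the assumptions: every member is decompressed and recompressed in memory-sized chunks (so it costs CPU time and needs room for the combined output file), duplicate member names are treated as an error, encrypted members are not handled, and per-member comments and extra fields are not carried over.

```python
import io
import shutil
import zipfile


def merge_zips(sources, dest):
    """Stream every member of the source archives into one new archive.

    `sources` is a list of paths or binary file objects, `dest` is a path
    or a writable binary file object. Nothing is extracted to disk.
    """
    seen = set()
    with zipfile.ZipFile(dest, "w", allowZip64=True) as out:
        for source in sources:
            with zipfile.ZipFile(source) as src:
                for info in src.infolist():
                    if info.filename in seen:
                        raise ValueError(f"duplicate member name: {info.filename}")
                    seen.add(info.filename)

                    # Build a fresh header so the source metadata is not modified.
                    new_info = zipfile.ZipInfo(info.filename, date_time=info.date_time)
                    new_info.external_attr = info.external_attr
                    new_info.compress_type = (
                        zipfile.ZIP_STORED if info.is_dir() else zipfile.ZIP_DEFLATED
                    )

                    # force_zip64 because the final size is unknown while streaming.
                    with src.open(info) as reader, \
                            out.open(new_info, "w", force_zip64=True) as writer:
                        shutil.copyfileobj(reader, writer)


if __name__ == "__main__":
    # Tiny in-memory demonstration with two single-member archives.
    first, second, merged = io.BytesIO(), io.BytesIO(), io.BytesIO()
    with zipfile.ZipFile(first, "w") as z:
        z.writestr("one.txt", b"first archive")
    with zipfile.ZipFile(second, "w") as z:
        z.writestr("two.txt", b"second archive")
    merge_zips([first, second], merged)
    with zipfile.ZipFile(merged) as z:
        print(z.namelist())
```

For the files in the question the call would look something like `merge_zips(["tmp.archive.part1.zip", "tmp.archive.part2.zip"], "combined.zip")`, listing all four parts.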


[1] https://peazip.github.io/rar-zip-file-format-size-limitations.html
[2] https://en.wikipedia.org/wiki/ZIP_(file_format)#Structure
[3] https://perldoc.perl.org/zipdetails

mattia.b89
  • 3,238
  • 4
    As an alternative to digging into the binary structure, you can also read the man page which says that the zip64 extension is used to allow archives larger than 4GB. – doneal24 Dec 29 '22 at 18:26