This question and answer (GZip doesn't produce the same compressed result on macOS vs Linux) make it pretty clear that GZIP doesn't fit the bill. No argument there.

But I think the real question lurking behind that one is: are there any compression algorithms that deterministically create the same binary file on Linux and Mac?

  • cat? 0% compression is abysmal, though. probably not worth bothering with. – cas Jul 04 '22 at 02:27
  • Why do you need that? I sense a strong smell of an XY problem. You should simply never rely on exactly what a compression program produces, as long as the decompressed data matches what was compressed. What is the real problem you are trying to solve? – Nikita Kipriyanov Jul 04 '22 at 05:08
  • There's a reason. I have a collection of about 5,000 binary files that, when compressed, exhibit about a 40% reduction -- 100 GB to about 60 GB. So compression is highly desired. A more pressing requirement, however, is that the files (either compressed or uncompressed) can be added to a content-addressable store such as IPFS -- read about it. I wish to produce bit-for-bit identical files on different operating systems so that they yield identical hashes when stored in a content-addressable store. Perhaps your sense of smell is a bit off? But thanks for the all-knowing question. – Thomas Jay Rush Jul 05 '22 at 07:44

1 Answer

I’ll restrict this to “compression tools that create identical files on Linux and Mac”. I know of at least one, Lzip, which is explicitly designed for reproducibility (including across platforms); in particular,

The lzip format does not store any metadata of the uncompressed file, except its size. Therefore, lzip files are reproducible; lzip produces identical compressed output from identical input.
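To see why stored metadata matters for content-addressed storage, here is a minimal sketch in Python. It uses the standard library's `gzip` module rather than lzip (which has no standard-library binding), and the payload and the `gzip_digest` helper are made up for illustration. It only shows the effect of the timestamp field in the gzip header; it does not prove cross-platform identity, since different compressor implementations or versions may still emit different compressed streams.

```python
import gzip
import hashlib


def gzip_digest(data: bytes, mtime: int) -> str:
    """Return the SHA-256 of the gzip output with the header timestamp set to mtime.

    Hypothetical helper for illustration only.
    """
    compressed = gzip.compress(data, compresslevel=9, mtime=mtime)
    return hashlib.sha256(compressed).hexdigest()


payload = b"example payload " * 1024  # placeholder data, not from the question

# Same input and same header timestamp: identical bytes, so identical hash.
same = gzip_digest(payload, 0) == gzip_digest(payload, 0)

# Same input but a different header timestamp: the hash changes,
# even though both outputs decompress to the same data.
differs = gzip_digest(payload, 1) != gzip_digest(payload, 2)

print(same, differs)
```

A format that stores no such metadata avoids this class of difference by construction, which is the property the lzip documentation quoted above describes.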

Stephen Kitt