I have a bunch of tarball backups which I just restored onto my new Windows 8.1 + Cygwin system using GNU tar:

zsh$ for file in **/*.tgz; do tar xvzf $file; done

To my surprise a lot of these extracted files were corrupt. I tried replacing GNU tar with BSD tar and repeated the process, but the same files were still corrupt.

Then I tried extracting them with WinRAR, and they turned up just fine. Does anybody know what's going on?

  • Is there any pattern to the corruption (line endings, files too small, etc.)? Do you know what program created the archives (and on what platform)? – Mikel May 27 '14 at 05:33
  • Can you do a diff -u <(od -An -vtx1 < f1) <(od -An -vtx1 < f2) where f1 and f2 are the same file but extracted with tar and winrar? – Stéphane Chazelas May 27 '14 at 06:28
  • @StephaneChazelas actually the only corrupt files I can find are .otf and .mp3 so I'm not sure what good a diff would do. What I said to @Mikel earlier about the text files was a false alarm. – Mark Boulder May 27 '14 at 06:54
  • The tar format has gone through a few iterations and supports vendor-specific tags/extensions. With what commands were these .tgz files created in the first place? – Anthon May 27 '14 at 07:05
  • Just tar czf $file.tgz $folder – Mark Boulder May 27 '14 at 07:10
  • That's a diff on the output of od to see which bytes differ, but I forgot to add the -w1 option to od. – Stéphane Chazelas May 27 '14 at 07:32
  • @StephaneChazelas Hi! Unfortunately that command does not produce any output: diff -u <(od -An -vtx1 < garamond_premier_pro.otf) <(od -An -vtx1 < garamond_premier_pro_corrupt.otf) – Mark Boulder May 27 '14 at 14:44
  • That would seem to indicate the files are identical. Does cmp -l file1 file2 give anything? Or possibly cygwin reads the files in some way that eol characters are converted on the fly so it can't detect the difference. – Stéphane Chazelas May 27 '14 at 14:53
  • cmp returns nothing as well. In Windows the first file opens fine, the second one returns: The requested file is not a valid font file. – Mark Boulder May 27 '14 at 14:56
  • Is there a CYGWIN env var? – Stéphane Chazelas May 27 '14 at 14:58
  • There's a bunch of them, why? – Mark Boulder May 27 '14 at 14:59
  • @StephaneChazelas do you know a way I can diff whole folders (i.e. extracted_tar and extracted_winrar) with all their content? – Mark Boulder May 28 '14 at 02:08
  • @MarkBoulder: It's not diff, but the first option in this answer should be helpful. How exactly are you determining that these extractions are "corrupt"? – Warren Young Jul 22 '14 at 14:16
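
For the question above about comparing whole folders, here is a minimal sketch using a recursive diff. It assumes GNU diff (the one Cygwin ships) and uses the directory names extracted_tar and extracted_winrar from the comment; compare_trees is a hypothetical helper name, not something from the thread. It compares file contents only, so it would not reveal differences in permissions or other metadata.

```shell
# Hypothetical helper: report files that differ between two extracted trees.
compare_trees() {
    # -r: recurse into subdirectories
    # -q: only report whether files differ, not how
    diff -rq -- "$1" "$2"
}

# Illustrative usage with the directory names from the comment above:
# compare_trees extracted_tar extracted_winrar
```

No output and an exit status of 0 would mean the two trees hold the same files with identical contents; files present in only one tree or differing in content are listed one per line.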
