I received a PDF by email and I want to crop and rotate it because it has two pages per sheet. Trying the solutions in "Split pages in pdf", I got an "AssertionError" from pyPdf and "Warning: stream operator not terminated by valid EOL." from ImageMagick. pdftk seems to get stuck in an endless loop and never finishes processing the file.
Here's the pyPdf error:
Traceback (most recent call last):
  File "./un2up.py", line 48, in <module>
    split_pages(sys.argv[1],sys.argv[2])
  File "./un2up.py", line 14, in split_pages
    for i in range(input.getNumPages()):
  File "/usr/lib64/python2.7/site-packages/pyPdf/pdf.py", line 431, in getNumPages
    self._flatten()
  File "/usr/lib64/python2.7/site-packages/pyPdf/pdf.py", line 596, in _flatten
    catalog = self.trailer["/Root"].getObject()
  File "/usr/lib64/python2.7/site-packages/pyPdf/generic.py", line 480, in __getitem__
    return dict.__getitem__(self, key).getObject()
  File "/usr/lib64/python2.7/site-packages/pyPdf/generic.py", line 165, in getObject
    return self.pdf.getObject(self).getObject()
  File "/usr/lib64/python2.7/site-packages/pyPdf/pdf.py", line 647, in getObject
    assert idnum == indirectReference.idnum
AssertionError
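Judging from the last frame of the traceback, the assertion that fails appears to be pyPdf checking that the object header it finds at a cross-reference offset carries the object number it asked for. Below is a minimal sketch of doing that check by hand. It assumes Python 3, assumes the file uses a classic `xref` table rather than a cross-reference stream, and only looks at the last table in the file (it does not follow `/Prev`). `find_xref_mismatches` is a hypothetical helper name, not part of pyPdf.

```python
import re

# "N G obj" header at the start of an indirect object.
OBJ_HEADER = re.compile(rb"\s*(\d+)\s+(\d+)\s+obj")
# One xref entry: 10-digit offset, 5-digit generation, n (in use) or f (free).
XREF_ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])")


def find_xref_mismatches(data):
    """Return (expected_id, offset, found_id) for every in-use xref entry
    whose offset does not point at the matching 'N G obj' header.

    Assumption: classic xref table only; only the last table is checked.
    """
    starts = re.findall(rb"startxref\s+(\d+)", data)
    if not starts:
        raise ValueError("no startxref found")
    pos = int(starts[-1])
    if data[pos:pos + 4] != b"xref":
        raise ValueError("startxref does not point at a classic xref table")

    mismatches = []
    obj_id = 0
    # Skip the "xref" keyword line itself, then walk the table.
    for line in data[pos:].splitlines()[1:]:
        line = line.strip()
        if line.startswith(b"trailer"):
            break
        entry = XREF_ENTRY.fullmatch(line)
        if entry:
            offset, kind = int(entry.group(1)), entry.group(3)
            if kind == b"n":
                header = OBJ_HEADER.match(data, offset)
                found = int(header.group(1)) if header else None
                if found != obj_id:
                    mismatches.append((obj_id, offset, found))
            obj_id += 1
        else:
            parts = line.split()
            if len(parts) == 2:
                # Subsection header: "<first object number> <count>".
                obj_id = int(parts[0])
    return mismatches
```

It could be called as `find_xref_mismatches(open("file.pdf", "rb").read())`, where `file.pdf` is a placeholder name. An empty list would suggest the last table is consistent; any tuple in the result names an object whose recorded offset points somewhere else.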
I tried opening it in Adobe Reader and saving a copy, but the file ended up the same.
The file opens fine for viewing in evince, Adobe Reader and Google Drive.
Any idea how to fix the file so it can be read by pyPdf?
Transfer corruption occurred to me, but the file opens fine in all PDF viewers tested. That does not rule out corruption, but it would be REALLY bad luck to have the only bits that would make pyPdf go crazy flipped.
– Elton Carvalho Aug 22 '13 at 22:06