Bug 1592820 - Pdfshuffler suffers from bugs in pyPdf, update to PyPDF2 recommended
Summary: Pdfshuffler suffers from bugs in pyPdf, update to PyPDF2 recommended
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: pdfshuffler
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fabian Affolter
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1648514 1708993
Reported: 2018-06-19 11:30 UTC by David Auer
Modified: 2019-10-28 22:27 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-28 22:27:22 UTC


Attachments
Test file to reproduce this bug. Exported from Wikipedia (190.66 KB, application/pdf)
2018-07-27 13:42 UTC, David Auer

Description David Auer 2018-06-19 11:30:12 UTC
Description of problem:
Pdfshuffler fails on certain PDFs (see the upstream bug for an example).

Version-Release number of selected component (if applicable):
0.6.0-11.fc27

How reproducible:
Always

Steps to Reproduce:
1. Load PDF
2. Export PDF

Actual results:
Depending on the input PDF, either the exported PDF is blank, or an empty error message is shown and no file is created.

Expected results:
Exported PDF should look the same as imported PDF.

Additional info:
Upstream says using PyPDF2 should fix this: https://sourceforge.net/p/pdfshuffler/bugs/22/

Comment 1 David Auer 2018-07-27 13:42:15 UTC
Created attachment 1471115
Test file to reproduce this bug. Exported from Wikipedia

On export, Pdfshuffler fails with the error message "multiple definitions in dictionary".

Comment 2 Fedora Update System 2018-09-10 11:14:52 UTC
pdfshuffler-0.6.0-15.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-cb2b2c4f6b

Comment 3 Fedora Update System 2018-09-10 11:20:35 UTC
pdfshuffler-0.6.0-15.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-93017c1b88

Comment 4 Fedora Update System 2018-09-10 11:27:47 UTC
pdfshuffler-0.6.0-15.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-d4c9a0d98c

Comment 5 Fedora Update System 2018-09-11 06:14:25 UTC
pdfshuffler-0.6.0-15.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-cb2b2c4f6b

Comment 6 Fedora Update System 2018-09-11 15:44:00 UTC
pdfshuffler-0.6.0-15.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-d4c9a0d98c

Comment 7 Fedora Update System 2018-09-11 18:11:17 UTC
pdfshuffler-0.6.0-15.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-93017c1b88

Comment 8 David Auer 2018-09-12 15:04:20 UTC
I believe the referenced statement from upstream is wrong and PyPDF2 does not fix the bug.
Additionally the update is broken because it only changed the dependency, not the relevant python code. See my comment in Bodhi for details: https://bodhi.fedoraproject.org/updates/pdfshuffler-0.6.0-15.fc27#comment-833396

I only tested and commented on F27, but I assume the same will be the case on 28 and 29, so I'd recommend disabling autopush for those.

Thanks for trying to fix this but I guess there is work to be done in upstream.

Comment 9 Jason Tibbitts 2019-01-10 19:56:22 UTC
The patch from upstream is https://sourceforge.net/p/pdfshuffler/code/90/tree//trunk/pdfshuffler/pdfshuffler.py?diff=51a4fa945fcbc945034d6ac8:89

I have pulled this into pdfshuffler and did a local build; it works fine.  I could go ahead and push this as an update, but I have a question.  The upstream implementation first tries to load pypdf and then pypdf2.  So if the old pypdf package is still around on the system, you'll still get the old implementation and must manually uninstall pypdf.

Alternately, we could deviate from upstream and unconditionally load pypdf2, or load it first.  I'm not sure what makes the most sense.

Comment 10 David Auer 2019-01-13 19:21:41 UTC
Yes, I totally agree that loading pypdf with priority over pypdf2 does not make sense since people might have both installed for various reasons.

I would make it Pypdf2 only since this is required in the spec file and I don't see an advantage in some switching logic that could silently load the old Pypdf due to a typo or something.

Comment 11 Owen Taylor 2019-01-15 20:45:50 UTC
(In reply to David Auer from comment #10) 
> I would make it Pypdf2 only since this is required in the spec file and I
> don't see an advantage in some switching logic that could silently load the
> old Pypdf due to a typo or something.

Either hard-coding PyPDF2 or reversing the preference would work for the Fedora package. If the patch is sent upstream, reversing the preference might be easier to get accepted.
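
For illustration, the two options discussed in comments 9 through 11 might be sketched as follows. This is a minimal sketch, not the upstream patch and not the code in the Fedora pull request: the helper name load_pdf_backend is hypothetical, and only the module names (PyPDF2, pyPdf) and class names (PdfFileWriter, PdfFileReader) come from this report.

```python
import importlib


def load_pdf_backend(candidates=("PyPDF2", "pyPdf")):
    """Return the first importable module from `candidates`, tried in order.

    Hypothetical helper. Listing PyPDF2 first is the "reversed preference"
    option; passing candidates=("PyPDF2",) is the "PyPDF2 only" option.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # this candidate is not installed, try the next one
    raise ImportError("none of %s could be imported" % ", ".join(candidates))


# Intended use inside the application (left commented out because it
# requires one of the two libraries to be installed):
#
#   backend = load_pdf_backend()
#   PdfFileWriter = backend.PdfFileWriter
#   PdfFileReader = backend.PdfFileReader
```

With the order reversed like this, a leftover pyPdf installation is used only when PyPDF2 is missing, which avoids the silent fallback to the old library that comment 9 describes.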

Comment 12 David Auer 2019-05-15 14:08:16 UTC
@Jason

Comment 13 David Auer 2019-05-15 14:13:18 UTC
Oh, sorry about the last comment. 

@Jason Tibbitts: Could you go ahead and push your changes as an update? pyPdf is not available in Fedora 30, so pdfshuffler is currently broken. I wouldn't worry about getting the patch into upstream in this case.

Comment 14 Jason Tibbitts 2019-08-21 02:24:53 UTC
I came back to this after running into the fact that pdfshuffler is indeed completely broken in F30.  I filed https://src.fedoraproject.org/rpms/pdfshuffler/pull-request/2 and will give the maintainers a few days before I go ahead and merge it.

Comment 15 Ivan Virgili 2019-08-24 07:38:06 UTC
If I install pyPdf-1.13-16.fc29.noarch.rpm on F30 then pdfshuffler works normally.

We need pyPdf back on F30, or PyPDF2.
pyPdf got retired just a few months ago: https://bugzilla.redhat.com/show_bug.cgi?id=1676853

Replacing in pdfshuffler.py

from pyPdf import PdfFileWriter, PdfFileReader

with

from PyPDF2 import PdfFileWriter, PdfFileReader

doesn't solve the issue, because pdfshuffler fails to import PyPDF2.
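
A failed import like this can be narrowed down by asking the interpreter directly which of the two libraries it can see. The following is a rough diagnostic sketch, with hypothetical function names (describe_module, report). It has to be run with the same interpreter that runs pdfshuffler. One possible cause, which is an assumption and not confirmed anywhere in this report, is that PyPDF2 is installed for a different Python version than the one pdfshuffler uses.

```python
import importlib
import sys


def describe_module(name):
    """Return (importable, detail) for the module called `name`.

    detail is the module's file path when the import succeeds, and the
    error text otherwise. Any exception is caught, not only ImportError,
    so that a module written for another Python version is also reported.
    """
    try:
        module = importlib.import_module(name)
    except Exception as exc:
        return False, "%s: %s" % (type(exc).__name__, exc)
    return True, getattr(module, "__file__", "(no file)")


def report(names=("PyPDF2", "pyPdf")):
    """Build a list of text lines describing the interpreter and each module."""
    lines = ["interpreter: %s (Python %s)"
             % (sys.executable, sys.version.split()[0])]
    for name in names:
        ok, detail = describe_module(name)
        status = "importable" if ok else "NOT importable"
        lines.append("%s: %s (%s)" % (name, status, detail))
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

If PyPDF2 is reported as not importable here, changing the import line in pdfshuffler.py cannot help until the library is installed for that interpreter.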

Comment 16 Jason Tibbitts 2019-08-26 17:04:43 UTC
I'm running with the package plus my pull request and it does work with no need to install pypdf.  It's not coming back anyway.

However, pdfshuffler is doomed unless it gets ported to python3.  The rest of the python2 stack is going away, and even the current SVN head code is not python3 compatible.  I did a naive pass over the code, and it runs to the point of opening a window and loading a file but doesn't do anything useful.

Comment 17 David Auer 2019-09-03 13:28:56 UTC
Pdfarranger is pdfshuffler ported to Python 3 plus bugfixes and new features. It's already reviewed and should be imported soon. I just wonder if there is a way to inform users of pdfshuffler that they should give pdfarranger a try.

Comment 18 Jason Tibbitts 2019-09-03 16:20:34 UTC
I had no idea.  Since pdfshuffler will have to leave anyway, it might be reasonable for the new pdfarranger package to obsolete pdfshuffler, and to provide a symlink for the binary.

Comment 19 Fabian Affolter 2019-10-28 22:27:22 UTC
pdfarranger is the future. pdfshuffler will be removed.

