Bug 1226260
Summary: | cannot write some non ascii characters in editable PDFs | |
---|---|---|---
Product: | Red Hat Enterprise Linux 9 | Reporter: | janez.kosmrlj
Component: | poppler | Assignee: | Marek Kašík <mkasik>
Status: | CLOSED WONTFIX | QA Contact: | Desktop QE <desktop-qa-list>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | unspecified | CC: | caolanm, cww, janez.kosmrlj, mkasik
Target Milestone: | rc | Keywords: | Reopened, Triaged
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-12-01 07:27:13 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1420851 | |
Attachments: | | |

Created attachment 1031846 [details]
view
Hi, could you attach such a PDF here so I can have a look at it (especially at the fonts specified in it)?

Created attachment 1045509 [details]
example document

I attached a PDF I found on the internet (example document). I could also send you a document we use in our company, but I don't want to publish it on the internet. If you wish, I could email it to you.

Thank you for the PDF. Unfortunately, this is a problem in the PDF specification itself. If the application which creates the PDF specifies a simple font (Type1 fonts, Multi Master fonts, TrueType fonts and Type3 fonts, according to the PDF spec) for the text field, then only 256 characters are available (the set depends on the encoding used). Some of the accented characters are not in the standard encodings used in PDFs, so they are not shown. You see them while typing into the field because evince handles the widgets itself during editing (you can see that the font itself can differ). Even Adobe Reader doesn't work for me on Windows and Linux here (but for me the 'š' character doesn't show up; it probably depends on the set of fonts available).

My findings agree quite well with this comprehensive comment (it is more about direct editing of a PDF, but the situation is similar): http://stackoverflow.com/questions/15964704/java-pdfbox-reading-and-modifying-a-pdf-with-special-characters-diacritics/15973614#15973614

Maybe using a CID font with a comprehensive CID to GID mapping would help, but this needs to be done by the application creating the PDFs (I'll check whether it would work this way in poppler/evince).

I'm giving this bug devel_ack- since the PDF specification does not specify how to handle this situation.

I've filed a bug for a related issue I found while looking at this problem; you can find it here: https://bugzilla.redhat.com/show_bug.cgi?id=1298616.

Regards

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.

When can we expect a fix?

Hi, I'm moving this to Red Hat Enterprise Linux 8. I'm working on this feature but I have not finished it yet. It also needs to be accepted by upstream afterwards, so I'm not giving a deadline now.

Regards
Marek

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

I'm moving this bug to RHEL 9.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Created attachment 1031844 [details]
input field

Description of problem:
When I get a PDF which has input fields, I can't write the Slovene character č (Unicode code points 268 and 269, i.e. U+010C Č and U+010D č). It is displayed in the input field (pic1.png), but when I move on to the next field it is no longer visible (pic2.png). When I try to edit the same field, I see that the character is still there; it's just not displayed. For this reason I still have to use Adobe Reader, which is not supported anymore.

Version-Release number of selected component (if applicable):
3.8.3

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
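
The 256-character limit of simple fonts discussed in the comments can be illustrated with a small Python sketch. It is not poppler or evince code. It assumes that Python's single-byte codecs are a reasonable stand-in for PDF font encodings: `cp1252` is used as an approximation of the PDF WinAnsiEncoding (the two are similar but not identical). The helper name `coverage` is made up for this example.

```python
# Sketch: which Slovene letters fit into a given single-byte (256-slot) encoding?
# Assumption: Python codecs stand in for PDF simple-font encodings;
# cp1252 only approximates the PDF WinAnsiEncoding.

SLOVENE_LETTERS = "čšžČŠŽ"


def coverage(chars, codec):
    """Return {char: True/False} telling whether each char is encodable in codec."""
    result = {}
    for ch in chars:
        try:
            ch.encode(codec)
            result[ch] = True
        except UnicodeEncodeError:
            result[ch] = False
    return result


# Compare a Western European code page, Latin-1, and Latin-2.
reports = {codec: coverage(SLOVENE_LETTERS, codec)
           for codec in ("cp1252", "latin_1", "iso8859_2")}

for codec, report in reports.items():
    for ch, ok in report.items():
        status = "present" if ok else "missing"
        print(f"{codec:10} U+{ord(ch):04X} {ch}: {status}")
```

Under these assumptions, `cp1252` has slots for š and ž but none for č, which matches the observation that only some accented characters disappear from the field. An encoding such as ISO 8859-2 covers all three, but the choice of encoding is made by the application that creates the PDF, not by the viewer.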