Bug 2119762 - Evince does not use utf-8 in search strings
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: evince
Version: 37
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Marek Kašík
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F37FinalBlocker
 
Reported: 2022-08-19 10:53 UTC by Lukas Ruzicka
Modified: 2022-08-22 17:07 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-08-22 17:07:09 UTC
Type: Bug
Embargoed:


Attachments
See how searches are treated. (515.67 KB, image/png)
2022-08-19 10:53 UTC, Lukas Ruzicka


Links
System: GNOME Gitlab
ID: GNOME evince issues 1839
Private: 0
Priority: None
Status: opened
Summary: Evince seems not to use the utf-8 encoding in searches.
Last Updated: 2022-08-19 10:54:11 UTC

Description Lukas Ruzicka 2022-08-19 10:53:44 UTC
Created attachment 1906523
See how searches are treated.

Description of problem:
Evince does not use UTF-8 in search strings and is therefore unable to find occurrences in languages that use non-ASCII characters.

Version-Release number of selected component (if applicable):
evince-43~alpha-4.fc37.x86_64 

How reproducible:
Always

Steps to Reproduce:

The latest version of Evince on Fedora does not seem to use UTF-8 encoding in searches, which limits searching in all languages that use non-ASCII characters. The following examples were made on a Czech system.
Reproducer (you can see the illustration below):

* Open the search bar (Ctrl-F).
* Type řekla (meaning "[she] said").
* Notice that řekla is not found, which is indicated by the red color.
* Notice that if Julie is searched for instead, an occurrence of řekla Julie (Julie said) is found; however, the leading character is not correctly recognized and its representation in the search results is incorrect.
* In the text itself, all characters are shown correctly.
* When I copy the text using Ctrl-C, I get øekla Julie instead of řekla Julie.

I believe that the strings might not be treated as UTF-8 in the places where the correct characters are missing. It would be nice if the application used the correct encoding in searches and in copied strings as well.
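The øekla symptom matches a well-known byte-level coincidence: ř is byte 0xF8 in ISO-8859-2, and the same byte is ø in ISO-8859-1 (Latin-1). The Python sketch below only illustrates that coincidence and why a search for řekla would then miss while Julie still matches. It assumes the document's text layer carries ISO-8859-2 bytes that get read as Latin-1; that cause is a guess and is not confirmed anywhere in this report. The helper name extracted_text is hypothetical.

```python
# Sketch under an assumption: the text layer stores ISO-8859-2 bytes that a
# viewer ends up reading as Latin-1. This is NOT confirmed by the report.

def extracted_text(visible: str) -> str:
    """Hypothetical helper: simulate the suspected mis-mapping of characters."""
    return visible.encode("iso-8859-2").decode("latin-1")

visible = "řekla Julie"               # what is rendered on the page
text_layer = extracted_text(visible)  # what copy and search would operate on

print(text_layer)               # øekla Julie
print("řekla" in text_layer)    # False: the search term cannot match
print("Julie" in text_layer)    # True: pure-ASCII terms still match
```

If this reading is right, the search input itself can be perfectly valid UTF-8 and the mismatch would sit in the text the viewer extracts from the file.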

Actual results:
Incorrect search results for languages that use non-ASCII characters.

Expected results:
Searches should work for non-ASCII characters as well.

Additional info:
Also reported upstream: https://gitlab.gnome.org/GNOME/evince/-/issues/1839

Comment 1 Fedora Blocker Bugs Application 2022-08-19 10:55:47 UTC
Proposed as a Blocker for 37-final by Fedora user lruzicka using the blocker tracking app because:

 I am proposing this for a discussion about whether the problem is a blocker in the scope of Basic Functionality.

Comment 2 Kamil Páral 2022-08-19 11:24:13 UTC
Let's have the conversation in upstream, so that we don't split it into several places. I added a comment there.

Comment 3 Adam Williamson 2022-08-22 17:07:09 UTC
Per upstream discussion, this turned out to be a bug in the PDF file, not in Evince. Acrobat also can't find the string.

