Bug 1625313 - stderr messages (Textsplit UTF-8 errors) when hovering on the results
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: recoll
Version: 28
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Terje Røsten
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-04 15:56 UTC by w247
Modified: 2018-09-28 18:20 UTC (History)
CC List: 3 users

Fixed In Version: recoll-1.23.7-8.fc29 recoll-1.23.7-8.fc27 recoll-1.23.7-8.fc28
Clone Of:
Environment:
Last Closed: 2018-09-28 16:56:45 UTC
Type: Bug
Embargoed:


Attachments
reproducer (10.00 KB, application/x-tar), 2018-09-16 19:55 UTC, w247
Patch to fix the error messages about non UTF-8 URLs (1.51 KB, patch), 2018-09-17 08:27 UTC, Jean-Francois Dockes

Description w247 2018-09-04 15:56:40 UTC
Description of problem: When hovering on the results list, stderr is filled with text handling error messages. They seem to be harmless and not to impact functionality.


Version-Release number of selected component (if applicable): 1.23.7


How reproducible: always


Steps to Reproduce:
1. Type recoll in a terminal.
2. Enable "show results in a spreadsheet-like table" with the toolbar button at the right.
3. Type something in the search box and press enter, so that a few results are displayed.
4. Hover with the mouse on the results.

Actual results: whenever the mouse enters a cell, a bunch of lines like this are printed on stderr:

:2:common/textsplit.cpp:533::Textsplit: error occured while scanning UTF-8 string

Expected results: no error messages.


Additional info:

Comment 1 Jean-Francois Dockes 2018-09-05 07:03:05 UTC
Hi and thanks for reporting this.

This is probably caused by some non utf-8 data which somehow got into the index.

This seems dependent on the data, and I can't reproduce it.

Would it be possible for you to share one of the documents which is causing this?

At least please indicate the type of document.

What is your locale value (output of the 'locale' command)?

Comment 2 w247 2018-09-16 19:55:54 UTC
Created attachment 1483778
reproducer

Comment 3 w247 2018-09-16 19:59:53 UTC
Hi, thanks for the input. Sorry, it does not always happen:

If the search result path has an accented vowel from ISO Latin-1,
- stderr has logs,
- the URL in the results table is truncated at the special character,
- the URL in the HTML preview is full.

The locale is it_IT.UTF-8, it also happens with en_US.UTF-8.
The files were copied from Windows XP (Italian).

To rule out the content indexing: It does not happen if those documents
are renamed to UTF-8 (keeping their Latin-1 contents).
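
For reference, the renaming mentioned above amounts to re-encoding the bytes of the file name. A minimal Python sketch, assuming the original names are ISO Latin-1; the helper name `latin1_name_to_utf8` is hypothetical, and it only transforms bytes without touching the filesystem:

```python
def latin1_name_to_utf8(raw_name: bytes) -> bytes:
    """Re-encode a file name from ISO Latin-1 bytes to UTF-8 bytes."""
    # Latin-1 maps every byte value to exactly one code point,
    # so this decode step cannot fail.
    return raw_name.decode("latin-1").encode("utf-8")


# Example name: "Facolt", then the Latin-1 byte for "à" (0xE0), then ".txt".
old_name = b"Facolt\xe0.txt"
new_name = latin1_name_to_utf8(old_name)
print(new_name)  # b'Facolt\xc3\xa0.txt'
```

In UTF-8 the same character takes two bytes (0xC3 0xA0), which is why the renamed files no longer trip a UTF-8 scanner.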

The context menu items are functional: preview/open, find similar docs, etc.

Attached example:
- the archive contains a file named Facolt, followed by 0xE0 (Latin-1 for
  à), followed by .txt.
- Please unpack, add directory to the index configuration (stemming language:
  got the same result with english and italian), update index
- search for: biblioteca
- The URL column ends with: Facolt
- The URL in the HTML details ends with: Facolt%E0.txt
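
The two displayed forms above are consistent with one code path stopping at the first byte that is not valid UTF-8 and another percent-encoding the raw bytes. The Python sketch below only illustrates that idea; it is not Recoll's code, and the helper names `is_valid_utf8` and `valid_utf8_prefix` are hypothetical:

```python
from urllib.parse import quote

# File name from the reproducer: "Facolt" + Latin-1 0xE0 + ".txt".
raw_name = b"Facolt\xe0.txt"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def valid_utf8_prefix(data: bytes) -> str:
    """Return the longest leading part of data that is valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        return data[:err.start].decode("utf-8")


print(is_valid_utf8(raw_name))      # False
print(valid_utf8_prefix(raw_name))  # Facolt
print(quote(raw_name))              # Facolt%E0.txt
```

The byte 0xE0 announces a three-byte UTF-8 sequence, but the next byte is an ASCII ".", so decoding fails at that position.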

Comment 4 Jean-Francois Dockes 2018-09-17 08:26:24 UTC
Thanks a lot for taking the time to qualify the problem, and making it easy for me to fix it.

This will be corrected in the next release. Meanwhile, the attached patch can be applied to the Recoll 1.23.x or 1.24.x source as a fix.

Note that there will still be some error messages, but at level 4 (debug), so they will not appear in the default configuration.
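
To illustrate the level remark, here is a small Python sketch that parses a message of the shape quoted in the description and checks it against a verbosity threshold. The line format is inferred from that single sample, and both the "shown when level <= verbosity" rule and the threshold value 3 are assumptions made for the example, not confirmed Recoll behaviour:

```python
import re
from typing import NamedTuple, Optional


class LogLine(NamedTuple):
    level: int
    source: str
    line: int
    message: str


# Shape inferred from the sample: ":<level>:<file>:<line>::<message>"
LOG_RE = re.compile(r"^:(\d+):([^:]+):(\d+)::(.*)$")


def parse_log_line(text: str) -> Optional[LogLine]:
    """Split one stderr line into its fields, or return None if it does not match."""
    m = LOG_RE.match(text.strip())
    if m is None:
        return None
    return LogLine(int(m.group(1)), m.group(2), int(m.group(3)), m.group(4))


def is_shown(level: int, verbosity: int) -> bool:
    """Assumed rule: a message is printed when its level is <= the verbosity."""
    return level <= verbosity


sample = ":2:common/textsplit.cpp:533::Textsplit: error occured while scanning UTF-8 string"
parsed = parse_log_line(sample)
print(parsed.level, parsed.source, parsed.line)  # 2 common/textsplit.cpp 533
print(is_shown(2, 3), is_shown(4, 3))            # True False
```

Under these assumptions, the original level 2 message is printed, while the same message demoted to level 4 stays hidden unless the verbosity is raised.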

Comment 5 Jean-Francois Dockes 2018-09-17 08:27:33 UTC
Created attachment 1483911
Patch to fix the error messages about non UTF-8 URLs

Comment 6 Fedora Update System 2018-09-18 18:00:22 UTC
recoll-1.23.7-8.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-dc579bde5d

Comment 7 Fedora Update System 2018-09-18 18:00:30 UTC
recoll-1.23.7-8.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-597675803c

Comment 8 Fedora Update System 2018-09-18 18:00:37 UTC
recoll-1.23.7-8.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-6baaf41137

Comment 9 Fedora Update System 2018-09-20 04:57:58 UTC
recoll-1.23.7-8.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-6baaf41137

Comment 10 Fedora Update System 2018-09-20 11:10:21 UTC
recoll-1.23.7-8.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-597675803c

Comment 11 Fedora Update System 2018-09-20 16:16:53 UTC
recoll-1.23.7-8.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-dc579bde5d

Comment 12 Fedora Update System 2018-09-28 16:56:45 UTC
recoll-1.23.7-8.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2018-09-28 17:13:47 UTC
recoll-1.23.7-8.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

Comment 14 Fedora Update System 2018-09-28 18:20:51 UTC
recoll-1.23.7-8.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

