Bug 1602068 - tracker-extract is spamming logs (because of an invalid UTF-8 character)
Summary: tracker-extract is spamming logs (because of an invalid UTF-8 character)
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: tracker
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Igor Raits
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-17 18:11 UTC by Łukasz Faber
Modified: 2019-11-27 23:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-27 23:30:21 UTC
Type: Bug
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
GNOME Gitlab GNOME/tracker/issues/12 0 None None None 2018-07-27 09:41:10 UTC

Description Łukasz Faber 2018-07-17 18:11:06 UTC
Description of problem:
tracker-extract produces lots of messages like:
lip 17 19:58:54 essyllt tracker-extract[10972]: Could not insert metadata for item "file:///home/[...].pdf": 57.34: invalid UTF-8 character
lip 17 19:58:54 essyllt tracker-extract[10972]: If the error above is recurrent for the same item/ID, consider running "tracker-extract" in the terminal with the TRACKER_VERBOSITY=3 environment variable, and filing a bug with the additional information

Version-Release number of selected component (if applicable):
tracker.x86_64                             2.0.4-1.fc28                @updates 
tracker-miners.x86_64                      2.0.5-1.fc28                @updates 

How reproducible:
Happens all the time with a specific file.

Steps to Reproduce:
Each time with a specific file

Actual results:
Lots of unwanted logs.

Expected results:
No logs of this type (or proper handling of this file).

Additional info:

$ tracker extract --verbosity=debug "file:///home/[...].pdf"
Tracker-Message: 19:57:42.757: Starting tracker-extract 2.0.5
Tracker-Message: 19:57:42.757: General options:
Tracker-Message: 19:57:42.757:   Verbosity  ............................  3
Tracker-Message: 19:57:42.757:   Sched Idle  ...........................  1
Tracker-Message: 19:57:42.757:   Max bytes (per file)  .................  1048576
Setting scheduler policy to SCHED_IDLE
Setting priority nice level to 19
Loading extractor rules... (/usr/share/tracker-miners/extract-rules)
  Loaded rule '10-abw.rule'
  Loaded rule '10-bmp.rule'
  Loaded rule '10-comics.rule'
  Loaded rule '10-dvi.rule'
  Loaded rule '10-ebooks.rule'
  Loaded rule '10-epub.rule'
  Loaded rule '10-flac.rule'
  Loaded rule '10-gif.rule'
  Loaded rule '10-html.rule'
  Loaded rule '10-ico.rule'
  Loaded rule '10-jpeg.rule'
  Loaded rule '10-msoffice.rule'
  Loaded rule '10-oasis.rule'
  Loaded rule '10-pdf.rule'
  Loaded rule '10-png.rule'
  Loaded rule '10-ps.rule'
  Loaded rule '10-raw.rule'
  Loaded rule '10-svg.rule'
  Loaded rule '10-tiff.rule'
  Loaded rule '10-vorbis.rule'
  Loaded rule '10-xmp.rule'
  Loaded rule '10-xps.rule'
  Loaded rule '11-iso.rule'
  Loaded rule '11-msoffice-xml.rule'
  Loaded rule '15-gstreamer-guess.rule'
  Loaded rule '15-playlist.rule'
  Loaded rule '15-source-code.rule'
  Loaded rule '90-gstreamer-audio-generic.rule'
  Loaded rule '90-gstreamer-image-generic.rule'
  Loaded rule '90-gstreamer-video-generic.rule'
  Loaded rule '90-text-generic.rule'
Extractor rules loaded
MIME type guessed as 'application/pdf' (from GIO)
Using /usr/lib64/tracker-miners-2.0/extract-modules/libextract-pdf.so...
Extracted 3128 bytes from page 0, 1045448 bytes remaining
Extracted 3380 bytes from page 1, 1042068 bytes remaining
Extracted 3509 bytes from page 2, 1038559 bytes remaining
Extracted 3464 bytes from page 3, 1035095 bytes remaining
Extracted 3657 bytes from page 4, 1031438 bytes remaining
Extracted 3371 bytes from page 5, 1028067 bytes remaining
Extracted 2064 bytes from page 6, 1026003 bytes remaining
Extracted 3162 bytes from page 7, 1022841 bytes remaining
Extracted 3657 bytes from page 8, 1019184 bytes remaining
Extracted 1939 bytes from page 9, 1017245 bytes remaining
Extracted 2734 bytes from page 10, 1014511 bytes remaining
Extracted 2100 bytes from page 11, 1012411 bytes remaining
Extracted 3508 bytes from page 12, 1008903 bytes remaining
Extracted 3872 bytes from page 13, 1005031 bytes remaining
Extracted 3668 bytes from page 14, 1001363 bytes remaining
Extracted 3893 bytes from page 15, 997470 bytes remaining
Extracted 3834 bytes from page 16, 993636 bytes remaining
Extracted 2962 bytes from page 17, 990674 bytes remaining
Extracted 1918 bytes from page 18, 988756 bytes remaining
Extracted 2909 bytes from page 19, 985847 bytes remaining
Extracted 2138 bytes from page 20, 983709 bytes remaining
Extracted 1745 bytes from page 21, 981964 bytes remaining
Extracted 4141 bytes from page 22, 977823 bytes remaining
Extracted 2601 bytes from page 23, 975222 bytes remaining
Extracted 5839 bytes from page 24, 969383 bytes remaining
Extracted 4079 bytes from page 25, 965304 bytes remaining
Content extraction finished: 26/26 pages indexed in 0,21 seconds, 83272 bytes extracted

Comment 1 NickPGSmith@gmail.com 2018-07-20 17:37:05 UTC
To give an idea of the volume: on my system I have almost 22000 such entries in the last 18 hours.
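
For anyone who wants to measure this on their own system, here is a rough Python sketch that counts these messages per file in saved log text. It assumes the message format quoted in the description; the function name count_failures and the sample lines are only for illustration and are not part of tracker.

```python
import re
from collections import Counter

# Matches the message format quoted in the description of this bug.
PATTERN = re.compile(
    r'Could not insert metadata for item "([^"]+)": .*invalid UTF-8 character'
)

def count_failures(lines):
    """Count 'invalid UTF-8 character' messages per item URI."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample input: the log line from the description, repeated, plus an unrelated line.
sample = [
    'lip 17 19:58:54 essyllt tracker-extract[10972]: Could not insert metadata '
    'for item "file:///home/[...].pdf": 57.34: invalid UTF-8 character',
    'lip 17 19:58:54 essyllt tracker-extract[10972]: If the error above is recurrent '
    'for the same item/ID, consider running "tracker-extract" in the terminal',
    'lip 17 19:58:55 essyllt tracker-extract[10972]: Could not insert metadata '
    'for item "file:///home/[...].pdf": 57.34: invalid UTF-8 character',
]

for uri, n in count_failures(sample).most_common():
    print(n, uri)
```

Feeding it the lines of a saved journal dump instead of the sample list should show which files account for most of the noise.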

Comment 2 Jonathan Briggs 2018-07-26 21:14:07 UTC
I am getting this invalid UTF-8 error as well. Tracker repeatedly tries to insert this invalid data. It should just replace the invalid bytes with the Unicode Replacement Character (U+FFFD); that's what it's for.

In my case it is a PNG image file with some kind of binary junk in the comment field.
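
As a minimal illustration of the replacement idea, here is a Python sketch (not tracker's code, which is written in C). The function name sanitize_utf8 and the sample bytes are made up for the example. Python's built-in decoder with errors="replace" substitutes U+FFFD for each undecodable sequence.

```python
def sanitize_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")

# Hypothetical comment field: readable text with two junk bytes in the middle.
comment = b"Created with \xff\xfe tool"
clean = sanitize_utf8(comment)

# The result is valid Unicode text, so it can be re-encoded as UTF-8 safely.
print(clean)
print(clean.encode("utf-8"))
```

In GLib-based C code, g_utf8_make_valid() does a similar job. Whether that is the right fix inside tracker's extractors is a question for upstream.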

Comment 3 Ben Cotton 2019-05-02 21:11:44 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 reaches end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 4 Ben Cotton 2019-05-28 22:03:41 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 5 Jim Scarborough 2019-10-01 13:16:09 UTC
This persists in Fedora 29.

Comment 6 Ben Cotton 2019-10-31 18:47:42 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 reaches end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 7 Ben Cotton 2019-11-27 23:30:21 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

