Description of problem:
tracker-extract produces lots of messages like:

lip 17 19:58:54 essyllt tracker-extract[10972]: Could not insert metadata for item "file:///home/[...].pdf": 57.34: invalid UTF-8 character
lip 17 19:58:54 essyllt tracker-extract[10972]: If the error above is recurrent for the same item/ID, consider running "tracker-extract" in the terminal with the TRACKER_VERBOSITY=3 environment variable, and filing a bug with the additional information

Version-Release number of selected component (if applicable):
tracker.x86_64 2.0.4-1.fc28 @updates
tracker-miners.x86_64 2.0.5-1.fc28 @updates

How reproducible:
Happens all the time with a specific file.

Steps to Reproduce:
Index the specific file with tracker-extract; the error occurs each time.

Actual results:
Lots of unwanted logs.

Expected results:
No logs of this type (or proper handling of this file).

Additional info:
$ tracker extract --verbosity=debug "file:///home/[...].pdf"
Tracker-Message: 19:57:42.757: Starting tracker-extract 2.0.5
Tracker-Message: 19:57:42.757: General options:
Tracker-Message: 19:57:42.757: Verbosity ............................ 3
Tracker-Message: 19:57:42.757: Sched Idle ........................... 1
Tracker-Message: 19:57:42.757: Max bytes (per file) ................. 1048576
Setting scheduler policy to SCHED_IDLE
Setting priority nice level to 19
Loading extractor rules... (/usr/share/tracker-miners/extract-rules)
Loaded rule '10-abw.rule'
Loaded rule '10-bmp.rule'
Loaded rule '10-comics.rule'
Loaded rule '10-dvi.rule'
Loaded rule '10-ebooks.rule'
Loaded rule '10-epub.rule'
Loaded rule '10-flac.rule'
Loaded rule '10-gif.rule'
Loaded rule '10-html.rule'
Loaded rule '10-ico.rule'
Loaded rule '10-jpeg.rule'
Loaded rule '10-msoffice.rule'
Loaded rule '10-oasis.rule'
Loaded rule '10-pdf.rule'
Loaded rule '10-png.rule'
Loaded rule '10-ps.rule'
Loaded rule '10-raw.rule'
Loaded rule '10-svg.rule'
Loaded rule '10-tiff.rule'
Loaded rule '10-vorbis.rule'
Loaded rule '10-xmp.rule'
Loaded rule '10-xps.rule'
Loaded rule '11-iso.rule'
Loaded rule '11-msoffice-xml.rule'
Loaded rule '15-gstreamer-guess.rule'
Loaded rule '15-playlist.rule'
Loaded rule '15-source-code.rule'
Loaded rule '90-gstreamer-audio-generic.rule'
Loaded rule '90-gstreamer-image-generic.rule'
Loaded rule '90-gstreamer-video-generic.rule'
Loaded rule '90-text-generic.rule'
Extractor rules loaded
MIME type guessed as 'application/pdf' (from GIO)
Using /usr/lib64/tracker-miners-2.0/extract-modules/libextract-pdf.so...
Extracted 3128 bytes from page 0, 1045448 bytes remaining
Extracted 3380 bytes from page 1, 1042068 bytes remaining
Extracted 3509 bytes from page 2, 1038559 bytes remaining
Extracted 3464 bytes from page 3, 1035095 bytes remaining
Extracted 3657 bytes from page 4, 1031438 bytes remaining
Extracted 3371 bytes from page 5, 1028067 bytes remaining
Extracted 2064 bytes from page 6, 1026003 bytes remaining
Extracted 3162 bytes from page 7, 1022841 bytes remaining
Extracted 3657 bytes from page 8, 1019184 bytes remaining
Extracted 1939 bytes from page 9, 1017245 bytes remaining
Extracted 2734 bytes from page 10, 1014511 bytes remaining
Extracted 2100 bytes from page 11, 1012411 bytes remaining
Extracted 3508 bytes from page 12, 1008903 bytes remaining
Extracted 3872 bytes from page 13, 1005031 bytes remaining
Extracted 3668 bytes from page 14, 1001363 bytes remaining
Extracted 3893 bytes from page 15, 997470 bytes remaining
Extracted 3834 bytes from page 16, 993636 bytes remaining
Extracted 2962 bytes from page 17, 990674 bytes remaining
Extracted 1918 bytes from page 18, 988756 bytes remaining
Extracted 2909 bytes from page 19, 985847 bytes remaining
Extracted 2138 bytes from page 20, 983709 bytes remaining
Extracted 1745 bytes from page 21, 981964 bytes remaining
Extracted 4141 bytes from page 22, 977823 bytes remaining
Extracted 2601 bytes from page 23, 975222 bytes remaining
Extracted 5839 bytes from page 24, 969383 bytes remaining
Extracted 4079 bytes from page 25, 965304 bytes remaining
Content extraction finished: 26/26 pages indexed in 0,21 seconds, 83272 bytes extracted
To give an idea of the volume: on my system there are almost 22,000 such log entries from the last 18 hours.
I am getting this invalid UTF-8 error as well. Tracker repeatedly tries to insert this invalid data. Just replace it with the Unicode Replacement Character, that's what it's for. In my case it is a PNG image file with some kind of binary junk in the comment field.
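As a rough illustration of the replacement approach suggested in this comment, here is a minimal Python sketch. It is not Tracker's code (Tracker is written in C), and the function name `sanitize_metadata` and the sample comment bytes are hypothetical; it only shows how invalid byte sequences in a metadata field could be swapped for U+FFFD before the value is stored.

```python
# Hypothetical helper, not part of Tracker: decode a raw metadata field
# and substitute U+FFFD (the Unicode Replacement Character) for any byte
# sequence that is not valid UTF-8.
REPLACEMENT = "\ufffd"


def sanitize_metadata(raw: bytes) -> str:
    # errors="replace" makes the decoder emit U+FFFD instead of raising
    # UnicodeDecodeError on malformed input.
    return raw.decode("utf-8", errors="replace")


# Made-up example: a comment field containing two bytes of binary junk.
comment = b"Scan \xff\xfe of page 3"
clean = sanitize_metadata(comment)

# The result is always valid UTF-8, so re-encoding it cannot fail.
clean.encode("utf-8")
print(clean.count(REPLACEMENT), "byte sequence(s) replaced")
```

With a step like this in front of the insert, a file with junk in one field would still be indexed once instead of being retried and logged repeatedly.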
This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
This persists in Fedora 29.
This message is a reminder that Fedora 29 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '29'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 29 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.