Bug 2043495

Summary: abrt-xorg.service fails to start with error Cannot read journal data
Product: Red Hat Enterprise Linux 8 Reporter: Pavol Babinčák <pbabinca>
Component: abrtAssignee: abrt <abrt-devel-list>
Status: CLOSED WONTFIX QA Contact: CS System Management SST QE <rhel-cs-system-management-subsystem-qe>
Severity: low Docs Contact:
Priority: unspecified    
Version: 8.5CC: mfabik, msrb, ovasik
Target Milestone: rcKeywords: Triaged
Target Release: ---Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-06-30 08:17:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Pavol Babinčák 2022-01-21 11:10:26 UTC
Description of problem:
After some time abrt-dump-journal-xorg fails to start due to error Cannot read journal data.

Version-Release number of selected component (if applicable):
2.10.9-21.el8

How reproducible:
Always, possibly after corruption of journal.

Steps to Reproduce:
1. (re)start a service: sudo systemctl restart abrt-xorg.service
2. check if the service is running: systemctl status abrt-xorg.service

Actual results:
Service is "failed". In my case:

$ systemctl status abrt-xorg.service
● abrt-xorg.service - ABRT Xorg log watcher
   Loaded: loaded (/usr/lib/systemd/system/abrt-xorg.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2022-01-20 10:07:16 CET; 1 day 1h ago
 Main PID: 1719 (code=exited, status=1/FAILURE)

Jan 20 10:07:15 yumba systemd[1]: Started ABRT Xorg log watcher.
Jan 20 10:07:16 yumba abrt-dump-journal-xorg[1719]: abrt-dump-journal-xorg: Cannot read journal data.
Jan 20 10:07:16 yumba systemd[1]: abrt-xorg.service: Main process exited, code=exited, status=1/FAILURE
Jan 20 10:07:16 yumba systemd[1]: abrt-xorg.service: Failed with result 'exit-code'.


Expected results:
Service is "running", ignoring bad journal entries.

Additional info:

Getting ExecStart from /usr/lib/systemd/system/abrt-xorg.service with additional verbosity:

$ sudo /usr/bin/abrt-dump-journal-xorg -fxtD -vvv                                  
abrt-dump-journal-xorg: Loaded option 'MaxCrashReportsSize' = '5000'
abrt-dump-journal-xorg: Loaded option 'DeleteUploaded' = 'no'
abrt-dump-journal-xorg: Loaded option 'AutoreportingEvent' = 'report_uReport'
abrt-dump-journal-xorg: Loaded option 'AutoreportingEnabled' = 'no'
abrt-dump-journal-xorg: Loaded option 'MaxCrashReportsSize' = '5000'
abrt-dump-journal-xorg: Loaded option 'DeleteUploaded' = 'no'
abrt-dump-journal-xorg: Loaded option 'AutoreportingEvent' = 'report_uReport'
abrt-dump-journal-xorg: Loaded option 'AutoreportingEnabled' = 'no'
abrt-dump-journal-xorg: Loading settings from 'xorg.conf'
abrt-dump-journal-xorg: Loaded option 'BlacklistedXorgModules' = 'nvidia, fglrx, vboxvideo'
abrt-dump-journal-xorg: Loaded option 'JournalFilters' = '_COMM=gdm-x-session, _COMM=gnome-shell'
abrt-dump-journal-xorg: Loaded option 'BlacklistedXorgModules' = 'nvidia, fglrx, vboxvideo'
abrt-dump-journal-xorg: Loaded option 'JournalFilters' = '_COMM=gdm-x-session, _COMM=gnome-shell'
abrt-dump-journal-xorg: Loaded 'xorg.conf'
abrt-dump-journal-xorg: Using journal filter from conf file xorg.conf
abrt-dump-journal-xorg: Using journal match: '_COMM=gdm-x-session'
abrt-dump-journal-xorg: Using journal match: '_COMM=gnome-shell'
abrt-dump-journal-xorg: Failed to read 'MESSAGE' field: Bad message
abrt-dump-journal-xorg: Cannot read journal data.


Running it through ltrace I see last entries as:

strncpy(0x7ffe0746f2e0, "[308042:308075:1119/053235.521890:ERROR:connection_factory_impl.cc(429)] Failed to connect to MCS endpoint with error -106", 122) = 0x7ffe0746f2e0
strstr("[308042:308075:1119/053235.521890:ERROR:connection_factory_impl.cc(429)] Failed to connect to MCS endpoint with error -106", "Backtrace:") = nil
sd_journal_next(0x55b8f082eb50, 0x55b8eefe0ddc, 0x18000000000, 0x7ffe0746f35a)                                                    = 1
sd_journal_get_data(0x55b8f082eb50, 0x55b8eefe165e, 0x7ffe0746f2b0, 0x7ffe0746f2a8)                                               = 0xffffffb6
strerror(74)                                                                                                                      = "Bad message"


With command:

$ sudo journalctl -u abrt-xorg.service

I found that error appeared for the first time on November 19:

...

-- Reboot --
Nov 09 12:19:36 yumba systemd[1]: Started ABRT Xorg log watcher.
-- Reboot --
Nov 19 10:48:13 yumba systemd[1]: Started ABRT Xorg log watcher.
Nov 19 10:48:14 yumba abrt-dump-journal-xorg[1645]: abrt-dump-journal-xorg: Cannot read journal data.
Nov 19 10:48:14 yumba systemd[1]: abrt-xorg.service: Main process exited, code=exited, status=1/FAILURE
Nov 19 10:48:14 yumba systemd[1]: abrt-xorg.service: Failed with result 'exit-code'.

....

And I was able to narrow down actual journal file that is corrupted:


$ journalctl --verify --file /var/log/journal/be37cfed3f434d0cb8d83feafdfaa26b/user-18031~     
1314e88: Invalid entry item (20/25 offset: 000000                                                                                                                  
1314e88: Invalid object contents: Bad message                                                                                                                      
File corruption detected at /var/log/journal/be37cfed3f434d0cb8d83feafdfaa26b/user-18031~:1314e88 (of 25165824 bytes, 79%).
FAIL: /var/log/journal/be37cfed3f434d0cb8d83feafdfaa26b/user-18031~ (Bad message)

---

From what I understand in https://bugs.freedesktop.org/show_bug.cgi?id=64116 journal files which have been identified as corrupted are rotated and there is no plan to get those fixed nor there is a recommendation to remove them ("? Why would you want to throw-away the good parts in the journal files?").