Bug 1494618

Summary: systemd-journal coredump
Product: [Fedora] Fedora
Reporter: Gleidson Baleeiro <gleidson>
Component: systemd
Assignee: systemd-maint
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 27
CC: harald, jsynacek, kay, lnykryn, lpoetter, mschmidt, msekleta, ssahani, s, systemd-maint, zbyszek
Hardware: x86_64   
OS: Linux   
Last Closed: 2017-09-27 13:25:47 UTC
Type: Bug
Attachments (Description / Flags):
coredump created when the error happened / none
journalctl when the error happened / none

Description Gleidson Baleeiro 2017-09-22 16:22:30 UTC
Created attachment 1329665
coredump created when the error happened

After updating my system to the latest Fedora packages, I get the error messages below in dmesg:

[  885.379271] kworker/dying (5) used greatest stack depth: 10480 bytes left
[ 2987.521076] perf: interrupt took too long (2551 > 2500), lowering kernel.perf_event_max_sample_rate to 78000
[ 3294.769338] systemd-coredump[9295]: MESSAGE=Process 1037 (systemd-journal) of user 0 dumped core.
[ 3294.769529] systemd-coredump[9295]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.9de426f6a0014eab83f55cee14873efa.1037.1506049627000000.lz4
[ 3294.769546] systemd-coredump[9295]: Stack trace of thread 1037:
[ 3294.769561] systemd-coredump[9295]: #0  0x00007ff4d9be2b6d pthread_join (libpthread.so.0)
[ 3294.769575] systemd-coredump[9295]: #1  0x00007ff4d94e6b8e journal_file_set_offline_thread_join (libsystemd-shared-234.so)
[ 3294.769589] systemd-coredump[9295]: #2  0x00007ff4d94e6c5d journal_file_set_online.lto_priv.110 (libsystemd-shared-234.so)
[ 3294.769602] systemd-coredump[9295]: #3  0x00007ff4d94e7836 journal_file_append_object (libsystemd-shared-234.so)
[ 3294.769646] systemd-coredump[9295]: #4  0x00007ff4d94eb48b journal_file_append_data.lto_priv.107 (libsystemd-shared-234.so)
[ 3294.769659] systemd-coredump[9295]: #5  0x00007ff4d94704c1 journal_file_append_entry (libsystemd-shared-234.so)
[ 3294.769690] systemd-coredump[9295]: #6  0x0000559a142f060a dispatch_message_real (systemd-journald)
[ 3294.814184] systemd-coredum: 21 output lines suppressed due to ratelimiting
[ 3295.172814] systemd-journald[9329]: File /var/log/journal/20d906ca62a840e5b786f672ce9c22d3/system.journal corrupted or uncleanly shut down, renaming and replacing.


Packages:

kernel-4.13.3-300.fc27.x86_64
systemd-container-234-5.fc27.x86_64
systemd-devel-234-5.fc27.x86_64
systemd-debuginfo-233-6.fc26.x86_64
systemd-udev-234-5.fc27.x86_64
systemd-pam-234-5.fc27.x86_64
systemd-bootchart-232-1.fc27.x86_64
systemd-libs-234-5.fc27.x86_64
systemd-234-5.fc27.x86_64

Comment 1 Michal Schmidt 2017-09-25 13:14:20 UTC
The main thread was waiting on a thread doing fsync() to finish.
Probably it dumped core due to being sent SIGABRT by the systemd watchdog timer.
Please check if the event caused anything to be logged in the journal (with journalctl), not just dmesg.
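
To narrow down which part of the journal to look at, the time of the crash can be read from the coredump file name quoted in the description. Below is a minimal Python sketch, assuming the name is laid out as core.<comm>.<uid>.<boot id>.<pid>.<timestamp in microseconds>.<compression>. That layout is inferred from the path in this report (user 0 and PID 1037 match the MESSAGE= line), not taken from documentation, and parse_coredump_name is a hypothetical helper.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def parse_coredump_name(path):
    """Split a systemd-coredump file name into its fields.

    Assumed layout (inferred from this report, not from documentation):
    core.<comm>.<uid>.<boot id>.<pid>.<timestamp in microseconds>.<ext>
    """
    name = PurePosixPath(path).name
    # Split from the right so that a dot inside <comm> cannot shift the fields.
    prefix_and_comm, uid, boot_id, pid, stamp, ext = name.rsplit(".", 5)
    prefix, _, comm = prefix_and_comm.partition(".")
    if prefix != "core":
        raise ValueError(f"unexpected file name: {name}")
    return {
        "comm": comm,
        "uid": int(uid),
        "boot_id": boot_id,
        "pid": int(pid),
        # Truncated to whole seconds; the value is treated as UTC epoch time.
        "time_utc": datetime.fromtimestamp(int(stamp) // 1_000_000, tz=timezone.utc),
        "compression": ext,
    }


info = parse_coredump_name(
    "/var/lib/systemd/coredump/core.systemd-journal.0."
    "9de426f6a0014eab83f55cee14873efa.1037.1506049627000000.lz4"
)
print(info["time_utc"].isoformat())  # 2017-09-22T03:07:07+00:00
```

The resulting time can then be used to bound the search, for example with the --since and --until options of journalctl.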

Comment 2 Gleidson Baleeiro 2017-09-25 22:03:08 UTC
Created attachment 1330741
journalctl when the error happened

This error occurred only on that day.

Comment 3 Michal Schmidt 2017-09-27 13:25:47 UTC
It was the watchdog:

Sep 22 00:06:17 inspiron7000 systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
Sep 22 00:06:17 inspiron7000 systemd[1]: systemd-journald.service: Killing process 1037 (systemd-journal) with signal SIGABRT.

Also, bumblebeed.service was periodically restarting around that time. I don't know if it's related in any way.

It would be interesting if you found a reproducible way to trigger this, but since you say it occurred only once, I'm going to close this BZ as yet another case of systemd-journald killed by the watchdog.

*** This bug has been marked as a duplicate of bug 1300212 ***
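
For anyone triaging a similar report, the two journal lines quoted in comment 3 can be picked out of saved journalctl output mechanically. Below is a minimal Python sketch; the regular expressions are modelled only on the wording of those two lines, so other systemd versions may phrase the messages differently, and find_watchdog_kills is a hypothetical helper.

```python
import re

# Patterns modelled on the two journal lines quoted in this report (assumption:
# the wording is the same in the output being scanned).
WATCHDOG_RE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+): Watchdog timeout \(limit (?P<limit>[^)]+)\)!"
)
KILL_RE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+): Killing process (?P<pid>\d+) "
    r"\((?P<comm>[^)]+)\) with signal (?P<signal>SIG\w+)\."
)


def find_watchdog_kills(lines):
    """Return kills that follow a watchdog timeout logged for the same unit."""
    limits = {}   # unit name -> limit from its most recent watchdog timeout
    events = []
    for line in lines:
        timeout = WATCHDOG_RE.search(line)
        if timeout:
            limits[timeout["unit"]] = timeout["limit"]
            continue
        kill = KILL_RE.search(line)
        if kill and kill["unit"] in limits:
            events.append({
                "unit": kill["unit"],
                "limit": limits[kill["unit"]],
                "pid": int(kill["pid"]),
                "comm": kill["comm"],
                "signal": kill["signal"],
            })
    return events


sample = [
    "Sep 22 00:06:17 inspiron7000 systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!",
    "Sep 22 00:06:17 inspiron7000 systemd[1]: systemd-journald.service: Killing process 1037 (systemd-journal) with signal SIGABRT.",
]
events = find_watchdog_kills(sample)
print(events)
```

A kill line is reported only if a watchdog timeout for the same unit was seen earlier in the input, which keeps ordinary service stops out of the result.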