Bug 1368714 - /usr/lib/systemd/systemd-journald crashed while writing firefox core file
Summary: /usr/lib/systemd/systemd-journald crashed while writing firefox core file
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-20 17:23 UTC by Tomasz Kłoczko
Modified: 2016-08-26 09:36 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-08-26 09:30:37 UTC
Type: Bug
Embargoed:


Attachments
core.systemd-journal.0.1e8af79~13ac3.592.1471625154000000.lz4 (278.59 KB, application/octet-stream)
2016-08-20 17:25 UTC, Tomasz Kłoczko

Description Tomasz Kłoczko 2016-08-20 17:23:31 UTC
On my system firefox crashed. I started investigating and found that the core file was corrupted. After reviewing what really happened, I found that /usr/lib/systemd/systemd-journald crashed while processing the core file.

[root@domek coredump]# gdb -c core.systemd-journal.0.1e8af79af2854019bba8d91966113ac3.592.1471625154000000
GNU gdb (GDB) Fedora 7.11.90.20160807-5.fc26
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
[New LWP 592]

warning: core file may not match specified executable file.
Reading symbols from /usr/lib/systemd/systemd-journald...Reading symbols from /usr/lib/debug/usr/lib/systemd/systemd-journald.debug...done.
done.

warning: Ignoring non-absolute filename: <linux-vdso.so.1>
Missing separate debuginfo for linux-vdso.so.1
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/c0/412ca6c433a2c7c419ba2ab85f056e6b4d68ec
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib/systemd/systemd-journald'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f6433cd2a15 in link_entry_into_array (f=f@entry=0x5567f9e28c20, first=0x7f64263cce58, idx=idx@entry=0x7fffb36286e0, p=p@entry=25997608)
    at src/journal/journal-file.c:1463
1463	                        o->entry_array.items[i] = htole64(p);
(gdb) bt
#0  0x00007f6433cd2a15 in link_entry_into_array (f=f@entry=0x5567f9e28c20, first=0x7f64263cce58, idx=idx@entry=0x7fffb36286e0, p=p@entry=25997608)
    at src/journal/journal-file.c:1463
#1  0x00007f6433cd2fce in link_entry_into_array_plus_one (p=25997608, idx=0x7f64263cce60, first=0x7f64263cce58, extra=0x7f64263cce50, f=0x5567f9e28c20)
    at src/journal/journal-file.c:1533
#2  journal_file_link_entry_item (i=5, offset=25997608, o=<optimized out>, f=0x5567f9e28c20) at src/journal/journal-file.c:1557
#3  journal_file_link_entry (offset=25997608, o=0x7f642defa128, f=0x5567f9e28c20) at src/journal/journal-file.c:1599
#4  journal_file_append_entry_internal.lto_priv.51 (f=f@entry=0x5567f9e28c20, ts=ts@entry=0x7fffb3628850, xor_hash=xor_hash@entry=13520177398950177281, 
    items=items@entry=0x7fffb3628740, n_items=n_items@entry=10, seqnum=seqnum@entry=0x7fffb362b8e0, ret=0x0, offset=0x0) at src/journal/journal-file.c:1643
#5  0x00007f6433cd38d5 in journal_file_append_entry (f=0x5567f9e28c20, ts=0x7fffb3628850, iovec=<optimized out>, n_iovec=10, seqnum=0x7fffb362b8e0, ret=0x0, offset=0x0)
    at src/journal/journal-file.c:1799
#6  0x00005567f95fa2ca in write_to_journal (priority=39, n=10, iovec=0x7fffb3628d30, uid=0, s=0x7fffb362b830) at src/journal/journald-server.c:551
#7  dispatch_message_real.lto_priv.14 (s=0x7fffb362b830, iovec=0x7fffb3628d30, n=10, m=125, ucred=<optimized out>, tv=0x0, label=0x0, label_len=0, unit_id=0x0, 
    priority=39, object_pid=0) at src/journal/journald-server.c:858
#8  0x00005567f9600d28 in server_dispatch_message (s=<optimized out>, iovec=0x7fffb3628d30, n=<optimized out>, m=<optimized out>, ucred=0x0, tv=<optimized out>, 
    label=<optimized out>, label_len=<optimized out>, unit_id=<optimized out>, priority=<optimized out>, object_pid=<optimized out>)
    at src/journal/journald-server.c:979
#9  0x00005567f96024e5 in dev_kmsg_record (l=<optimized out>, p=<optimized out>, s=<optimized out>) at src/journal/journald-kmsg.c:312
#10 server_read_dev_kmsg.lto_priv.17 (s=<optimized out>, s=<optimized out>) at src/journal/journald-kmsg.c:353
#11 0x00007f6433c74883 in source_dispatch (s=s@entry=0x5567f9dfc710) at src/libsystemd/sd-event/sd-event.c:2267
#12 0x00007f6433c74a64 in sd_event_dispatch (e=e@entry=0x5567f9dfc170) at src/libsystemd/sd-event/sd-event.c:2626
#13 0x00007f6433c75f97 in sd_event_run (e=0x5567f9dfc170, timeout=18446744073709551615) at src/libsystemd/sd-event/sd-event.c:2685
#14 0x00005567f95f660a in main (argc=<optimized out>, argv=<optimized out>) at src/journal/journald.c:101
(gdb) 

[root@domek coredump]# rpm -qf /usr/lib/systemd/systemd-journald
systemd-231-3.fc26.x86_64

Comment 1 Tomasz Kłoczko 2016-08-20 17:25:03 UTC
Created attachment 1192482
core.systemd-journal.0.1e8af79~13ac3.592.1471625154000000.lz4

Comment 2 Zbigniew Jędrzejewski-Szmek 2016-08-26 09:04:07 UTC
It probably didn't crash per se, but was terminated by systemd because it missed the watchdog ping. But it seems that it was just writing a short message, so it shouldn't do that. Was the machine under heavy load? Do you have an SSD or a rotational drive? Can you provide the logs from around the crash?
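
For readers unfamiliar with the watchdog ping mentioned above: a service with a watchdog configured is expected to send a `WATCHDOG=1` datagram to the socket named in `$NOTIFY_SOCKET` at regular intervals, and systemd terminates it (here with SIGABRT) if the pings stop. The following is a minimal Python sketch of that general sd_notify protocol, not journald's actual C code. The helper names `notify` and `watchdog_interval_seconds` are hypothetical, and pinging at half of `$WATCHDOG_USEC` is a common convention assumed here.

```python
import os
import socket


def notify(state, env=os.environ):
    """Send one sd_notify-style datagram, e.g. "WATCHDOG=1".

    Returns False when no notification socket is configured.
    (Hypothetical helper for illustration only.)
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True


def watchdog_interval_seconds(env=os.environ):
    """Return half of $WATCHDOG_USEC in seconds, or None if unset.

    Pinging at half the timeout is an assumed safety margin.
    """
    usec = env.get("WATCHDOG_USEC")
    return int(usec) / 1_000_000 / 2 if usec else None
```

A process that is blocked for longer than the configured timeout, for example while waiting on slow disk I/O under heavy load, never reaches its next `notify("WATCHDOG=1")` call. That is the failure mode suspected in this comment.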

Comment 3 Tomasz Kłoczko 2016-08-26 09:12:20 UTC
I had another firefox crash in the meantime.
Strangely, after uncompressing core*.lz4 I found only a 2GB file, and of course that means it was corrupted.
Is there any limit on the maximum core file size? Maybe it is limited by the maximum file size that lz4 is able to compress?

$ sudo ulimit -c
unlimited

Nevertheless, as you can see, in the case of the crash which I reported, systemd-journald generated its own core file.

Comment 4 Zbigniew Jędrzejewski-Szmek 2016-08-26 09:22:04 UTC
There's a default limit of 2GB for core dumps. The file is not corrupted, just truncated. GDB should be able to extract useful information from this anyway.
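
As a sketch only, and assuming the 2GB limit referred to here is the one controlled by the `ProcessSizeMax=` and `ExternalSizeMax=` options of systemd-coredump, the limit could be raised locally in `/etc/systemd/coredump.conf`. The 8G values below are illustrative, not a recommendation.

```ini
# /etc/systemd/coredump.conf (illustrative values, assumed for this example)
[Coredump]
# Largest core that systemd-coredump will process.
ProcessSizeMax=8G
# Largest core that will be saved to disk.
ExternalSizeMax=8G
```

Larger limits mean that a crash of a big process such as firefox produces more I/O and uses more disk space while the dump is written.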

Comment 5 Zbigniew Jędrzejewski-Szmek 2016-08-26 09:30:37 UTC
I don't think this abort is directly related to the firefox crash. I think the system was overloaded because of memory and cpu contention as an effect of the firefox crash and processing of the core dump or something.

FTR, journald was trying to write the following:
(gdb) p (char*)iovec[0].iov_base
$13 = 0x5567f9e060f0 "_SOURCE_MONOTONIC_TIMESTAMP=154852337252"
(gdb) p (char*)iovec[1].iov_base
$14 = 0x5567f960780e "_TRANSPORT=kernel"
(gdb) p (char*)iovec[2].iov_base
$15 = 0x5567f9e28bb0 "PRIORITY=7"
(gdb) p (char*)iovec[3].iov_base
$16 = 0x5567f9e292b0 "SYSLOG_FACILITY=4"
(gdb) p (char*)iovec[4].iov_base
$17 = 0x5567f9e403c0 "SYSLOG_IDENTIFIER=systemd-logind"
(gdb) p (char*)iovec[5].iov_base
$18 = 0x5567f9e1be80 "SYSLOG_PID=912"
(gdb) p (char*)iovec[6].iov_base
$19 = 0x5567f9e0a340 "MESSAGE=Got message type=signal sender=:1.0 destination=n/a object=/org/freedesktop/systemd1/unit/httpd_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=35388 reply_"...
(gdb) p (char*)iovec[7].iov_base
$20 = 0x7fffb362ba56 "_BOOT_ID=1e8af79af2854019bba8d91966113ac3"
(gdb) p (char*)iovec[8].iov_base
$21 = 0x7fffb362ba29 "_MACHINE_ID=02dfc25c60b54e85bc6096668e58e42c"
(gdb) p (char*)iovec[9].iov_base
$22 = 0x5567f9dfd850 "_HOSTNAME=domek"
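
Each iovec element printed above holds one `FIELD=value` pair of the journal entry being appended. The following is a small Python sketch of how such a list maps to a single entry; the helper name `parse_fields` is hypothetical, and the truncated MESSAGE field is deliberately left out rather than guessed.

```python
def parse_fields(items):
    """Turn a list of "FIELD=value" strings into a dict (illustrative helper)."""
    entry = {}
    for item in items:
        # Split on the first "=" only; values may themselves contain "=".
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"not a FIELD=value pair: {item!r}")
        entry[name] = value
    return entry


# Fields copied from the gdb session above (MESSAGE omitted, it is truncated there).
fields = [
    "_SOURCE_MONOTONIC_TIMESTAMP=154852337252",
    "_TRANSPORT=kernel",
    "PRIORITY=7",
    "SYSLOG_FACILITY=4",
    "SYSLOG_IDENTIFIER=systemd-logind",
    "SYSLOG_PID=912",
    "_BOOT_ID=1e8af79af2854019bba8d91966113ac3",
    "_MACHINE_ID=02dfc25c60b54e85bc6096668e58e42c",
    "_HOSTNAME=domek",
]
entry = parse_fields(fields)
print(entry["SYSLOG_IDENTIFIER"], entry["PRIORITY"])
```

Read this way, the entry is a debug-priority (7) message from systemd-logind that arrived through the kernel log transport, which matches the `dev_kmsg_record` frame in the backtrace.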

*** This bug has been marked as a duplicate of bug 1300212 ***

Comment 6 Tomasz Kłoczko 2016-08-26 09:32:58 UTC
gdb reported that the core file was corrupted, and it was not able to show the registers or the stack backtrace.

I think that the default 2GB limit should be increased.

Comment 7 Zbigniew Jędrzejewski-Szmek 2016-08-26 09:36:50 UTC
Yeah, IIRC, we already increased it upstream.
