Bug 1167044

Summary: [abrt] systemd: crash.lto_priv.223(): systemd killed by SIGSEGV
Product: [Fedora] Fedora
Reporter: Karel Volný <kvolny>
Component: systemd
Assignee: Zbigniew Jędrzejewski-Szmek <zbyszek>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 21
CC: bitlord0xff, chkr, cks-rhbugzilla, ismail, johannbg, jsynacek, lnykryn, lslebodn, msekleta, notting, ronald.wahl, s, systemd-maint, vpavlin, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
URL: https://retrace.fedoraproject.org/faf/reports/bthash/d6f6757645adbb563896b3f9819cedc4c4219b1b
Whiteboard: abrt_hash:29e91a4758fc234d9c817ebdd9a747f33d3eab46
Doc Type: Bug Fix
Last Closed: 2015-12-02 05:11:22 UTC
Attachments:
Description | Flags
File: backtrace | none
File: cgroup | none
File: core_backtrace | none
File: dso_list | none
File: limits | none
File: maps | none
File: open_fds | none
File: proc_pid_status | none
File: var_log_messages | none
gdb dump of the bad 'Unit' structure contents | none

Description Karel Volný 2014-11-22 22:25:09 UTC
Description of problem:
I was trying to upgrade F20 to F21 using fedora-upgrade. It failed after the new packages were installed, and I could not even reboot; it said something like it cannot communicate with org.freedesktop.systemd1 ...

Version-Release number of selected component:
systemd-208-28.fc20

Additional info:
reporter:       libreport-2.3.0
backtrace_rating: 4
cmdline:        /usr/lib/systemd/systemd --system --deserialize 24
crash_function: crash.lto_priv.223
environ:        
executable:     /usr/lib/systemd/systemd
kernel:         3.17.3-200.fc20.x86_64
runlevel:       N 5
type:           CCpp
uid:            0

Truncated backtrace:
Thread no. 1 (10 frames)
 #1 crash.lto_priv.223 at ../src/core/main.c:168
 #3 isempty at ../src/shared/util.h:158
 #4 join_path at ../src/shared/cgroup-util.c:455
 #5 cg_get_path at ../src/shared/cgroup-util.c:501
 #6 cg_enumerate_processes.constprop.140 at ../src/shared/cgroup-util.c:51
 #7 cg_is_empty at ../src/shared/cgroup-util.c:887
 #8 cg_is_empty_recursive.constprop.52 at ../src/shared/cgroup-util.c:915
 #9 manager_notify_cgroup_empty at ../src/core/cgroup.c:981
 #10 signal_agent_released.lto_priv.678 at ../src/core/dbus.c:90
 #11 bus_match_run at ../src/libsystemd/sd-bus/bus-match.c:302

Comment 1 Karel Volný 2014-11-22 22:25:13 UTC
Created attachment 960296
File: backtrace

Comment 2 Karel Volný 2014-11-22 22:25:14 UTC
Created attachment 960297
File: cgroup

Comment 3 Karel Volný 2014-11-22 22:25:15 UTC
Created attachment 960298
File: core_backtrace

Comment 4 Karel Volný 2014-11-22 22:25:17 UTC
Created attachment 960299
File: dso_list

Comment 5 Karel Volný 2014-11-22 22:25:18 UTC
Created attachment 960300
File: limits

Comment 6 Karel Volný 2014-11-22 22:25:19 UTC
Created attachment 960301
File: maps

Comment 7 Karel Volný 2014-11-22 22:25:21 UTC
Created attachment 960302
File: open_fds

Comment 8 Karel Volný 2014-11-22 22:25:22 UTC
Created attachment 960303
File: proc_pid_status

Comment 9 Karel Volný 2014-11-22 22:25:23 UTC
Created attachment 960304
File: var_log_messages

Comment 10 Bill Nottingham 2014-11-30 05:32:34 UTC
Also hit a crash on F20 -> F21 upgrade:


Nov 30 00:13:49 nostromo systemd: Unknown serialization item 'subscribed=:1.195'
Nov 30 00:13:49 nostromo systemd: Configuration file /usr/lib/systemd/system/wpa_supplicant.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Nov 30 00:13:49 nostromo systemd: Unknown serialization item 'subscribed=:1.195'
Nov 30 00:13:49 nostromo systemd: Configuration file /usr/lib/systemd/system/wpa_supplicant.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Nov 30 00:13:50 nostromo systemd: Failed to reset devices.list on /system.slice: Invalid argument
Nov 30 00:13:50 nostromo kernel: [28411.492780] systemd[1]: segfault at 7fb671788400 ip 00007fb60dd11d23 sp 00007fffda6c06b0 error 4 in systemd[7fb60dc34000+137000]
Nov 30 00:13:50 nostromo kernel: systemd[1]: segfault at 7fb671788400 ip 00007fb60dd11d23 sp 00007fffda6c06b0 error 4 in systemd[7fb60dc34000+137000]
Nov 30 00:13:50 nostromo systemd: Caught <SEGV>, dumped core as pid 9704.

#0  0x00007fb60c757f99 in raise () from /lib64/libpthread.so.0
#1  0x00007fb60dcd7ce8 in crash.lto_priv ()
#2  <signal handler called>
#3  0x00007fb60dd11d23 in signal_agent_released.lto_priv ()
#4  0x00007fb60dc851a5 in bus_match_run ()
#5  0x00007fb60dc84dfe in bus_match_run ()
#6  0x00007fb60dc84dd8 in bus_match_run ()
#7  0x00007fb60dc84dfe in bus_match_run ()
#8  0x00007fb60dc84dd8 in bus_match_run ()
#9  0x00007fb60dc84dfe in bus_match_run ()
#10 0x00007fb60dc84dd8 in bus_match_run ()
#11 0x00007fb60dc84dfe in bus_match_run ()
#12 0x00007fb60dc84dd8 in bus_match_run ()
#13 0x00007fb60dc84dfe in bus_match_run ()
#14 0x00007fb60dc84cee in bus_match_run ()
#15 0x00007fb60dce59d8 in process_match.lto_priv ()
#16 0x00007fb60dd03ae0 in bus_process_internal.constprop ()
#17 0x00007fb60dc66b71 in io_callback ()
#18 0x00007fb60dc6a700 in source_dispatch ()
#19 0x00007fb60dc6b4ae in sd_event_dispatch ()
#20 0x00007fb60dceee80 in manager_loop ()
#21 0x00007fb60dc57239 in main ()
(gdb)
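
The kernel "segfault at" record in the log above packs several facts into one line: the faulting address, the instruction pointer, the stack pointer, the x86 page-fault error code, and the mapping (base+size) the instruction pointer fell in. As a rough sketch only (the function name decode_segfault and the returned keys are made up for illustration and are not part of any existing tool), the following Python decodes the low three error-code bits and computes the offset of the faulting instruction inside the systemd binary:

```python
import re

# Matches the x86 kernel "segfault at ..." record as it appears in the log above.
# All numeric fields, including the error code, are printed by the kernel in hex.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Hypothetical helper: decode one kernel segfault record into a dict."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("no kernel segfault record found")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    err = int(m.group("err"), 16)
    offset = ip - base  # position of the faulting instruction inside the mapping
    return {
        "object": m.group("obj"),
        "fault_addr": int(m.group("addr"), 16),
        "offset": offset,
        "in_mapping": 0 <= offset < size,
        # x86 page-fault error code: bit 0 = protection violation (else page
        # not present), bit 1 = write access (else read), bit 2 = user mode.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


if __name__ == "__main__":
    line = (
        "systemd[1]: segfault at 7fb671788400 ip 00007fb60dd11d23 "
        "sp 00007fffda6c06b0 error 4 in systemd[7fb60dc34000+137000]"
    )
    info = decode_segfault(line)
    print(hex(info["offset"]), info["cause"], info["access"], info["mode"])
```

For the record above this gives an offset of 0xddd23 inside the systemd mapping and a user-mode read of a page that is not present, which fits a dereference of a bad pointer. The instruction pointer also matches frame #3 (signal_agent_released) in the backtrace.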

Comment 11 Bill Nottingham 2014-11-30 05:35:15 UTC
(I was on 208-22.fc20)

Comment 12 Chris Siebenmann 2014-12-12 00:21:40 UTC
I have a 64-bit Fedora 20 VM image where this happens completely
consistently and reproducibly during a yum upgrade from Fedora 20 to
Fedora 21. It is using the latest updates, so the starting Fedora 20
version is systemd-208-28.fc20.x86_64 (and I have an intact systemd
journal from the latest upgrade). The yum upgrade is done in the
standard way, with 'yum --releasever 21 distro-sync'.

The upgrade will complete if I either wait patiently or stub
/usr/bin/systemctl out to /bin/true, so that the postinstall
'systemctl daemon-reload' and similar commands finish in reasonable
time. However, for me the resulting Fedora 21 kernel initramfs stalls
during boot; I have to force a boot off the Fedora 20 kernel and
rebuild the initramfs. This may be unrelated.

Comment 13 Chris Siebenmann 2014-12-12 16:47:25 UTC
I took a look at my systemd core dump. My stack backtrace is the same
as the original reporter's, with what is clearly a bogus 'path' pointer.
This pointer comes from a structure element:

#9  0x00007f0523292d13 in manager_notify_cgroup_empty (cgroup=<optimized out>, 
    m=0x7f0524c54350) at ../src/core/cgroup.c:981
981                     r = cg_is_empty_recursive(SYSTEMD_CGROUP_CONTROLLER, u->cgroup_path, true);

I dumped the structure's nominal contents and the raw memory of it in
gdb, which I will put in an attachment because it's relatively large,
but to me it looks like the memory the structure is theoretically at
has been smashed and partially overwritten by, e.g., string data.

I would be happy to make the core dump itself available to interested
parties who can determine more from it than I can.

Comment 14 Chris Siebenmann 2014-12-12 16:49:28 UTC
Created attachment 967745
gdb dump of the bad 'Unit' structure contents

Comment 15 Lukas Slebodnik 2014-12-12 20:06:19 UTC
*** Bug 1130633 has been marked as a duplicate of this bug. ***

Comment 16 Karel Volný 2014-12-15 14:48:57 UTC
(In reply to Chris Siebenmann from comment #13)
> I would be happy to make the core dump itself available to interested
> parties who can determine more from it than I can.

If that can help, my /var/tmp/abrt still holds the data, so devs can ping me on IRC or in person for a copy or access too.

Comment 17 Chris Siebenmann 2015-01-12 19:01:56 UTC
For the record: since Fedora 21 just released an updated systemd, I
just retried a virtual machine upgrade with an up to date Fedora 20
system image and this repeatable systemd crash still happens with
systemd-216-14.fc21.x86_64.

Comment 18 Zbigniew Jędrzejewski-Szmek 2015-02-01 16:22:00 UTC
(In reply to Karel Volný from comment #0)
> and I was unable to even reboot
'systemctl reboot -f' and 'reboot -f' should work even when systemd has segfaulted.
The advantage is that disks are synced.

(In reply to Chris Siebenmann from comment #13)
> I took a look at my systemd core dump. My stack backtrace is the same
> as the original reporter, with what is clearly a bogus 'path' pointer.
> This pointer comes from a structure element:
> 
> #9  0x00007f0523292d13 in manager_notify_cgroup_empty (cgroup=<optimized
> out>, 
>     m=0x7f0524c54350) at ../src/core/cgroup.c:981
> 981                     r = cg_is_empty_recursive(SYSTEMD_CGROUP_CONTROLLER,
> u->cgroup_path, true);
> 
> I dumped the structure's nominal contents and the raw memory of it in
> gdb, which I will put in an attachment because it's relatively large,
> but to me it looks like the memory the structure is theoretically at
> has been smashed and partially overwritten by, eg, string data.
Most likely that unit has been freed, but the pointer to it was retained in
another structure (most probably one of the hashmaps with units). This is
more likely than the memory being overwritten by overflow of another data
structure.
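
The stale-pointer scenario described above can be modelled in a few lines. This is only an illustration in Python, not systemd code: the class and attribute names (Manager, units, by_cgroup, free_buggy, free_fixed) are hypothetical, and a 'freed' flag stands in for free(), since Python cannot produce a real dangling pointer.

```python
class Unit:
    """Illustrative stand-in for a unit object; not the systemd Unit struct."""

    def __init__(self, name, cgroup_path):
        self.name = name
        self.cgroup_path = cgroup_path
        self.freed = False  # models free(); real C code would leave garbage here


class Manager:
    """Hypothetical manager keeping two lookup tables that reference the same units."""

    def __init__(self):
        self.units = {}      # unit name -> Unit
        self.by_cgroup = {}  # cgroup path -> Unit (secondary index)

    def add(self, unit):
        self.units[unit.name] = unit
        self.by_cgroup[unit.cgroup_path] = unit

    def free_buggy(self, name):
        # Bug pattern: the unit is released, but the secondary index
        # still holds a reference to it.
        unit = self.units.pop(name)
        unit.freed = True

    def free_fixed(self, name):
        # Fix pattern: drop every reference before releasing the unit.
        unit = self.units.pop(name)
        self.by_cgroup.pop(unit.cgroup_path, None)
        unit.freed = True

    def notify_cgroup_empty(self, path):
        unit = self.by_cgroup.get(path)
        if unit is None:
            return None
        if unit.freed:
            # In C this would be a read through a dangling pointer.
            raise RuntimeError("stale entry for " + path)
        return unit.name


if __name__ == "__main__":
    m = Manager()
    m.add(Unit("a.service", "/system.slice/a.service"))
    m.free_buggy("a.service")
    try:
        m.notify_cgroup_empty("/system.slice/a.service")
    except RuntimeError as exc:
        print("detected:", exc)
```

In the buggy variant the later notification still finds the released unit through the secondary index; in the fixed variant the lookup simply returns nothing.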

(In reply to Chris Siebenmann from comment #12)
> I have a 64-bit Fedora 20 VM image where this happens completely
> consistently and reproducibly during a yum upgrade from Fedora 20 to
> Fedora 21. It is using the latest updates, so the starting Fedora 20
> version is systemd-208-28.fc20.x86_64 (and I have an intact systemd
> journal from the latest upgrade). The yum upgrade is done in the
> standard way, with 'yum --releasever 21 distro-sync'.
OK, I'll try to reproduce this that way.

Comment 19 Karel Volný 2015-02-02 13:34:22 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #18)
> (In reply to Karel Volný from comment #0)
> > and I was unable to even reboot
> 'systemctl reboot -f' and 'reboot -f' should work even when systemd has
> segfaulted.
> The advantage is that disks are synced.

nice, I haven't noticed it now accepts this option ... and does it have any advantage over the SysRq method?

Comment 20 Zbigniew Jędrzejewski-Szmek 2015-05-20 14:13:34 UTC
(In reply to Karel Volný from comment #19)
> nice, I haven't noticed it now accepts this option ... and does it have any
> advantage over the SysRq method?
It might be easier to use, since with sysrq you have to first sync and remount ro, wait for the kernel to print the message that this is done, and then reboot. It also might be easier over ssh and similar. Finally, sysrq reboot is disabled by default.

Comment 21 Fedora End Of Life 2015-11-04 11:53:42 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 21 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 22 Fedora End Of Life 2015-12-02 05:11:30 UTC
Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.