Bug 724928 - Reboot ends with kernel panic on systemd abort()
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: systemd
Version: rawhide
Hardware: All Linux
Priority: unspecified
Severity: unspecified
Assigned To: Lennart Poettering
QA Contact: Fedora Extras Quality Assurance
Whiteboard: AcceptedBlocker
Duplicates: 725158 725253 725999 726462
Depends On:
Blocks: F16Alpha/F16AlphaBlocker
 
Reported: 2011-07-22 06:44 EDT by Zdenek Kabelac
Modified: 2011-08-05 16:41 EDT
Fixed In Version: systemd-31-2.fc16
Doc Type: Bug Fix
Last Closed: 2011-08-05 14:36:44 EDT

Attachments
shutdown panic from 3.0.0-1.fc16 kernel (1.02 KB, text/plain)
2011-07-24 17:45 EDT, Michal Jaegermann

Description Zdenek Kabelac 2011-07-22 06:44:28 EDT
Description of problem:

I'm not sure where the problem lies, but since systemd is the first thing
to crash, I'm reporting it as a systemd bug:

Final messages before kernel panic:

Detaching loop devices.
Detaching DM devices.
Successfully changed into root pivot.
Assertion 'close_nointr(fd) == 0' failed at src/util.c:274, function close_nointr_nofail(). Aborting().
systemd-shutdow[1] general protection ip:7f2138be2b77 sp:7fff4d21a940 error:0 in libc-2.14.90.so[7f2138bab000+19e000]
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: systemd-shutdow Not tainted 3.0.0-rc7-00186-gc9a28a5 #5
Call Trace:
 [<ffffffff814948c8>] panic+0x9b/0x1a7
 [<ffffffff810549bb>] ? do_exit+0x81b/0x950
 [<ffffffff81054a5a>] do_exit+0x8ba/0x950
 [<ffffffff81054e4f>] do_group_exit+0x4f/0xc0
 [<ffffffff81067b8e>] get_signal_to_deliver+0x3be/0x680
 [<ffffffff8100216f>] do_signal+0x6f/0x7a0
 [<ffffffff81065030>] ? check_kill_permission+0x240/0x240
 [<ffffffff814a3b09>] ? sub_preempt_count+0xa9/0xe0
 [<ffffffff814a03b2>] ? _raw_spin_unlock_irqrestore+0x42/0x80
 [<ffffffff810666e3>] ? force_sig_info+0xe3/0x100
 [<ffffffff81002925>] do_notify_resume+0x65/0x80
 [<ffffffff8129919e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff814a0aaa>] retint_signal+0x46/0x8c


This happens with an upstream kernel built from master commit id
cf6ace16a3cd8b728fb0afa68368fd40bbeae19f
(a nearly final 3.0 kernel).

Is it a good idea to abort() the process with PID 1?
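
For illustration only, here is a minimal sketch in C of a close helper that reports a failure to the caller instead of asserting on it. This is not the actual systemd source or the actual fix. close_nointr is named after the function in the assertion message above, but its body here is an assumption, and close_best_effort is a hypothetical name.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Sketch (assumed body, not systemd's code): close a descriptor and
 * return 0 on success or a negative errno value on failure.
 * On Linux the descriptor is released even if close() reports EINTR,
 * so EINTR is treated as success rather than retried. */
static int close_nointr(int fd) {
    if (close(fd) >= 0)
        return 0;
    if (errno == EINTR)
        return 0;
    return -errno;
}

/* Hypothetical best-effort variant: log the failure, preserve the
 * caller's errno, and never abort the process. */
static int close_best_effort(int fd) {
    int saved_errno = errno;
    int r = close_nointr(fd);

    if (r < 0)
        fprintf(stderr, "close(%d) failed: %s (ignored)\n", fd, strerror(-r));

    errno = saved_errno;
    return r;
}
```

With a helper like this, a failed close() late in shutdown becomes a logged warning that the caller can ignore, rather than a reason to terminate PID 1.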

Version-Release number of selected component (if applicable):
systemd-30-1.fc16.x86_64

Comment 2 Tim Flink 2011-07-22 18:11:34 EDT
Discussed at the 2011-07-22 alpha blocker bug review meeting. Tentatively accepted as a Fedora 16 alpha blocker assuming that a proposal to modify the alpha release criteria is accepted.

The installer must be able to complete an installation using the entire disk, existing free space, or existing Linux partitions methods, with or without encryption or LVM enabled.
Comment 3 Tim Flink 2011-07-22 18:12:46 EDT
(In reply to comment #2)
> Discussed at the 2011-07-22 alpha blocker bug review meeting. Tentatively
> accepted as a Fedora 16 alpha blocker assuming that a proposal to modify the
> alpha release criteria is accepted.
> 
> The installer must be able to complete an installation using the entire disk,
> existing free space, or existing Linux partitions methods, with or without
> encryption or LVM enabled.

I put the wrong proposed criterion in the previous comment. It was supposed to read:

The systems' mechanisms for shutting down, logging out and rebooting from a virtual console must work.
Comment 4 Michal Schmidt 2011-07-24 16:49:04 EDT
*** Bug 725158 has been marked as a duplicate of this bug. ***
Comment 5 Michal Schmidt 2011-07-24 16:51:39 EDT
*** Bug 725253 has been marked as a duplicate of this bug. ***
Comment 6 Michal Jaegermann 2011-07-24 17:45:03 EDT
Created attachment 514935
shutdown panic from 3.0.0-1.fc16 kernel

Hm, I got hit by the same problem, but with kernel-3.0.0-1.fc16. The call trace is somewhat different but pretty similar; attached just in case. For one reason or another, systemd does not crash when I reboot with older kernels.
Comment 7 Harald Hoyer 2011-07-27 06:43:04 EDT
*** Bug 725999 has been marked as a duplicate of this bug. ***
Comment 8 Tim Flink 2011-07-28 17:39:05 EDT
The proposed criterion under which this was accepted as a blocker has changed to:

It must be possible to trigger a system shutdown using standard console commands, and the system must shut down in such a way that storage volumes (e.g. simple partitions, LVs and PVs, RAID arrays) are taken offline safely and the system's BIOS or EFI is correctly requested to power down the system.

Since the criterion has changed, moving back to proposed blocker from accepted.

Does this affect shutdown as well or is this just an issue with reboot? If I'm reading this right, it sounds like the systemd crash happens after storage devices have been taken offline safely.

Can someone confirm this?
Comment 9 Clyde E. Kunkel 2011-07-28 19:14:15 EDT
Shutdown and reboot affected.  Fixed with systemd-31-2.fc16.
Comment 10 Michal Schmidt 2011-07-29 12:56:42 EDT
*** Bug 726462 has been marked as a duplicate of this bug. ***
Comment 11 Tim Flink 2011-07-29 14:02:36 EDT
Discussed in the 2011-07-29 blocker review meeting. Accepted as F16 alpha blocker due to violation of the following alpha release criterion [1]:

It must be possible to trigger a system shutdown using standard console
commands, and the system must shut down in such a way that storage volumes
(e.g. simple partitions, LVs and PVs, RAID arrays) are taken offline safely and
the system's BIOS or EFI is correctly requested to power down the system.

[1] https://fedoraproject.org/wiki/Fedora_16_Alpha_Release_Criteria
Comment 12 Tim Flink 2011-08-05 14:36:44 EDT
Verified as fixed in systemd-31-2.fc16. If the problem should re-appear, please re-open this bug.
Comment 13 Mads Kiilerich 2011-08-05 14:56:01 EDT
How can a kernel panic be fixed in systemd?

I agree that a workaround in systemd is fine for an alpha release criterion, but is it really acceptable that systemd can cause the kernel to panic? Isn't that the real bug that should be solved?
Comment 14 Zdenek Kabelac 2011-08-05 16:41:28 EDT
(In reply to comment #13)
> How can a kernel panic be fixed in systemd?
> 
> I agree that a workaround in systemd is fine for an alpha release criterion, but
> is it really acceptable that systemd can cause the kernel to panic? Isn't that
> the real bug that should be solved?

Well, aborting PID 1 isn't really a good idea; what else should the kernel do?
The bug is already fixed in the upstream systemd package.
It was obviously a userspace problem.
