Bug 859958

Summary: Hang at shutdown of F17 upgrade - one DM device not detaching
Product: [Fedora] Fedora Reporter: jgforbes <jgf>
Component: dracut Assignee: dracut-maint
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: dracut-maint, harald, jonathan, william.garber
Target Milestone: --- Flags: william.garber: needinfo+
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-31 23:59:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Photo of screen output in debug shell none
debugging logs none
debugging logs none

Description jgforbes 2012-09-24 13:32:10 UTC
Description of problem:
I did an upgrade from F15 to F17 and now the system hangs on shutdown.
The final output to the console is similar to that in comment #3 of Bug #831634
as copied below:

Sending SIGTERM to remaining processes...
Sending SIGKILL to remaining processes...
Unmounting file systems.
Unmounted /boot.
Unmounted /proc/fs/nfsd.
Unmounted /dev/mqueue.
Unmounted /sys/kernel/config.
Unmounted /dev/hugepages.
Unmounted /sys/kernel/debug.
Disabling swaps.
Detaching loop devices.
Detaching DM devices.
Not all DM devices detached, 1 left.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and devices, trying to kill remaining processes.
Not all DM devices detached, 1 left.
Unmounted /oldroot/proc.
Unmounted /oldroot/dev/pts.
Unmounted /oldroot/run.
Unmounted /oldroot/sys/fs/selinux.
Unmounted /oldroot/sys/fs/cgroup/systemd.
Unmounted /oldroot/sys/fs/cgroup/memory.
Unmounted /oldroot/sys/fs/cgroup/freezer.
Unmounted /oldroot/sys/fs/cgroup/blkio.
Unmounted /oldroot/dev/shm.
Unmounted /oldroot/sys/kernel/security.
Unmounted /oldroot/sys/fs/cgroup/cpu,cpuacct.
Unmounted /oldroot/dev.
Unmounted /oldroot/sys/fs/cgroup/devices.
Unmounted /oldroot/sys/fs/cgroup/cpuset.
Unmounted /oldroot/sys/fs/cgroup/perf_event.
Unmounted /oldroot/sys/fs/cgroup.
Unmounted /oldroot/sys.
Unmounted /oldroot.

The difference is that it stops there.

When I did the bare metal install of F15 it also had shutdown problems
which were eventually rectified by one of the updates.

I have an Intel based motherboard with an i5 CPU and using the Intel
raid system.

Not sure if this is a problem with dracut as I have the latest version (see below).

Version-Release number of selected component (if applicable):
kernel  3.5.4-1.fc17.x86_64
dracut-018-98.git20120813.fc17.noarch

How reproducible:
Every time at shutdown/poweroff


Additional info:

Comment 1 Harald Hoyer 2012-09-24 13:57:21 UTC
(In reply to comment #0)
> Description of problem:
> I did an upgrade from F15 to F17 and now the system hangs on shutdown.
> The final output to the console is similar to that in comment #3 of Bug
> #831634
> as copied below:
> 
> Sending SIGTERM to remaining processes...
> Sending SIGKILL to remaining processes...
...
> Cannot finalize remaining file systems and devices, trying to kill remaining
> processes.
> Not all DM devices detached, 1 left.

This is no problem.

> Unmounted /oldroot/proc.
> Unmounted /oldroot/dev/pts.
> Unmounted /oldroot/run.
> Unmounted /oldroot/sys/fs/selinux.
> Unmounted /oldroot/sys/fs/cgroup/systemd.
> Unmounted /oldroot/sys/fs/cgroup/memory.
> Unmounted /oldroot/sys/fs/cgroup/freezer.
> Unmounted /oldroot/sys/fs/cgroup/blkio.
> Unmounted /oldroot/dev/shm.
> Unmounted /oldroot/sys/kernel/security.
> Unmounted /oldroot/sys/fs/cgroup/cpu,cpuacct.
> Unmounted /oldroot/dev.
> Unmounted /oldroot/sys/fs/cgroup/devices.
> Unmounted /oldroot/sys/fs/cgroup/cpuset.
> Unmounted /oldroot/sys/fs/cgroup/perf_event.
> Unmounted /oldroot/sys/fs/cgroup.
> Unmounted /oldroot/sys.
> Unmounted /oldroot.
> 
> The difference is that it stops there.

The problem is that it does not poweroff here.

Comment 2 Harald Hoyer 2012-09-24 14:02:51 UTC
Please boot with "rd.debug rd.break=shutdown" on the kernel command line and see if you can get a shell on shutdown.

Then try:

# reboot -f -d -n --no-wall

or

# poweroff -f -d -n --no-wall

Comment 3 jgforbes 2012-09-25 01:37:10 UTC
Adding "rd.debug rd.break=shutdown"  (with or without the quotes) causes the system to boot into a debug shell.
I assume that this is not the expected behavior.


When booted without the extra kernel command line options,
 # reboot -f -d -n --no-wall 
causes the system to reboot without hanging.

Comment 4 Harald Hoyer 2012-09-25 13:11:36 UTC
(In reply to comment #3)
> Adding "rd.debug rd.break=shutdown"  (with or without the quotes) causes the
> system to boot into a debug shell.


> I assume that this is not the expected behavior.

Yes, it is the expected behaviour.

> 
> 
> When booted without the extra kernel command line options,
>  # reboot -f -d -n --no-wall 
> causes the system to reboot without hanging.

Ah, good... can you boot with only "rd.debug" and take a photo of the last messages on reboot?

Comment 5 jgforbes 2012-09-25 15:00:08 UTC
Created attachment 617075 [details]
Photo of screen output in debug shell

Comment 6 jgforbes 2012-09-25 15:03:44 UTC
Booting with this kernel command line boots into the debug shell as shown in the photo
linux   /vmlinuz-3.5.4-1.fc17.x86_64 root=/dev/mapper/vg_fastdog-lv_root ro LANG=en_US.UTF-8  KEYTABLE=us rd.md.uuid=ba9a08af:dee20819:dc0ae0ad:abeedaf5 rd.lvm.lv=vg_fastdog/lv_root rd.md.uuid=8ece37b6:1e3cb7e2:5ee1030b:41b18ac8 rd.luks=0 rd.lvm.lv=vg_fastdog/lv_swap SYSFONT=latarcyrheb-sun16 rd.dm=0 rhgb quiet rd.debug rd.break=shutdown

Removing the rd.break=shutdown results in booting into the fedora desktop.

Comment 7 Harald Hoyer 2012-09-26 13:15:49 UTC
(In reply to comment #6)
> Booting with this kernel command line boots into the debug shell as shown in
> the photo
> linux   /vmlinuz-3.5.4-1.fc17.x86_64 root=/dev/mapper/vg_fastdog-lv_root ro
> LANG=en_US.UTF-8  KEYTABLE=us rd.md.uuid=ba9a08af:dee20819:dc0ae0ad:abeedaf5
> rd.lvm.lv=vg_fastdog/lv_root rd.md.uuid=8ece37b6:1e3cb7e2:5ee1030b:41b18ac8
> rd.luks=0 rd.lvm.lv=vg_fastdog/lv_swap SYSFONT=latarcyrheb-sun16 rd.dm=0
> rhgb quiet rd.debug rd.break=shutdown
> 
> Removing the rd.break=shutdown results in booting into the fedora desktop.

Yes, you should boot into the fedora desktop and poweroff then. Debug messages should appear on poweroff also. So we can see why it is not powering off.

Comment 8 jgforbes 2012-09-26 14:32:23 UTC
End of shell output

Unmounted /oldroot.
/shutdown@34(): _cnt=6
/shutdown@32(): '[' 6 -le 40 ']'
/shutdown@33: umount_a

stops here.
System is not dead as plugging in a USB device gives a response.
Also, the screen saver is still working as the screen goes blank and can be reactivated.

Looks like this is the file where the wait loop is happening:
/usr/lib/dracut/modules.d/99shutdown/shutdown.sh
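
For readers following the trace, the three traced lines (a counter assignment, a test against 40, and a call to umount_a) are consistent with a bounded retry loop. The following is only a sketch of that shape, not the contents of shutdown.sh: one_pass and bounded_retry are hypothetical names, and one_pass is a stand-in that counts down a fake work queue instead of unmounting anything.

```shell
#!/bin/sh
# Hypothetical stand-in for a single unmount pass. It returns 0 while it
# still made progress and non-zero once nothing is left to do.
remaining=3
one_pass() {
    [ "$remaining" -gt 0 ] || return 1
    remaining=$((remaining - 1))
    return 0
}

# Call one_pass repeatedly while it succeeds and the counter is <= max.
# Prints the number of successful passes.
bounded_retry() {
    max=$1
    cnt=0
    while [ "$cnt" -le "$max" ]; do
        one_pass || break
        cnt=$((cnt + 1))
    done
    printf '%s\n' "$cnt"
}
```

A loop of this shape cannot spin forever by itself, since it ends after at most max + 1 passes. If the real loop looks like this, a trace that stops at the call would point to something blocking inside the pass rather than to the loop condition.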

Comment 9 Harald Hoyer 2012-09-27 10:10:48 UTC
(In reply to comment #8)
> End of shell output
> 
> Unmounted /oldroot.
> /shutdown@34(): _cnt=6
> /shutdown@32(): '[' 6 -le 40 ']'
> /shutdown@33: umount_a
> 
> stops here.
> System is not dead as plugging in a USB device gives a response.
> Also, the screen saver is still working as the screen goes blank and can be
> reactivated.
> 
> Looks like this is the file where the wait loop is happening:
> /usr/lib/dracut/modules.d/99shutdown/shutdown.sh

This is very strange because you should definitely see more output after that.

Comment 10 jgforbes 2012-09-27 15:51:00 UTC
I edited /usr/lib/dracut/modules.d/99shutdown/shutdown.sh
to echo the device that it is trying to unmount, the $mp variable.
However, it did not print anything. Does this script actually run or is there
code elsewhere for this functionality?

Comment 11 Fedora End Of Life 2013-07-03 22:15:18 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 17 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to change the 
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Fedora End Of Life 2013-07-31 23:59:37 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 13 william.garber 2013-12-11 21:54:14 UTC
This bug still exists in Fedora 19.
There is also a bug

nouveau lockup

https://bugzilla.redhat.com/show_bug.cgi?id=918732

that may be related.
This related bug appears to have something to do
with ACPI.
One recommended fix is to set acpi=force in the kernel boot args.

Comment 14 william.garber 2013-12-11 22:37:25 UTC
Created attachment 835511 [details]
debugging logs

Comment 15 william.garber 2013-12-11 22:40:19 UTC
Created attachment 835512 [details]
debugging logs

debug rd.debug rd.break=shutdown kernel options
reboot -f -d -n --no-wall
files generated by
/run/initramfs/sosreport.txt
journalctl > journalctl.log 2>&1

Comment 16 william.garber 2013-12-11 22:53:22 UTC
Could have something to do with the root partition using LVM.
Could have something to do with fsck on next boot.

Comment 17 william.garber 2013-12-12 00:58:07 UTC
A workaround, I hope, was to edit /etc/default/grub:

GRUB_CMDLINE_LINUX="rd.lvm.lv=fedora/swap rd.md=0 rd.dm=0 vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/rhcrashkernel-param || :) rd.luks=0 vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora/root quiet acpi=force apm=power_off reboot=pci"

to add 
acpi=force apm=power_off reboot=pci

Not sure what these do.
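
The edit described above could also be scripted. This is a minimal sketch, assuming the file has a single GRUB_CMDLINE_LINUX="..." line that ends with a closing double quote. The function name append_kernel_args is hypothetical, and the extra arguments must not contain the characters |, & or \, because they are pasted into a sed expression.

```shell
#!/bin/sh
# Append extra kernel arguments inside the quotes of the GRUB_CMDLINE_LINUX
# line of the given file. A backup copy is left next to it as FILE.bak.
append_kernel_args() {
    file=$1
    extra=$2
    sed -i.bak "s|^\(GRUB_CMDLINE_LINUX=\".*\)\"\$|\1 $extra\"|" "$file"
}

# Example (as root):
#   append_kernel_args /etc/default/grub "acpi=force apm=power_off reboot=pci"
# The new arguments only take effect once the GRUB configuration has been
# regenerated, on a BIOS-booted Fedora system typically with:
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Trying the arguments once from the GRUB menu before making them permanent keeps a bad combination from affecting every boot.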

Comment 18 william.garber 2013-12-12 02:21:42 UTC
try the options below in order:

http://www.novell.com/support/kb/doc.php?id=7009779
===========================

Situation
During a shutdown or reboot the system will shutdown appropriately but at the point where it should power off or begin the reboot it will hang.  The power must be manually turned off or cycled to boot the system back up.
Resolution
The kernel has a "reboot" parameter that will generally fix the problem.  Each of the options can be tested on bootup of the system by adding the parameter to the "Boot Options" in the GRUB menu.  Here is a list of all the options:

warm =  Don’t set the cold reboot flag
cold = Set the cold reboot flag
bios = Reboot by jumping through the BIOS (only for X86_32)
smp = Reboot by executing reset on BSP or other CPU (only for X86_32)
triple = Force a triple fault (init)
kbd = Use the keyboard controller. cold reset (this is the default)
acpi = Use the RESET_REG in the FADT
efi = Use efi reset_system runtime service
pci = Use the so-called “PCI reset register”, CF9
force = Avoid anything that could hang.

Most of the time the problem can be fixed by using one of the following two parameters:

reboot=bios
reboot=acpi

Try these first then move on to the others if needed.
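
When testing these options one at a time, it helps to confirm which one the running kernel was actually booted with. The sketch below uses a hypothetical helper, reboot_param, to print the value of the reboot= parameter from a kernel command line. It reports the last occurrence, on the assumption that a later value overrides an earlier one.

```shell
#!/bin/sh
# Print the value of the last reboot= word in a kernel command line.
# With an argument, that string is parsed; without one, /proc/cmdline is read.
# The body runs in a subshell so that "set -f" and the variables stay local.
reboot_param() (
    set -f    # no filename globbing while splitting the command line
    if [ "$#" -gt 0 ]; then
        cmdline=$1
    else
        cmdline=$(cat /proc/cmdline)
    fi
    value=
    for word in $cmdline; do
        case $word in
            reboot=*) value=${word#reboot=} ;;
        esac
    done
    printf '%s\n' "$value"
)
```

Running reboot_param with no arguments after a test boot prints, for example, acpi if reboot=acpi was passed, or an empty line if no reboot= parameter was given.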