Bug 859958 - Hang at shutdown of F17 upgrade - one DM device not detaching
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: dracut
Version: 17
Hardware: x86_64 Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: dracut-maint
QA Contact: Fedora Extras Quality Assurance
Reported: 2012-09-24 09:32 EDT by jgforbes
Modified: 2013-12-11 21:21 EST
CC List: 4 users

Doc Type: Bug Fix
Last Closed: 2013-07-31 19:59:33 EDT
Type: Bug
william.garber: needinfo+


Attachments
Photo of screen output in debug shell (972.94 KB, image/jpeg)
2012-09-25 11:00 EDT, jgforbes
debugging logs (214.57 KB, text/plain)
2013-12-11 17:37 EST, william.garber
debugging logs (206.13 KB, text/plain)
2013-12-11 17:40 EST, william.garber

Description jgforbes 2012-09-24 09:32:10 EDT
Description of problem:
I did an upgrade from F15 to F17 and now the system hangs on shutdown.
The final output to the console is similar to that in comment #3 of Bug #831634
as copied below:

Sending SIGTERM to remaining processes...
Sending SIGKILL to remaining processes...
Unmounting file systems.
Unmounted /boot.
Unmounted /proc/fs/nfsd.
Unmounted /dev/mqueue.
Unmounted /sys/kernel/config.
Unmounted /dev/hugepages.
Unmounted /sys/kernel/debug.
Disabling swaps.
Detaching loop devices.
Detaching DM devices.
Not all DM devices detached, 1 left.
Detaching DM devices.
Not all DM devices detached, 1 left.
Cannot finalize remaining file systems and devices, trying to kill remaining processes.
Not all DM devices detached, 1 left.
Unmounted /oldroot/proc.
Unmounted /oldroot/dev/pts.
Unmounted /oldroot/run.
Unmounted /oldroot/sys/fs/selinux.
Unmounted /oldroot/sys/fs/cgroup/systemd.
Unmounted /oldroot/sys/fs/cgroup/memory.
Unmounted /oldroot/sys/fs/cgroup/freezer.
Unmounted /oldroot/sys/fs/cgroup/blkio.
Unmounted /oldroot/dev/shm.
Unmounted /oldroot/sys/kernel/security.
Unmounted /oldroot/sys/fs/cgroup/cpu,cpuacct.
Unmounted /oldroot/dev.
Unmounted /oldroot/sys/fs/cgroup/devices.
Unmounted /oldroot/sys/fs/cgroup/cpuset.
Unmounted /oldroot/sys/fs/cgroup/perf_event.
Unmounted /oldroot/sys/fs/cgroup.
Unmounted /oldroot/sys.
Unmounted /oldroot.

The difference is that it stops there.
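
For anyone comparing shutdown logs across boots, the count in the "Not all DM devices detached" message can be pulled out with a small filter. This is only a sketch: the function name dm_devices_left is made up here, and it assumes the console text has been saved or piped in exactly as quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read shutdown console text on stdin and print the
# number from the last "Not all DM devices detached, N left." line.
# Prints nothing if no such line is present.
dm_devices_left() {
    sed -n 's/^Not all DM devices detached, \([0-9][0-9]*\) left\.$/\1/p' | tail -n 1
}

# Example with two lines taken from the report above.
printf '%s\n' \
    'Detaching DM devices.' \
    'Not all DM devices detached, 1 left.' | dm_devices_left
```

An empty result would suggest that every DM device detached, at least as far as the captured text shows.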

When I did the bare metal install of F15 it also had shutdown problems
which were eventually rectified by one of the updates.

I have an Intel based motherboard with an i5 CPU and using the Intel
raid system.

Not sure if this is a problem with dracut as I have the latest version (see below).

Version-Release number of selected component (if applicable):
kernel  3.5.4-1.fc17.x86_64
dracut-018-98.git20120813.fc17.noarch

How reproducible:
Every time at shutdown/poweroff


Additional info:
Comment 1 Harald Hoyer 2012-09-24 09:57:21 EDT
(In reply to comment #0)
> Description of problem:
> I did an upgrade from F15 to F17 and now the system hangs on shutdown.
> The final output to the console is similar to that in comment #3 of Bug
> #831634
> as copied below:
> 
> Sending SIGTERM to remaining processes...
> Sending SIGKILL to remaining processes...
...
> Cannot finalize remaining file systems and devices, trying to kill remaining
> processes.
> Not all DM devices detached, 1 left.

This is no problem.

> Unmounted /oldroot/proc.
> Unmounted /oldroot/dev/pts.
> Unmounted /oldroot/run.
> Unmounted /oldroot/sys/fs/selinux.
> Unmounted /oldroot/sys/fs/cgroup/systemd.
> Unmounted /oldroot/sys/fs/cgroup/memory.
> Unmounted /oldroot/sys/fs/cgroup/freezer.
> Unmounted /oldroot/sys/fs/cgroup/blkio.
> Unmounted /oldroot/dev/shm.
> Unmounted /oldroot/sys/kernel/security.
> Unmounted /oldroot/sys/fs/cgroup/cpu,cpuacct.
> Unmounted /oldroot/dev.
> Unmounted /oldroot/sys/fs/cgroup/devices.
> Unmounted /oldroot/sys/fs/cgroup/cpuset.
> Unmounted /oldroot/sys/fs/cgroup/perf_event.
> Unmounted /oldroot/sys/fs/cgroup.
> Unmounted /oldroot/sys.
> Unmounted /oldroot.
> 
> The difference is that it stops there.

The problem is that it does not poweroff here.
Comment 2 Harald Hoyer 2012-09-24 10:02:51 EDT
Please boot with "rd.debug rd.break=shutdown" on the kernel command line and see if you can get a shell on shutdown.

Then try:

# reboot -f -d -n --no-wall

or

# poweroff -f -d -n --no-wall
Comment 3 jgforbes 2012-09-24 21:37:10 EDT
Adding "rd.debug rd.break=shutdown"  (with or without the quotes) causes the system to boot into a debug shell.
I assume that this is not the expected behavior.


When booted without the kernel command,
 # reboot -f -d -n --no-wall 
causes the system to reboot without hanging.
Comment 4 Harald Hoyer 2012-09-25 09:11:36 EDT
(In reply to comment #3)
> Adding "rd.debug rd.break=shutdown"  (with or without the quotes) causes the
> system to boot into a debug shell.


> I assume that this is not the expected behavior.

Yes, it is the expected behaviour.

> 
> 
> When booted without the kernel command,
>  # reboot -f -d -n --no-wall 
> causes the system to reboot without hanging.

Ah, good. Can you boot with only "rd.debug" and take a photo of the last messages on reboot?
Comment 5 jgforbes 2012-09-25 11:00:08 EDT
Created attachment 617075
Photo of screen output in debug shell
Comment 6 jgforbes 2012-09-25 11:03:44 EDT
Booting with this kernel command line boots into the debug shell, as shown in the photo:
linux   /vmlinuz-3.5.4-1.fc17.x86_64 root=/dev/mapper/vg_fastdog-lv_root ro LANG=en_US.UTF-8  KEYTABLE=us rd.md.uuid=ba9a08af:dee20819:dc0ae0ad:abeedaf5 rd.lvm.lv=vg_fastdog/lv_root rd.md.uuid=8ece37b6:1e3cb7e2:5ee1030b:41b18ac8 rd.luks=0 rd.lvm.lv=vg_fastdog/lv_swap SYSFONT=latarcyrheb-sun16 rd.dm=0 rhgb quiet rd.debug rd.break=shutdown

Removing the rd.break=shutdown results in booting into the fedora desktop.
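
To double-check which dracut options a given boot actually received, the rd.* tokens can be listed from a command line string. A minimal sketch, assuming the options are space-separated as in the line quoted above; the function name rd_options is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print every rd.* token of a kernel command line,
# one per line, in the order given.
rd_options() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep '^rd\.'
}

# Example using part of the command line from this comment.
rd_options 'root=/dev/mapper/vg_fastdog-lv_root ro rd.luks=0 rd.dm=0 rhgb quiet rd.debug rd.break=shutdown'
```

On a running system the same function could be fed the live value with rd_options "$(cat /proc/cmdline)".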
Comment 7 Harald Hoyer 2012-09-26 09:15:49 EDT
(In reply to comment #6)
> Booting with this kernel command line boots into the debug shell as shown in
> the photo
> linux   /vmlinuz-3.5.4-1.fc17.x86_64 root=/dev/mapper/vg_fastdog-lv_root ro
> LANG=en_US.UTF-8  KEYTABLE=us rd.md.uuid=ba9a08af:dee20819:dc0ae0ad:abeedaf5
> rd.lvm.lv=vg_fastdog/lv_root rd.md.uuid=8ece37b6:1e3cb7e2:5ee1030b:41b18ac8
> rd.luks=0 rd.lvm.lv=vg_fastdog/lv_swap SYSFONT=latarcyrheb-sun16 rd.dm=0
> rhgb quiet rd.debug rd.break=shutdown
> 
> Removing the rd.break=shutdown results in booting into the fedora desktop.

Yes, you should boot into the Fedora desktop and then power off. Debug messages should also appear on poweroff, so we can see why it is not powering off.
Comment 8 jgforbes 2012-09-26 10:32:23 EDT
End of shell output

Unmounted /oldroot.
/shutdown@34(): _cnt=6
/shutdown@32(): '[' 6 -le 40 ']'
/shutdown@33: umount_a

stops here.
System is not dead as plugging in a USB device gives a response.
Also, the screen saver is still working as the screen goes blank and can be reactivated.

It looks like the wait loop is happening in this file:
/usr/lib/dracut/modules.d/99shutdown/shutdown.sh
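
The three trace lines above (a counter, a test against 40, and a call to umount_a) are consistent with a bounded retry loop. The sketch below shows one plausible shape of such a loop; it is not the actual dracut shutdown.sh, and the umount_a here is a stand-in that only pretends to release mounts.

```shell
#!/bin/sh
# Illustrative only: a bounded retry loop inferred from the trace.
pending=3   # hypothetical number of mounts still to release
calls=0     # how many times the stand-in was invoked

# Stand-in for the real unmount step: succeeds while something is left
# to release, fails once there is nothing more to do.
umount_a() {
    calls=$((calls + 1))
    [ "$pending" -gt 0 ] || return 1
    pending=$((pending - 1))
    return 0
}

# Repeat while progress is made, but never more than 41 passes (0..40).
shutdown_loop() {
    _cnt=0
    while [ "$_cnt" -le 40 ]; do
        umount_a || break
        _cnt=$((_cnt + 1))
    done
}

shutdown_loop
echo "passes=$_cnt pending=$pending calls=$calls"
```

A bound like this limits the number of passes but cannot help if a single call blocks, which would fit a trace whose last line is the call itself.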
Comment 9 Harald Hoyer 2012-09-27 06:10:48 EDT
(In reply to comment #8)
> End of shell output
> 
> Unmounted /oldroot.
> /shutdown@34(): _cnt=6
> /shutdown@32(): '[' 6 -le 40 ']'
> /shutdown@33: umount_a
> 
> stops here.
> System is not dead as plugging in a USB device gives a response.
> Also, the screen saver is still working as the screen goes blank and can be
> reactivated.
> 
> It looks like the wait loop is happening in this file:
> /usr/lib/dracut/modules.d/99shutdown/shutdown.sh

This is very strange because you should definitely see more output after that.
Comment 10 jgforbes 2012-09-27 11:51:00 EDT
I edited /usr/lib/dracut/modules.d/99shutdown/shutdown.sh
to echo the device that it is trying to unmount, the $mp variable.
However, it did not print anything. Does this script actually run or is there
code elsewhere for this functionality?
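
One possible explanation, stated here as an assumption rather than a confirmed diagnosis: the script that runs at poweroff may be the copy packed into the initramfs, not the file under /usr/lib/dracut, so an edit would only take effect after the initramfs is regenerated. The sketch below compares two copies of a script; the function name and the example paths in the usage note are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: report whether an edited script and the copy that
# is actually executed are byte-for-byte identical.
check_packed_script() {
    if cmp -s "$1" "$2"; then
        echo "in sync"
    else
        echo "stale"
        return 1
    fi
}
```

A call might look like check_packed_script /usr/lib/dracut/modules.d/99shutdown/shutdown.sh /run/initramfs/shutdown, assuming the second path is where the packed copy lives on the system in question.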
Comment 11 Fedora End Of Life 2013-07-03 18:15:18 EDT
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 12 Fedora End Of Life 2013-07-31 19:59:37 EDT
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
Comment 13 william.garber 2013-12-11 16:54:14 EST
This bug still exists in Fedora 19.
There is also a bug that may be related:

nouveau lockup
https://bugzilla.redhat.com/show_bug.cgi?id=918732

That bug appears to have something to do with ACPI.
One recommended fix is to set acpi=force in the kernel boot arguments.
Comment 14 william.garber 2013-12-11 17:37:25 EST
Created attachment 835511
debugging logs
Comment 15 william.garber 2013-12-11 17:40:19 EST
Created attachment 835512
debugging logs

Booted with the kernel options:
debug rd.debug rd.break=shutdown
then ran:
reboot -f -d -n --no-wall
The attached files were generated from:
/run/initramfs/sosreport.txt
journalctl > journalctl.log 2>&1
Comment 16 william.garber 2013-12-11 17:53:22 EST
This could have something to do with the root partition using LVM.
It could also have something to do with fsck on the next boot.
Comment 17 william.garber 2013-12-11 19:58:07 EST
A workaround, I hope, was to edit /etc/default/grub:

GRUB_CMDLINE_LINUX="rd.lvm.lv=fedora/swap rd.md=0 rd.dm=0 vconsole.keymap=us $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/rhcrashkernel-param || :) rd.luks=0 vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora/root quiet acpi=force apm=power_off reboot=pci"

to add 
acpi=force apm=power_off reboot=pci

Not sure what these do.
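
For repeatability, the same edit can be scripted. The sketch below appends options to the GRUB_CMDLINE_LINUX line of a given file and skips any that are already present. The function name add_kargs is invented, GNU sed (for -i) is assumed, and it expects the setting to sit on a single line ending in a double quote.

```shell
#!/bin/sh
# Hypothetical helper: add_kargs FILE ARG...
# Appends each ARG to GRUB_CMDLINE_LINUX="..." in FILE unless it is
# already there as a whole, space- or quote-delimited token.
add_kargs() {
    file=$1
    shift
    for arg in "$@"; do
        if ! grep -q "^GRUB_CMDLINE_LINUX=.*[\" ]$arg[\" ]" "$file"; then
            # Insert " ARG" just before the closing quote of the setting.
            sed -i "s|^\(GRUB_CMDLINE_LINUX=\".*\)\"\$|\1 $arg\"|" "$file"
        fi
    done
}
```

It would be called as add_kargs /etc/default/grub acpi=force apm=power_off reboot=pci, ideally on a backup copy first. On Fedora the GRUB configuration normally has to be regenerated afterwards (for example with grub2-mkconfig) before the change reaches the boot menu.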
Comment 18 william.garber 2013-12-11 21:21:42 EST
Try the options below in order:

http://www.novell.com/support/kb/doc.php?id=7009779
===========================

Situation
During a shutdown or reboot the system will shutdown appropriately but at the point where it should power off or begin the reboot it will hang. The power must be manually turned off or cycled to boot the system back up.
Resolution
The kernel has a "reboot" parameter that will generally fix the problem. Each of the options can be tested on bootup of the system by adding the parameter to the "Boot Options" in the GRUB menu. Here is a list of all the options:

warm =  Don’t set the cold reboot flag
cold = Set the cold reboot flag
bios = Reboot by jumping through the BIOS (only for X86_32)
smp = Reboot by executing reset on BSP or other CPU (only for X86_32)
triple = Force a triple fault (init)
kbd = Use the keyboard controller. cold reset (this is the default)
acpi = Use the RESET_REG in the FADT
efi = Use efi reset_system runtime service
pci = Use the so-called “PCI reset register”, CF9
force = Avoid anything that could hang.

Most of the time the problem can be fixed by using one of the following two parameters:

reboot=bios
reboot=acpi

Try these first then move on to the others if needed.
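
The suggested order can be written down as a small generator. This is a sketch under the assumptions stated in the quoted article: bios and acpi come first, and bios and smp are only offered on 32-bit x86. The function name reboot_candidates and the architecture check are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: reboot_candidates ARCH
# Prints reboot= parameters to try, one per line, in the order suggested
# by the article. bios and smp are skipped unless ARCH looks like
# 32-bit x86 (i386, i686, ...).
reboot_candidates() {
    arch=$1
    for m in bios acpi warm cold smp triple kbd efi pci force; do
        case "$m" in
            bios|smp)
                case "$arch" in
                    i?86) ;;
                    *) continue ;;
                esac
                ;;
        esac
        printf 'reboot=%s\n' "$m"
    done
}

reboot_candidates x86_64
```

Each printed value would be tried one boot at a time from the GRUB menu, stopping at the first one that lets the machine power off or reboot cleanly.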
