Bug 1417791

Summary: Hang on reboot, watchdog did not stop
Product: [Fedora] Fedora Reporter: David Highley <david.m.highley>
Component: dracut Assignee: dracut-maint-list
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 25 CC: david.m.highley, dracut-maint-list, harald, iamreallynotapokemon, jonathan, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-07-12 08:49:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description David Highley 2017-01-31 03:34:23 UTC
Description of problem:
The system hangs on reboot for 10 minutes until the hardware watchdog timer trips.

Version-Release number of selected component (if applicable):
dracut-044-78.fc25.x86_64

How reproducible:
Every reboot

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
This appears to be the same as, or similar to, the issue reported for Fedora 24 in bug 1365352. On Fedora 24, all four of our systems had this issue; on Fedora 25, two of the four do. The difference between the two systems that hang on reboot and the two that do not is the amount of SSD storage: the two failing systems have dual 128GB SSDs in RAID 1, while the two unaffected systems have dual 1TB SSDs in RAID 1. In addition, rpcbind fails to start on the two failing systems. The rpcbind issue did not occur when this hardware was running Fedora 24.

Jan 30 18:32:59 spruce audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-sysctl comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 30 18:32:59 spruce systemd[1]: Started Apply Kernel Variables.
Jan 30 18:32:59 spruce rpcbind[727]: rpcbind: /run/rpcbind/rpcbind.lock: No such file or directory
Jan 30 18:32:59 spruce systemd[1]: rpcbind.service: Main process exited, code=exited, status=1/FAILURE
Jan 30 18:32:59 spruce audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=rpcbind comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Jan 30 18:32:59 spruce systemd[1]: Failed to start RPC Bind.
Jan 30 18:32:59 spruce systemd[1]: rpcbind.service: Unit entered failed state.
Jan 30 18:32:59 spruce systemd[1]: rpcbind.service: Failed with result 'exit-code'.

Comment 1 Harald Hoyer 2017-05-18 14:01:42 UTC
What is your kernel cmdline?

Comment 2 David Highley 2017-05-29 18:15:53 UTC
We don't know, and here is why. We have four systems: two have the file /boot/efi/EFI/fedora/grub.cfg and two do not. The two sets differ, yet we have never modified the boot process, so everything should be at its defaults.
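
The kernel command line can be read from the running system even when it is unclear which grub.cfg applies. The sketch below is not from the original report. It prints /proc/cmdline and checks for /sys/firmware/efi, which exists only when the machine was booted via UEFI. One possible explanation for the missing file, which this report does not confirm, is that two systems boot via UEFI and two via legacy BIOS. The function names show_cmdline and boot_mode are hypothetical.

```shell
#!/bin/sh
# Print the command line the running kernel was booted with.
# /proc/cmdline comes from the kernel, so it does not depend on
# where (or whether) a grub.cfg file can be found.
show_cmdline() {
    cat /proc/cmdline
}

# Report how the system was booted.
# /sys/firmware/efi is only present on UEFI boots.
boot_mode() {
    if [ -d /sys/firmware/efi ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

show_cmdline
boot_mode
```

Running this on each of the four systems would show both the command line asked for in comment 1 and whether the firmware mode differs between the machines.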

Comment 3 Harald Hoyer 2017-06-29 13:51:06 UTC
I need more information. Boot with "rd.debug" on the kernel command line, then attach the output of:

# journalctl -b -o short-monotonic 

to this bug please.
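
A minimal sketch of the request above, under a few assumptions: it assumes grubby is installed for adding the argument, and has_karg and the output file name journal-rd-debug.txt are hypothetical, chosen only for illustration. The script confirms that rd.debug is actually on the running kernel's command line before saving the journal, so the attached log is known to come from a debug boot.

```shell
#!/bin/sh
# has_karg ARG CMDLINE: succeed if ARG appears as a whole word in CMDLINE.
# (Hypothetical helper, not part of dracut or systemd.)
has_karg() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# One-time setup, run as root, then reboot (assumes grubby is installed):
#   grubby --update-kernel=ALL --args="rd.debug"

if [ -r /proc/cmdline ] && has_karg rd.debug "$(cat /proc/cmdline)"; then
    # Save the current boot's journal with monotonic timestamps.
    # The output file name is an arbitrary choice for this example.
    journalctl -b -o short-monotonic > journal-rd-debug.txt
else
    echo "rd.debug is not on the kernel command line; add it and reboot first" >&2
fi
```

The resulting file can then be attached to the bug. The padding with spaces in has_karg keeps a longer argument that merely contains the string from matching.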

Comment 4 David Highley 2017-07-11 01:52:01 UTC
The issue has disappeared with the newer kernel updates. We saw it with some of the Fedora 24 kernels and some of the Fedora 25 kernels. So far we have not been able to get any information about the cause, but for now it is gone again.