Bug 1158465 - [3.4-6.6][dracut] Kernel panic after rhevh installation reboot.
Summary: [3.4-6.6][dracut] Kernel panic after rhevh installation reboot.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.4.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Fabian Deutsch
QA Contact: Virtualization Bugs
URL:
Whiteboard: node
Depends On: 1161520 1175888
Blocks: rhev35rcblocker rhev35gablocker 1166578
 
Reported: 2014-10-29 12:55 UTC by Ying Cui
Modified: 2016-02-10 20:03 UTC (History)

Fixed In Version: rhev-hypervisor6-6.6-20141218.0.iso rhev-hypervisor7-7.0-20141218.0.iso
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1166578 (view as bug list)
Environment:
Last Closed: 2015-02-11 21:05:28 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments
kernel_panic.png (7.66 KB, image/png)
2014-10-29 12:55 UTC, Ying Cui
correct_kernel_panic.png for comment 6 (9.06 KB, image/png)
2014-10-30 09:53 UTC, Ying Cui
console_log_output.log (29.49 KB, text/plain)
2014-11-03 09:45 UTC, Ying Cui
init.log (381.13 KB, text/plain)
2014-11-05 11:00 UTC, Ying Cui


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2015:0160 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update 2015-02-12 01:34:52 UTC
oVirt gerrit 35160 0 master MERGED recipe: Fix udev patch 2020-10-12 18:14:47 UTC
oVirt gerrit 35175 0 node-3.0 MERGED recipe: Fix udev patch 2020-10-12 18:14:47 UTC

Description Ying Cui 2014-10-29 12:55:13 UTC
Created attachment 951769 [details]
kernel_panic.png

Description of problem:
After a TUI installation of RHEVH 6.6 (3.4.z build) finishes and RHEVH is rebooted, a kernel panic occurs while booting RHEVH.
The panic screen shows: kernel panic, init, not tainted 2.6.32-504.1.2.el6.x86_64 (see kernel_panic.png).
However, the kernel in this RHEVH build should be kernel-2.6.32-504.el6.x86_64.

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20141027.0.iso
ovirt-node-3.0.1-19.el6.22.noarch
kernel-2.6.32-504.el6.x86_64
dracut-kernel-004-356.el6.noarch


How reproducible:
50% 

Steps to Reproduce:
1. Start the TUI installer.
2. Go through the installation steps one by one.
3. Complete all settings.
4. Select _Reboot_ to reboot RHEVH.
5. The machine restarts.
6. A kernel panic occurs while booting RHEVH.
 

Actual result:
1. A kernel panic occurs. RHEVH cannot boot successfully, and nothing can be done from the panic screen.

Expected result:
1. After installation, RHEVH boots successfully.

Additional info:
1. Not related to clean vs. dirty TUI installs; the panic happened with both.
2. Not related to the install media; it happened with both PXE and CDROM.
3. RHEVH was installed to a local disk.

Comment 1 Ying Cui 2014-10-29 12:56:52 UTC
This issue occurs on our large HP SVVP machine, so I am setting it as Testblocker.

Comment 6 Ying Cui 2014-10-30 09:52:04 UTC
This issue still happens with rhev-hypervisor6-6.6-20141030.0.iso.

The kernel is kernel-2.6.32-504.el6.x86_64; for the correct screenshot, see correct_kernel_panic.png.

# md5sum rhev-hypervisor6-6.6-20141030.0.iso
0dc7f6eddeada6080fa33f49ad3d2564  rhev-hypervisor6-6.6-20141030.0.iso

Comment 7 Ying Cui 2014-10-30 09:53:19 UTC
Created attachment 952041 [details]
correct_kernel_panic.png for comment 6

Comment 8 Fabian Deutsch 2014-10-30 10:28:46 UTC
Sorted out on IRC, now the request:

1. boot into grub
2. add rdshell to the cmdline and remove the quiet arg
3. boot
4. get rdsosreport and attach it to this bug
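
Step 2 amounts to a small edit of the kernel command line. A minimal sketch of that edit follows; the helper name `debug_cmdline` and the shortened example command line are hypothetical and only illustrate the requested change, they are not part of ovirt-node or dracut.

```python
def debug_cmdline(cmdline: str) -> str:
    """Return the command line without 'quiet' and with 'rdshell' present once."""
    # Drop every standalone 'quiet' argument so boot messages stay visible.
    args = [arg for arg in cmdline.split() if arg != "quiet"]
    # Append 'rdshell' only if it is not already there.
    if "rdshell" not in args:
        args.append("rdshell")
    return " ".join(args)


# Hypothetical, shortened command line used only for illustration.
example = "root=live:LABEL=Root ro quiet rd_NO_LVM"
print(debug_cmdline(example))
```

In practice the same change is made by hand in the grub edit screen before booting.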

Comment 9 Ying Cui 2014-10-30 12:25:00 UTC
I tested 10 times trying to get /run/initramfs/rdsosreport.txt using rd.debug rdshell,
but this file is hard to generate. I also confirmed with cshao, who tested this in bug 1152948 and found the file hard to get as well.

I am still trying to get this file and will update the bug if it is generated.

Comment 10 Ying Cui 2014-10-30 12:30:37 UTC
This kernel panic does not occur 100% of the time, even on the same machine.
Today we successfully set up our SVVP test host with build rhev-hypervisor6-6.6-20141030.0.iso, so I am removing the Testblocker keyword.

But this bug remains urgent and still needs our attention.

Comment 11 Fabian Deutsch 2014-10-30 13:26:36 UTC
(In reply to Ying Cui from comment #9)
> Test 10 times to get this /run/initramfs/rdsosreport.txt by rd.debug rdshell.
> but this txt is hard to generate, I also confirmed with cshao he tested in
> bug 1152948 and hard to get the txt too.
> 
> I continue trying to get this txt now, I will update the bug if it is
> generated.

Harald, we noticed that it sometimes takes several (up to 20) tries to get an rdsosreport on RHEL 7. Is this a known issue or rather unusual?

Comment 14 Ying Cui 2014-11-03 09:42:51 UTC
Tested more than 20 times; still cannot generate /run/initramfs/rdsosreport.txt.
Attached the console output log (console_log_output.log), captured after removing quiet and adding the rd.debug rdshell arguments.

dracut:/# blkid
/dev/sda1: UUID="94a841aa-f6cf-4b45-ba70-2ee88af6ed9d" TYPE="ext4" 
/dev/sda2: LABEL="RootBackup" UUID="14ac0fec-e950-4bda-86c5-8941799a8ee2" TYPE="ext2" 
/dev/sda3: LABEL="RootUpdate" UUID="addc5468-442c-4f65-8fa6-8befbb3ae376" TYPE="ext2" 
/dev/sda4: UUID="mT5SLe-09vd-gxU8-FyWr-tPB3-YlEf-19VluH" TYPE="LVM2_member" 
/dev/mapper/35000c5004b5b6271p1: UUID="94a841aa-f6cf-4b45-ba70-2ee88af6ed9d" TYPE="ext4" 
/dev/mapper/35000c5004b5b6271p2: LABEL="RootBackup" UUID="14ac0fec-e950-4bda-86c5-8941799a8ee2" TYPE="ext2" 
/dev/mapper/35000c5004b5b6271p3: LABEL="RootUpdate" UUID="addc5468-442c-4f65-8fa6-8befbb3ae376" TYPE="ext2" 
/dev/mapper/35000c5004b5b6271p4: UUID="mT5SLe-09vd-gxU8-FyWr-tPB3-YlEf-19VluH" TYPE="LVM2_member" 
/dev/mapper/HostVG-Swap: LABEL="SWAP" UUID="efb0e743-d171-46c5-8e76-cd201e7fc0b6" TYPE="swap" 
/dev/mapper/HostVG-Config: LABEL="CONFIG" UUID="7f9f91f0-59d1-4aee-a6d9-8212e568a686" TYPE="ext4" 
/dev/mapper/HostVG-Logging: LABEL="LOGGING" UUID="d24ff0f0-ecf3-4557-9809-01e07cbb59fd" TYPE="ext4" 
/dev/mapper/HostVG-Data: LABEL="DATA" UUID="b5b8c5c9-ae0e-48e0-a72b-61448009a477" TYPE="ext4"
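
The listing above shows each sda partition a second time under /dev/mapper/35000c5004b5b6271p*, with identical UUIDs. A minimal sketch for spotting such duplicates in blkid output follows; the function names are hypothetical, and the sample is three lines copied from the listing.

```python
import re
from collections import defaultdict

# One blkid line looks like: /dev/sda2: LABEL="RootBackup" UUID="..." TYPE="ext2"
BLKID_LINE = re.compile(r'^(?P<dev>\S+):\s+(?P<attrs>.*)$')
ATTR = re.compile(r'(\w+)="([^"]*)"')


def parse_blkid(text):
    """Map each device path to a dict of its KEY="value" attributes."""
    entries = {}
    for line in text.splitlines():
        match = BLKID_LINE.match(line.strip())
        if match:
            entries[match.group("dev")] = dict(ATTR.findall(match.group("attrs")))
    return entries


def devices_by_uuid(entries):
    """Return only the UUIDs that are reported by more than one device."""
    groups = defaultdict(list)
    for dev, attrs in entries.items():
        if "UUID" in attrs:
            groups[attrs["UUID"]].append(dev)
    return {uuid: devs for uuid, devs in groups.items() if len(devs) > 1}


sample = '''/dev/sda2: LABEL="RootBackup" UUID="14ac0fec-e950-4bda-86c5-8941799a8ee2" TYPE="ext2"
/dev/mapper/35000c5004b5b6271p2: LABEL="RootBackup" UUID="14ac0fec-e950-4bda-86c5-8941799a8ee2" TYPE="ext2"
/dev/mapper/HostVG-Swap: LABEL="SWAP" UUID="efb0e743-d171-46c5-8e76-cd201e7fc0b6" TYPE="swap"'''

for uuid, devs in devices_by_uuid(parse_blkid(sample)).items():
    print(uuid, devs)
```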

Comment 15 Ying Cui 2014-11-03 09:45:21 UTC
Created attachment 953040 [details]
console_log_output.log

console_log_output.log for comment 14.

Comment 16 Fabian Deutsch 2014-11-03 20:35:08 UTC
Hey Harald, could someone familiar with dracut help us with this bug?

(In reply to Fabian Deutsch from comment #11)
> (In reply to Ying Cui from comment #9)
> > Test 10 times to get this /run/initramfs/rdsosreport.txt by rd.debug rdshell.
> > but this txt is hard to generate, I also confirmed with cshao he tested in
> > bug 1152948 and hard to get the txt too.
> > 
> > I continue trying to get this txt now, I will update the bug if it is
> > generated.
> 
> Harald, we noticed that it sometimes takes several (up to 20) tries to get
> an rdsosreport on RHEL 7. Is this a known issue or rather unusual?

rdsosreport is not available on RHEL 6.6.

Comment 18 Harald Hoyer 2014-11-04 14:33:41 UTC
Command line: root=live:LABEL=Root ro rootfstype=auto rootflags=ro ksdevice=bootif RD_NO_LVM  crashkernel=128M lang= rd_NO_LUKS max_loop=256 kssendmac rd_NO_MD console=ttyS0,115200n8 rd.debug rdshell elevator=dM


General procedure for debugging RHEL-6 dracut problems:
http://people.redhat.com/harald/dracut-rhel6.html#troubleshooting

Please attach:
/init.log

According to comment 14, there is no partition with LABEL=Root corresponding to the "root=live:LABEL=Root" kernel command line.
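
This comparison can be sketched as a small check: take the label named in the root= argument and look for it among the LABEL values that blkid reports. The function names below are hypothetical, and the sample data is abridged from the command line above and the blkid output in comment 14.

```python
import re


def root_label(cmdline):
    """Return the label from a root=[live:]LABEL=<name> argument, or None."""
    for arg in cmdline.split():
        match = re.fullmatch(r'root=(?:live:)?LABEL=(.+)', arg)
        if match:
            return match.group(1)
    return None


def labels_in_blkid(text):
    """Collect every LABEL="..." value that appears in blkid output."""
    return set(re.findall(r'\bLABEL="([^"]*)"', text))


cmdline = "root=live:LABEL=Root ro rootfstype=auto rootflags=ro"
blkid_output = '''/dev/sda2: LABEL="RootBackup" UUID="14ac0fec-e950-4bda-86c5-8941799a8ee2" TYPE="ext2"
/dev/sda3: LABEL="RootUpdate" UUID="addc5468-442c-4f65-8fa6-8befbb3ae376" TYPE="ext2"
/dev/mapper/HostVG-Config: LABEL="CONFIG" UUID="7f9f91f0-59d1-4aee-a6d9-8212e568a686" TYPE="ext4"'''

wanted = root_label(cmdline)
if wanted not in labels_in_blkid(blkid_output):
    print("no device carries LABEL=" + wanted)
```

The match is exact on purpose: RootBackup and RootUpdate exist, but neither is the Root label that the command line asks for.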

Comment 19 Ying Cui 2014-11-05 11:00:17 UTC
Created attachment 953992 [details]
init.log

Command line: root=live:LABEL=Root ro rootfstype=auto rootflags=ro ksdevice=bootif RD_NO_LVM  crashkernel=128M lang= rd_NO_LUKS max_loop=256 kssendmac rd_NO_MD elevator=deadline rd_NO_DM rdshell rdinitdebug console=ttyS0,115200n8

init.log as attached.

Comment 20 Ying Cui 2014-11-05 11:06:17 UTC
=== snip init.log ====
+ [ -e /initqueue-timeout/*.sh ]
+ break
+ main_loop=81
+ [ 81 -gt 80 ]
+ flock -s 9
+ emergency_shell No root device "block:/dev/mapper/live-rw" found
+ set +e
+ [ No root device "block:/dev/mapper/live-rw" found = -n ]
+ _rdshell_name=dracut
+ wait_for_loginit
=== snip init.log ====

The blkid output is in comment 14.
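
Since init.log is several hundred kilobytes of shell trace, the relevant line can be pulled out mechanically. A minimal sketch, assuming the trace lines start with one or more "+" characters as in the snippet above; the function name `emergency_reasons` is hypothetical.

```python
import re


def emergency_reasons(trace):
    """Return the message passed to each emergency_shell call in a shell trace."""
    reasons = []
    for line in trace.splitlines():
        # Trace lines are prefixed with '+' (nested calls may use '++', '+++', ...).
        match = re.match(r'^\++\s+emergency_shell\s+(.*)$', line.strip())
        if match:
            reasons.append(match.group(1))
    return reasons


sample = '''+ main_loop=81
+ [ 81 -gt 80 ]
+ flock -s 9
+ emergency_shell No root device "block:/dev/mapper/live-rw" found
+ set +e'''

for reason in emergency_reasons(sample):
    print(reason)
```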

Comment 23 Fabian Deutsch 2014-11-06 16:12:23 UTC
Lowering the priority due to the lower reproducibility.

Comment 45 Fabian Deutsch 2015-01-05 15:57:17 UTC
A problem was found with the current solution: currently the wwid of the first device is always used, which is obviously wrong.

Comment 47 Ying Cui 2015-01-21 04:28:27 UTC
This bug is flagged for 3.5.0, so I verified it with the following versions:

# cat /etc/system-release
Red Hat Enterprise Virtualization Hypervisor release 7.0 (20150114.0.el7ev)
# rpm -q ovirt-node kernel
ovirt-node-3.2.1-4.el7.noarch
kernel-3.10.0-123.13.2.el7.x86_64

TUI clean install and TUI reinstall on hp-z600-03, 10 times, did not encounter this issue.
TUI clean install and TUI reinstall on ibm-x3650m3-02, 5 times, did not encounter this issue.
TUI clean install and TUI reinstall on dell-9010, 10 times, did not encounter this issue.

Test version:
rhev-hypervisor6-6.6-20150114.0
ovirt-node-3.2.1-4.el6.noarch
TUI clean install and TUI reinstall on hp-z600-03, 5 times, did not encounter this issue.
TUI clean install and TUI reinstall on ibm-x3650m3-02, 2 times, did not encounter this issue.
TUI clean install and TUI reinstall on dell-9010, 10+ times, did not encounter this issue.

Based on the above testing, this bug is fixed, so I am moving it to VERIFIED.

Comment 49 errata-xmlrpc 2015-02-11 21:05:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0160.html

