Bug 595440

Summary: kernel panic for RHEL6 Xen guest/boot.iso on RHEL5.5 dom0
Product: Red Hat Enterprise Linux 6 Reporter: Alexander Todorov <atodorov>
Component: kernel Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED DUPLICATE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: high    
Version: 6.0 CC: drjones, mjenner, rwilliam, syeghiay
Target Milestone: beta   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-05-28 15:03:39 UTC Type: ---
Attachments:
Description Flags
The full console.log from the domU none

Description Alexander Todorov 2010-05-24 15:51:30 UTC
Description of problem:
Trying to start RHEL6 snap5 (0523.0) as an FV Xen guest on sun-x4440-01.rhts.eng.bos.redhat.com (RHEL5.5 GA, x86_64) results in a kernel panic in the guest.

Version-Release number of selected component (if applicable):
host: RHEL 5.5 GA, x86_64
guest: RHEL6.0-20100523.0/Server/x86_64, kernel 2.6.32-28

How reproducible:
Always on that particular host

Steps to Reproduce:
1. Using virt-manager, create a new guest.
2. Select Linux/RHEL6 for OS type/variant and choose to boot from boot.iso.
3. Proceed with the default settings.
  
Actual results:
Guest shows the initial GRUB menu. When the installer boots, the kernel panics.

Expected results:
Guest boots into anaconda. 

Additional info:
Looks similar to bug 570496. Also on the guest console I see this timeout: 
XENBUS: Waiting for devices to initialise: 295s...290s...285s...280s...275s...270s...265s...260s...255s...250s...245s...240s...235s...230s...225s...220s...215s...210s...205s...200s...195s...190s...185s...180s...175s...170s...165s...160s...155s...150s...145s...140s...135s...130s...125s...120s...115s...110s...105s...100s...95s...90s...85s...80s...75s...70s...65s...60s...55s...50s...45s...40s...35s...30s...25s...20s...15s...10s...5s...0s...

which is related to bug 396621


---- the last few console lines ---
XENBUS: Waiting for devices to initialise: 295s...290s...285s...280s...275s...270s...265s...260s...255s...250s...245s...240s...235s...230s...225s...220s...215s...210s...205s...200s...195s...190s...185s...180s...175s...170s...165s...160s...155s...150s...145s...140s...135s...130s...125s...120s...115s...110s...105s...100s...95s...90s...85s...80s...75s...70s...65s...60s...55s...50s...45s...40s...35s...30s...25s...20s...15s...10s...5s...0s...
BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
IP: [<ffffffff814b3786>] klist_next+0x26/0xf0
PGD 0 
Oops: 0000 [#1] SMP 
last sysfs file: 
CPU 0 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.32-28.el6.x86_64 #1 HVM domU
RIP: 0010:[<ffffffff814b3786>]  [<ffffffff814b3786>] klist_next+0x26/0xf0
RSP: 0018:ffff88003caf5e30  EFLAGS: 00010286
RAX: 0000000000000001 RBX: ffff88003caf5e70 RCX: ffffffff812e40b0
RDX: 0000000000000000 RSI: ffff88003caf5e70 RDI: 0000000000000070
RBP: ffff88003caf5e60 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88003caf5e70
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000026780
FS:  0000000000000000(0000) GS:ffff880001e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000090 CR3: 0000000001001000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88003caf4000, task ffff88003caf34a0)
Stack:
 000000000000012c 0000000000000000 ffff88003caf5e70 ffffffff812e40b0
<0> 0000000000000000 0000000000026780 ffff88003caf5ea0 ffffffff8132778f
<0> 0000000000000070 0000000000000000 000000000000012c 00000000fffb7516
Call Trace:
 [<ffffffff812e40b0>] ? print_device_status+0x0/0xa0
 [<ffffffff8132778f>] bus_for_each_dev+0x6f/0x90
 [<ffffffff812e46fc>] wait_for_devices+0xbc/0x120
 [<ffffffff818fab7c>] ? boot_wait_for_devices+0x0/0x22
 [<ffffffff818fab9a>] boot_wait_for_devices+0x1e/0x22
 [<ffffffff8100a04c>] do_one_initcall+0x3c/0x1d0
 [<ffffffff818c8818>] kernel_init+0x246/0x29c
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff818c85d2>] ? kernel_init+0x0/0x29c
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
Code: c9 c3 0f 1f 00 55 48 89 e5 48 83 ec 30 48 89 5d d8 48 89 fb 4c 89 6d e8 4c 89 65 e0 4c 89 75 f0 4c 89 7d f8 4c 8b 6b 08 48 8b 3f <4c> 8b 77 20 e8 a1 a5 01 00 4d 85 ed 0f 84 88 00 00 00 4d 8b 65 
RIP  [<ffffffff814b3786>] klist_next+0x26/0xf0
 RSP <ffff88003caf5e30>
CR2: 0000000000000090
---[ end trace 273c9f3df0cceb32 ]---
Kernel panic - not syncing: Fatal exception
Pid: 1, comm: swapper Tainted: G      D    2.6.32-28.el6.x86_64 #1
Call Trace:
 [<ffffffff814cae34>] panic+0x78/0x137
 [<ffffffff814ceedc>] oops_end+0xdc/0xf0
 [<ffffffff8104344b>] no_context+0xfb/0x260
 [<ffffffff814d0afa>] ? atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff810436d5>] __bad_area_nosemaphore+0x125/0x1e0
 [<ffffffff810437a3>] bad_area_nosemaphore+0x13/0x20
 [<ffffffff814d094f>] do_page_fault+0x2ef/0x3e0
 [<ffffffff814ce235>] page_fault+0x25/0x30
 [<ffffffff812e40b0>] ? print_device_status+0x0/0xa0
 [<ffffffff814b3786>] ? klist_next+0x26/0xf0
 [<ffffffff812e40b0>] ? print_device_status+0x0/0xa0
 [<ffffffff8132778f>] bus_for_each_dev+0x6f/0x90
 [<ffffffff812e46fc>] wait_for_devices+0xbc/0x120
 [<ffffffff818fab7c>] ? boot_wait_for_devices+0x0/0x22
 [<ffffffff818fab9a>] boot_wait_for_devices+0x1e/0x22
 [<ffffffff8100a04c>] do_one_initcall+0x3c/0x1d0
 [<ffffffff818c8818>] kernel_init+0x246/0x29c
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff818c85d2>] ? kernel_init+0x0/0x29c
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
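
A rough cross-check of the oops above, as a sketch only: the instruction marked with <..> in the Code: line is decoded by hand and its effective address is compared with CR2. The helper names (read_register, faulting_bytes, decode_disp8_load) are made up for this illustration, and the decoder only understands the single form seen here (REX prefix, opcode 8b, one-byte displacement, no SIB byte). It is not a general disassembler.

```python
import re

# Lines copied verbatim from the oops above; only the fields the check needs.
OOPS = """\
RDX: 0000000000000000 RSI: ffff88003caf5e70 RDI: 0000000000000070
CR2: 0000000000000090 CR3: 0000000001001000 CR4: 00000000000006f0
Code: c9 c3 0f 1f 00 55 48 89 e5 48 83 ec 30 48 89 5d d8 48 89 fb 4c 89 6d e8 4c 89 65 e0 4c 89 75 f0 4c 89 7d f8 4c 8b 6b 08 48 8b 3f <4c> 8b 77 20 e8 a1 a5 01 00 4d 85 ed 0f 84 88 00 00 00 4d 8b 65
"""

# Register names in x86-64 encoding order, spelled as the oops prints them.
REG_NAMES = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI",
             "R08", "R09", "R10", "R11", "R12", "R13", "R14", "R15"]


def read_register(text, name):
    """Return the 64-bit value printed after 'NAME: ' in the register dump."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})", text)
    if match is None:
        raise ValueError(f"register {name} not found")
    return int(match.group(1), 16)


def faulting_bytes(text):
    """Return the code bytes starting at the <..> marker of the Code: line."""
    for line in text.splitlines():
        if line.startswith("Code:"):
            tokens = line.split()[1:]
            start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
            return [int(tok.strip("<>"), 16) for tok in tokens[start:]]
    raise ValueError("no Code: line found")


def decode_disp8_load(code):
    """Decode 'REX 8b modrm disp8' (a 64-bit load) into (base register, disp)."""
    rex, opcode, modrm, disp = code[:4]
    if (rex & 0xF0) != 0x40 or opcode != 0x8B:
        raise ValueError("not a REX-prefixed 8b load")
    mod, rm = modrm >> 6, modrm & 0x7
    if mod != 1 or rm == 4:
        raise ValueError("only the disp8, no-SIB form is handled")
    base = rm | ((rex & 0x1) << 3)  # REX.B extends the base register number
    if disp >= 0x80:                # disp8 is a signed byte
        disp -= 0x100
    return REG_NAMES[base], disp


base_name, displacement = decode_disp8_load(faulting_bytes(OOPS))
address = read_register(OOPS, base_name) + displacement
print(f"{base_name} + {displacement:#x} = {address:#x}, "
      f"CR2 = {read_register(OOPS, 'CR2'):#x}")
```

With the values from this report, the marked bytes 4c 8b 77 20 decode to a load from RDI + 0x20, and RDI is 0x70, so the access lands on 0x90, which is the address reported in CR2. A value as small as 0x70 in a pointer register is what one would expect if a member was read through a NULL structure pointer, but that part is an inference from the dump, not something the log states.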

Comment 1 Alexander Todorov 2010-05-24 15:52:47 UTC
Created attachment 416161 [details]
The full console.log from the domU

Comment 2 Andrew Jones 2010-05-24 16:26:07 UTC
This is a known issue, and a patch has already been sent to the list to fix it. It should be fixed in the next build.

Comment 3 Alexander Todorov 2010-05-28 14:16:41 UTC
FYI: This is still present in snap #5 - 0527.0