Bug 433952

Summary: kernel-xenU oops running Oracle Enterprise Manager 10g
Product: Red Hat Enterprise Linux 4
Reporter: Stanislav Polasek <stanislav.polasek>
Component: kernel-xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Jenner <mjenner>
Severity: medium
Docs Contact:
Priority: low
Version: 4.6
CC: simon
Target Milestone: rc
Target Release: ---
Hardware: i386
OS: Linux
Whiteboard:
Fixed In Version: RHEL 5.2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-07-02 20:05:56 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Configuration of virtual machine (flags: none)

Description Stanislav Polasek 2008-02-22 11:33:54 UTC
Description of problem:
kernel oops

Version-Release number of selected component (if applicable):
2.6.9-67.0.4.ELxenU (tested also on 2.6.9-67.ELxenU, the same results)

How reproducible:
Always

Steps to Reproduce:
1. Install RHEL 4.6 with the xenU kernel as a guest (host: RHEL 5.1, kernel 2.6.18-53.1.6.el5xen)
2. Run OEM
3. After a few minutes the kernel oopses
  
Actual results:
Kernel oops; the machine freezes.

Expected results:
No oops

Additional info:
The same configuration works without problems as an HVM guest.

OOPS:

Feb 22 12:13:31 smbxx10a kernel: Unable to handle kernel paging request at virtual address 371e4f44
Feb 22 12:13:31 smbxx10a kernel:  printing eip:
Feb 22 12:13:31 smbxx10a kernel: c01493b1
Feb 22 12:13:31 smbxx10a kernel: 2caad000 -> *pde = 00000000:4b929027
Feb 22 12:13:31 smbxx10a kernel: 2cafe000 -> *pme = 00000000:00000000
Feb 22 12:13:31 smbxx10a kernel: Oops: 0002 [#1]
Feb 22 12:13:31 smbxx10a kernel: SMP
Feb 22 12:13:31 smbxx10a kernel: Modules linked in: dlm(U) cman(U) md5 ipv6 xennet dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod xenblk sd_mod scsi_mod
Feb 22 12:13:31 smbxx10a kernel: CPU:    0
Feb 22 12:13:31 smbxx10a kernel: EIP:    0061:[<c01493b1>]    Not tainted VLI
Feb 22 12:13:31 smbxx10a kernel: EFLAGS: 00010206   (2.6.9-67.0.4.ELxenU)
Feb 22 12:13:31 smbxx10a kernel: EIP is at copy_page_range+0x491/0x52c
Feb 22 12:13:31 smbxx10a kernel: eax: 371e4f40   ebx: 04a7d065   ecx: ecb17a40   edx: 04a7d045
Feb 22 12:13:31 smbxx10a kernel: esi: 00000000   edi: c17b2a40   ebp: 00040800   esp: ecba7ec0
Feb 22 12:13:31 smbxx10a kernel: ds: 007b   es: 007b   ss: 0068
Feb 22 12:13:31 smbxx10a kernel: Process modclusterd (pid: 2204, threadinfo=ecba7000 task=ec4074a0)
Feb 22 12:13:31 smbxx10a kernel: Stack: 53932067 00000000 371e4f40 ec295f40 d5166df8 ecafcdf8 00000001 b7ff0000
Feb 22 12:13:31 smbxx10a kernel:        b7fe8000 d4da7010 ecaad010 ecb17a84 dbe9e4ec eca89cd4 ecb17a40 c011a6b8
Feb 22 12:13:31 smbxx10a kernel:        ecb17a40 ec40f580 dbe9e4ec 00000a05 00000000 dbe9e508 dbe9e510 dbe9e4f8
Feb 22 12:13:31 smbxx10a kernel: Call Trace:
Feb 22 12:13:31 smbxx10a kernel:  [<c011a6b8>] copy_mm+0x2d9/0x396
Feb 22 12:13:31 smbxx10a kernel:  [<c011b26a>] copy_process+0x6b5/0xb0b
Feb 22 12:13:31 smbxx10a kernel:  [<c011b7ad>] do_fork+0x8a/0x16b
Feb 22 12:13:31 smbxx10a kernel:  [<c0105cff>] sys_clone+0x24/0x28
Feb 22 12:13:31 smbxx10a kernel:  [<c010734f>] syscall_call+0x7/0xb
Feb 22 12:13:31 smbxx10a kernel: Code: 89 f9 83 e2 df f6 c4 80 74 03 8b 4f 0c f0 ff 41 04 8b 4c 24 40 ff 81 80 00 00 00 f6 47 10 01 74 06 ff 81 84 00 00 00 8b 44 24 08 <89> 70 04 89 10 f0 ff 47 08 81 44 24 20 00 10 00 00 8b 54 24 1c
Feb 22 12:13:31 smbxx10a kernel:  <0>Fatal exception: panic in 5 seconds
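
As a reading aid for the trace above, here is a minimal Python sketch. It assumes the usual i386 page-fault error-code layout (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper name decode_oops_code is hypothetical. It decodes the "Oops: 0002" value and checks that the reported faulting address equals eax + 4, which is where the marked instruction bytes "<89> 70 04" (a store of %esi to 4(%eax)) would write.

```python
# Bits of the i386 page-fault error code printed after "Oops:".
# Assumption: the standard layout used by the 2.6-era fault handler.
PF_PRESENT = 0x1  # 0: page not present, 1: protection violation
PF_WRITE = 0x2    # 0: read access,      1: write access
PF_USER = 0x4     # 0: kernel mode,      1: user mode


def decode_oops_code(code_hex):
    """Decode the hex error code from an 'Oops: NNNN' line (hypothetical helper)."""
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & PF_PRESENT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


# Values copied from the oops above.
info = decode_oops_code("0002")
eax = 0x371E4F40
reported_fault = 0x371E4F44

# "89 70 04" stores %esi at displacement 4 from %eax.
computed_fault = eax + 0x04

print(info)
print(hex(computed_fault), computed_fault == reported_fault)
```

Under these assumptions the oops is a kernel-mode write to an unmapped page at 371e4f44, which suggests that eax held an invalid pointer when copy_page_range tried to store through it.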

Comment 1 Stanislav Polasek 2008-02-22 11:33:54 UTC
Created attachment 295617 [details]
Configuration of virtual machine

Comment 2 Chris Lalancette 2008-02-22 20:01:12 UTC
Interesting.  What kind of test is oem, out of curiosity?  I assume some
database workload, but do you know whether it is I/O, memory, or CPU bound?

Chris Lalancette

Comment 3 Stanislav Polasek 2008-02-22 21:35:42 UTC
It's the Oracle Enterprise Manager 10g, sorry :) In fact, it's just the 
application part; the database is stored on another host. It runs for a few 
minutes, and then the kernel oopses. Curiously, the oops comes when oem checks 
for active instances of itself by running oemctl status.

Comment 4 Chris Lalancette 2008-02-25 19:54:11 UTC
Actually, now that you mention it, this crash looks surprisingly similar to the
stack traces we see when trying to run i386 PV guests on an x86_64 RHEL-5.1 dom0. 
Is that what you are trying here?  This combination is currently known not to work.

Chris Lalancette

Comment 5 Stanislav Polasek 2008-03-03 06:39:36 UTC
Yes, that's exactly what I am doing. It worked for me in a lot of other
situations, so I completely forgot it's a tech preview. Sorry.

Comment 6 Bill Burns 2008-07-02 20:05:56 UTC
This should work on RHEL 5.2. It is still a tech preview, but only because of
issues with save/restore and migration. Running an i386 PV guest on x86_64 works
pretty well otherwise. Closing this bug as fixed in current release.