Bug 436351 - Running Xen hypervisor inside a fullyvirt guest, crashes the host
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Bill Burns
QA Contact: Martin Jenner
Keywords: Regression
Duplicates: 436354
Reported: 2008-03-06 13:01 EST by Daniel Berrange
Modified: 2008-05-21 11:11 EDT
Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Last Closed: 2008-05-21 11:11:32 EDT

Attachments:
Posted patch attached. (689 bytes, text/x-patch)
2008-03-12 09:11 EDT, Bill Burns

Description Daniel Berrange 2008-03-06 13:01:17 EST
Description of problem:
I installed a RHEL-5.1 fully virtualized guest on a RHEL-5.2 host, and changed
grub so that the *guest* boots kernel-xen plus the hypervisor.

Version-Release number of selected component (if applicable):
Host is running:  kernel-xen-2.6.18-84.el5
Guest is running: kernel-xen-2.6.18-53.1.14

How reproducible:
Always

Steps to Reproduce:
1. Install a RHEL-5.2 host
2. Install a RHEL-5.1 *fullvirt* guest 
3. Log in to the guest, install kernel-xen, and configure grub to boot it (i.e. the
Xen hypervisor)
4. Reboot the guest  

Actual results:
When the guest starts the Xen hypervisor, the host hypervisor panics and the host reboots

Expected results:
The guest can boot the Xen hypervisor and run paravirt guests

Additional info:
Comment 1 Chris Lalancette 2008-03-06 13:19:28 EST
Here's the actual stack trace:

(XEN) ----[ Xen-3.1.2-84.el5  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff828c801521bc>] missed_ticks+0x1c/0x40
(XEN) RFLAGS: 0000000000010256   CONTEXT: hypervisor
(XEN) rax: 000000000000019a   rbx: ffff8300c6ee9798   rcx: 0000000000000001
(XEN) rdx: 0000000000000000   rsi: 0000000060f583f0   rdi: ffff828c801cb558
(XEN) rbp: ffff8300c6ee9798   rsp: ffff8300c7df7e98   r8:  000008300abf4050
(XEN) r9:  ffff828c801cb540   r10: 0000000000000006   r11: ffff8300c6e16680
(XEN) r12: ffff828c801cb180   r13: 00000830300abf53   r14: ffff830000201a80
(XEN) r15: ffff830000021a00   cr0: 000000008005003b   cr4: 00000000000006f0
(XEN) cr3: 00000002258c4000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff8300c7df7e98:
(XEN)    ffff8300c6ee97e0 ffff828c8015220e ffff8300c6ee97e0 ffff8300c6fcd880
(XEN)    ffff828c801cb180 ffff828c80116ea1 0000000000000000 0000000000000002
(XEN)    ffff8300c7df7f28 ffff828c8022e880 ffff828c8022c880 ffff828c80115dd0
(XEN)    000000058e066754 ffff8300c6ee8080 000000058e066754 000000003b902534
(XEN)    00000000000000ff ffff828c8015cba8 ffff830000021a00 ffff830000201a80
(XEN)    00000000000000ff 000000003b902534 000000058e066754 00000000000f424a
(XEN)    0000000000000000 ffff828bfffff010 0000000000000000 6666666666666667
(XEN)    0000000000000000 0000000005f5e4e8 00000000000000d9 000000000000270f
(XEN)    0000000000000000 000000000000000b ffff8300001b8c21 000000000000e008
(XEN)    0000000000000006 ffff8300001e3ea0 0000000000000000 0000000000000180
(XEN)    0000000000000020 000000000e051e30 000000000e00d270 0000000000000002
(XEN)    ffff8300c6ee8080
(XEN) Xen call trace:
(XEN)    [<ffff828c801521bc>] missed_ticks+0x1c/0x40
Comment 2 Chris Lalancette 2008-03-06 13:20:04 EST
*** Bug 436354 has been marked as a duplicate of this bug. ***
Comment 3 Chris Lalancette 2008-03-06 13:24:34 EST
And the actual source code where this crash is:

(gdb) list *(missed_ticks+0x1c)
0xffff828c801521bc is in missed_ticks (vpt.c:53).
48
49          missed_ticks = NOW() - pt->scheduled;
50          if ( missed_ticks <= 0 )
51              return;
52
53          missed_ticks = missed_ticks / (s_time_t) pt->period + 1;
54          if ( missed_ticks > 1000 )
55          {
56              /* TODO: Adjust guest time together */
57              pt->pending_intr_nr++;

Sigh.  I wonder if pt->period was somehow -1, causing a division by 0?

Chris Lalancette
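
The suspected failure mode can be sketched in isolation. This is a minimal illustration only, not the attached patch and not the actual Xen source: the struct `periodic_time_sketch`, its field types, and the function `missed_ticks_guarded` are hypothetical names introduced here, and the function returns the tick count instead of updating the timer the way the real `missed_ticks` does. It assumes the divisor on vpt.c:53 ends up as zero; on x86 an integer division by zero raises a divide error exception (vector 0).

```c
#include <stdint.h>

typedef int64_t s_time_t;   /* signed time in nanoseconds (assumed to match Xen's type) */

/* Hypothetical stand-in for the periodic timer fields used on vpt.c:49-53. */
struct periodic_time_sketch {
    s_time_t scheduled;     /* time at which the next tick was due */
    uint64_t period;        /* tick period; the field type is an assumption */
};

/*
 * Return how many ticks have been missed at time `now`.
 * Returns 0 when no tick is overdue, and also when the period is not
 * positive, so the division below can never see a zero divisor.
 */
static s_time_t missed_ticks_guarded(const struct periodic_time_sketch *pt,
                                     s_time_t now)
{
    s_time_t missed = now - pt->scheduled;

    if (missed <= 0)
        return 0;

    /* Guard added for illustration: refuse a zero or negative divisor. */
    if ((s_time_t)pt->period <= 0)
        return 0;

    /* Same arithmetic as vpt.c:53. */
    return missed / (s_time_t)pt->period + 1;
}
```

With `scheduled = 0`, `period = 10` and `now = 25`, the function yields 3 (two full periods elapsed, plus one). With `period = 0` it returns 0 instead of trapping.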
Comment 4 Chris Lalancette 2008-03-06 13:49:21 EST
Somehow I lost the last part of the crash. The end of the crash says this:

(XEN) Xen call trace:
(XEN)    [<ffff828c801521bc>] missed_ticks+0x1c/0x40
(XEN)    [<ffff828c8015220e>] pt_timer_fn+0x2e/0xa0
(XEN)    [<ffff828c80116ea1>] timer_softirq_action+0x81/0xe0
(XEN)    [<ffff828c80115dd0>] do_softirq+0x70/0x80
(XEN)    [<ffff828c8015cba8>] svm_process_softirqs+0x8/0xd
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) FATAL TRAP: vector = 0 (divide error)
(XEN) [error_code=0000]
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

So it does look like a divide by 0 error.

Chris Lalancette
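
Each call trace line above has the shape `[<address>] symbol+offset/size`, so the faulting instruction sits 0x1c bytes into the 0x40-byte `missed_ticks` function. As a small aid for reading such traces, here is a hedged sketch of a line parser. The struct `xen_frame` and the function `parse_xen_frame` are hypothetical names introduced for illustration and are not part of Xen or of any tool mentioned in this report.

```c
#include <stdio.h>

/* One decoded frame from a "(XEN)    [<addr>] symbol+0xOFF/0xSIZE" line. */
struct xen_frame {
    unsigned long long addr;   /* absolute address of the frame */
    char symbol[64];           /* function name */
    unsigned int offset;       /* byte offset into the function */
    unsigned int size;         /* total size of the function in bytes */
};

/*
 * Parse a single call trace line from the hypervisor console log.
 * Returns 1 if all four fields were found, 0 otherwise (for example
 * on register dump or banner lines).
 */
static int parse_xen_frame(const char *line, struct xen_frame *out)
{
    int n = sscanf(line, " (XEN) [<%llx>] %63[^+]+%x/%x",
                   &out->addr, out->symbol, &out->offset, &out->size);
    return n == 4;
}
```

Feeding it the first frame of the trace gives the symbol `missed_ticks`, offset 0x1c and size 0x40. Lines such as `(XEN) Panic on CPU 2:` are rejected because they carry no bracketed address.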
Comment 5 Bill Burns 2008-03-11 08:55:09 EDT
Setting flags, assigning to Bill.
Comment 6 RHEL Product and Program Management 2008-03-11 08:58:55 EDT
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.
Comment 7 Bill Burns 2008-03-12 09:11:24 EDT
Created attachment 297756
Posted patch attached.

This patch fixes the crash; the hypervisor and dom0 now run fine in a fully
virtualized guest.
Comment 9 Bill Burns 2008-03-12 16:50:51 EDT
Set devel ack for myself.
Comment 12 Don Zickus 2008-03-19 12:24:50 EDT
The fix is in kernel-2.6.18-86.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 15 errata-xmlrpc 2008-05-21 11:11:32 EDT
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html
