Bug 458808

Summary: xen crash during boot, auto reboot, xen crash during boot, auto reboot
Product: [Fedora] Fedora
Reporter: John Summerfield <debian>
Component: xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: medium
Version: 10
CC: berrange, linux.ninja1, markmc, virt-maint
Target Milestone: ---
Keywords: Reopened, Triaged
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-12-18 01:18:02 EST

Description John Summerfield 2008-08-12 09:55:53 EDT
Description of problem:
xen tries to make an illegal change to CR0.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install F10a
2. Boot xen
  
Actual results:

Crash as above

Expected results:
xen to complete initialisation and progress to starting Dom0 Linux

Additional info:
(XEN) ******* However it can introduce SIGNIFICANT latencies and affect
(XEN) ******* timekeeping. It is NOT recommended for production use!
(XEN) **********************************************
(XEN) 3... 2... 1...
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 104kB init memory.
hello, world!
xen_init_pt:
xen_init_pt starting
xen_cr3: 0. xen_cur_cr3: 0
page1: ffffffff81837000
addr1: 7983a067
addr1, page2: 7983a000, ffffffff8183a000
addr2: 79839067
addr2,page2: 79839000, ffffffff81839000
l4pgt: ffffffff80201000
xen_init_pt going change init_level4_pgt
pointed l4 to l3kpgt
l3pgt: ffffffff80204000
pud index: 1fe
writing to ffffffff80204ff0
new pud: 78209063
pointed l3 to l2kpgt
l2pgt: ffffffff80209000
copied l2kpgt from page
xen_init_pt: going to pin
xen_init_pt going to make pages readonly
xen_init_pt going to pin pgds
kernel:
user:
xen_init_pt: set_pgd:
xen_init_pt: returning.
going to load new pagetable:
loaded new pagetable
xen_init_pt returned.
(XEN) traps.c:1838:d0 Attempt to change unmodifiable CR0 flags.
(XEN) traps.c:413:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.2.0  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff8020b385>]
(XEN) RFLAGS: 0000000000000206   CONTEXT: guest
(XEN) rax: 00000000c0050033   rbx: 0000000000000000   rcx: 0000000000000000
(XEN) rdx: ffffffff806925b0   rsi: 0000000000000000   rdi: 00000000c0050033
(XEN) rbp: ffffffff80639e38   rsp: ffffffff80639df0   r8:  0000000000000002
(XEN) r9:  0000000000000070   r10: 0000000000000000   r11: ffffffff80639f2c
(XEN) r12: ffffffff80639ef4   r13: 0000000000000210   r14: 0000000000000070
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000026b0
(XEN) cr3: 0000000078201000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff80639df0:
(XEN)    0000000000000000 ffffffff80639f2c 0000000000000000 ffffffff8020b385
(XEN)    000000010000e030 0000000000010006 ffffffff80639e38 000000000000e02b
(XEN)    ffffffff80639ef4 ffffffff80639e88 ffffffff8021c2e1 0000000000000000
(XEN)    0000000000000000 ffffffff80639f38 00000000804a8a5f 0000000000000008
(XEN)    0000000000000000 ffffffff80639ef4 0000000000000210 ffffffff80639f18
(XEN)    ffffffff80645ee1 0000000000074110 0000000000000001 00000000000003e0
(XEN)    00000000805e1070 0000000000000000 ffffffff80639f18 ffffffff80256190
(XEN)    ffffffff80639f18 ffffffff80256aeb 0000000000000000 ffffffff80639f18
(XEN)    00000000806514f7 ffffffff80639f34 ffffffff80639f30 ffffffff80639f2c
(XEN)    ffffffff80639f28 ffffffff80639f58 ffffffff8064594b 0000000000000000
(XEN)    0000302400000000 ffffffff80639fb0 ffffffff80667390 ffffffffffffffff
(XEN)    0000000000000000 ffffffff80639f98 ffffffff80642f91 ffffffff80639f88
(XEN)    ffffffff80250b5c 0000000000000000 ffffffff80667390 ffffffffffffffff
(XEN)    0000000000000000 ffffffff80639fd8 ffffffff8063ba43 ffffffff80639fc8
(XEN)    ffffffff80669840 0000000000c33918 0000000000000000 0000000000000000
(XEN)    0000000000000000 ffffffff80639ff8 ffffffff80641cfc 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 00cf9b000000ffff
(XEN)    00af9b000000ffff 00cf93000000ffff 00cffb000000ffff 00cff3000000ffff
(XEN)    00affb000000ffff 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) Debugging connection not set up.
(XEN) Domain 0 crashed: rebooting machine in 5 seconds. 


I suggest this should be a blocker for F10.

Hardware: HP DC7700, with this CPU:
[summer@potoroo Documents]$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU          6300  @ 1.86GHz
stepping	: 6
cpu MHz		: 1861.997
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 3723.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

x2 (the entry above is repeated for the second core).

I'm running 64-bit.
Comment 1 John Summerfield 2008-08-12 09:59:02 EDT
I suppose I should mention this:
title Fedora (2.6.26-0.1.rc6.git2.fc10.x86_64.xen)
        root (hd0,4)
        kernel /xen.gz-2.6.26-0.1.rc6.git2.fc10.x86_64.xen com1=115200,8n1 console=com1 `xencons=ttyS sync_console
#       kernel /xen.gz-2.6.26-0.1.rc6.git2.fc10.x86_64.xen vga=text-80x60 com1=115200,8n1
        module /vmlinuz-2.6.26-0.1.rc6.git2.fc10.x86_64.xen ro root=/dev/VolGroup00/LogVol00 vga=794 video=vesafb selinux=0 console=ttyS0,115200n8 console=tty
        savedefault
        module /initrd-2.6.26-0.1.rc6.git2.fc10.x86_64.xen.img

so you know I'm using the latest (atm) version.
Comment 2 Mark McLoughlin 2008-08-12 10:47:24 EDT

*** This bug has been marked as a duplicate of bug 437930 ***
Comment 3 John Summerfield 2008-08-13 03:24:03 EDT
I am reopening this as I don't see that it's related to 437930.

This is xen, not the Linux kernel. It's not getting to the kernel.

This is F9 no more. 437930 is supposed to be fixed RSN and before F10.
Comment 4 Daniel Berrange 2008-08-13 04:39:20 EDT
Whether it's crashing in the HV or Dom0 is kinda irrelevant. We have no Dom0 kernel in Fedora with which to debug / diagnose this problem. If you can identify an upstream changeset which fixes it, by all means let us know & we'll patch the hypervisor with it, but aside from that we're not putting any work into BZ reports against the HV / Dom0 until we have a working pv_ops Dom0 kernel present in Fedora again.
Comment 5 Mark McLoughlin 2008-08-13 04:45:47 EDT
(In reply to comment #3)
> I am reopening this as I don't see that it's related to 437930.
> 
> This is xen, not the Linux kernel. It's not getting to the kernel.

What makes you think that?

(XEN) traps.c:1838:d0 Attempt to change unmodifiable CR0 flags.
(XEN) traps.c:413:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]

That suggests the linux Dom0 guest is trying to illegally modify CR0 and crashing.
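
One way to check this against the register dump in the description is to compare the two CR0-like values it contains. The sketch below XORs the current cr0 value (0x8005003b) with the value held in rax and rdi (0xc0050033). It assumes that rax/rdi is the value Dom0 was trying to write, which the dump itself does not state. The bit names follow the x86 CR0 layout, and `changed_flags` is a hypothetical helper name used only for illustration.

```python
# CR0 bit positions as defined by the x86 architecture.
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}


def changed_flags(current, attempted):
    """Return the names of the CR0 flags that differ between two values."""
    diff = current ^ attempted
    return [name for bit, name in sorted(CR0_BITS.items()) if diff & (1 << bit)]


current_cr0 = 0x8005003B    # "cr0:" field in the Xen register dump
attempted_cr0 = 0xC0050033  # rax/rdi in the dump; ASSUMED to be the value being written

flags = changed_flags(current_cr0, attempted_cr0)
print(flags)  # ['TS', 'CD']
```

Under that assumption the write would clear TS and set CD (cache disable). As far as I understand Xen's PV CR0 emulation, TS is the only flag a PV guest may toggle, so CD would be the flag being rejected. That reading is an inference from the dump, not something confirmed in this report.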

Until recently it used to get to "Xen is relinquishing VGA console" before crashing, so something has changed here, but since we know it's going to crash at *some* point, it's not worth debugging further.

> This is F9 no more. 437930 is supposed to be fixed RSN and before F10.

No progress has been made upstream, so we're not going to have Dom0 support in F10. See e.g.:

  https://www.redhat.com/archives/fedora-xen/2008-July/msg00048.html

*** This bug has been marked as a duplicate of bug 437930 ***
Comment 6 Mark McLoughlin 2008-08-13 04:47:57 EDT
(In reply to comment #5)
> 
> (XEN) traps.c:1838:d0 Attempt to change unmodifiable CR0 flags.
> (XEN) traps.c:413:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
> 
> That suggests the linux Dom0 guest is trying to illegally modify CR0 and
> crashing.
> 
> Until recently it used to get to "Xen is relinquishing VGA console" before
> crashing so something has changed here, but since we know it's going to
> crash at *some* point, it's not worth debugging further.

Oh, and incidentally: it's not a recent HV change, it's a recent kernel change. I tried an older HV and Dom0 crashes at the same point.
Comment 7 John Summerfield 2008-08-13 21:45:46 EDT
(In reply to comment #5)
> (In reply to comment #3)
> > I am reopening this as I don't see that it's related to 437930.
> > 
> > This is xen, not the Linux kernel. It's not getting to the kernel.
> 
> What makes you think that?
> 
> (XEN) traps.c:1838:d0 Attempt to change unmodifiable CR0 flags.
> (XEN) traps.c:413:d0 Unhandled general protection fault fault/trap [#13] on VCPU 0 [ec=0000]
> 
> That suggests the linux Dom0 guest is trying to illegally modify CR0 and
> crashing.
> 
> Until recently it used to get to "Xen is relinquishing VGA console" before
> crashing so something has changed here, but since we know it's going to
> crash at *some* point, it's not worth debugging further.
> 
> > This is F9 no more. 437930 is supposed to be fixed RSN and before F10.
> 
> No progress has been made upstream, so we're not going to have Dom0 support in
> F10. See e.g.:
> 
>   https://www.redhat.com/archives/fedora-xen/2008-July/msg00048.html
> 
> *** This bug has been marked as a duplicate of 437930 ***


My interpretation is that Xen itself is trying to modify CR0. However, I don't know the code (and don't really want to). If fixing the other bug does not fix this one, then _this_ report is likely to be overlooked.

Certainly, there's not much information in the other duplicates to say they are the same as this.
Comment 8 linux.ninja1 2008-11-10 15:38:50 EST
Same behaviour in RHEL 5.3 beta.

The RHEL 5.3 non-Xen kernel now works on the dc7700 (that bug had been open since Dec 2006), but
the RHEL 5.3 Xen option still crashes the system.

Badly designed HP Enterprise desktop hardware that gets you every time.

see:
https://bugzilla.redhat.com/show_bug.cgi?id=218884
Comment 9 Bug Zapper 2008-11-25 21:45:08 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 10 Bug Zapper 2009-11-18 02:53:53 EST
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 11 Bug Zapper 2009-12-18 01:18:02 EST
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.