Bug 844789

Summary: Upgrading/installing glibc-common in a fedora16 image on a Xen 4.x HVM w/PV drivers causes the VM host machine to kernel panic (and reboot)
Product: [Fedora] Fedora
Component: xen
Version: 16
Status: CLOSED NOTABUG
Severity: urgent
Priority: unspecified
Hardware: Unspecified
OS: Linux
Reporter: corasian
Assignee: Michael Young <m.a.young>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: crobinso, jforbes, kraxel, m.a.young, virt-maint
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-02-11 16:43:28 EST

Description corasian 2012-07-31 14:49:33 EDT
Description of problem:
When running yum install gcc on a Fedora 16 VM (in a Xen 4.x environment), the installation of glibc-2.14.90-24.fc16.7.i686 causes the host machine to kernel panic (with a CPU fault).

Version-Release number of selected component (if applicable):
xen 4.x
Fedora 16 (guest OS) HVM w/ PV drivers

How reproducible:
On a freshly installed/updated Fedora 16 image on a Xen 4.x HVM (tested with the same results on 4.0, 4.1, 4.2), run yum install gcc.

Steps to Reproduce:
1. Install Fedora via PXE/CD image/direct image onto a Xen 4.x HVM
2. Run yum install gcc
3. Watch the host machine reboot

Actual results:
Xen host reboots with the following stack trace output:
(XEN) sh error: shadow_unhook_mappings(): top-level shadow has bad type 00000001
(XEN) Xen BUG at common.c:1305
(XEN) ----[ Xen-4.1.2  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    1
(XEN) RIP:    e008:[<ffff82c4801ce9f0>] shadow_unhook_mappings+0x70/0xa0
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: 0000000000000000   rcx: 0000000000000001
(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c480237244
(XEN) rbp: ffff83007f83e000   rsp: ffff83043ff97bf8   r8:  0000000000000001
(XEN) r9:  0000000000000000   r10: 00000000ffffffff   r11: ffff82c480134e30
(XEN) r12: 0000000000000008   r13: 0000000000000001   r14: 0000000032401001
(XEN) r15: ffff82f600000000   cr0: 0000000080050033   cr4: 00000000000006f0
(XEN) cr3: 000000083ea7b000   cr2: 00000000b7728000
(XEN) ds: 0000   es: 0000   fs: 00d8   gs: 00e0   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff83043ff97bf8:
(XEN)    0000000032401001 ffff82c4801e22bb ffff83043ff97c34 ffff83043ff97c3c
(XEN)    ffff83043ff97c38 0000000100000010 ffff83043ff97f18 000000073ff97cd8
(XEN)    00000001f65b5e88 0000000000000000 ffff83083ea58000 0000000000000009
(XEN)    00000000f65b5e88 ffff83007f83e000 0000000000000000 ffff82c4801a6a21
(XEN)    ffff83043ff97cb4 000000060000000c ffff83043ff97d48 ffff83007f83e000
(XEN)    0000000000443c02 ffff83043ff97f18 0000001100000004 00000007f65b5aa8
(XEN)    000000013ea58000 0000000000000000 00000000c0402445 ffff83043ff97f18
(XEN)    f65b5eb4f64a7ff0 0000000032401000 0000000000000000 ffff82c4801b387f
(XEN)    80000000362001e3 ffff83083ea58000 0000000000000081 ffff82c4801b39c2
(XEN)    0000000000000004 ffff83043ff97f18 0000000000000004 ffff83007f83e000
(XEN)    0000000000000022 ffff83007f83e000 0000000000000000 ffff82c4801a2bc8
(XEN)    ffffffff0c930068 0000000000000000 0000000000000081 0000000000000003
(XEN)    0000000000000081 ffff83043ff97f18 ffff83086bde0000 ffff82c4801b61b6
(XEN)    ffff82c4802b44c0 ffff82c4802b44c0 0000000000000001 0000000000000000
(XEN)    0000000000000002 0000000000000000 000000000000000f 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000d6aceade3726 ffff83083ea45f80
(XEN)    ffff83043ff97f18 ffff83083ea45f40 ffff83043fff2c90 ffff82c4802b36c0
(XEN)    0000000000000001 ffff82c480117248 ffff82c48012207d ffff83043ffb5248
(XEN)    0000000000000001 0000000000000296 0000d6acebe88835 ffff83043ff9e100
(XEN)    0000d6aceb4feb1d 0000000000000001 ffff82c480117070 ffff82c4801b2ff3
(XEN) Xen call trace:
(XEN)    [<ffff82c4801ce9f0>] shadow_unhook_mappings+0x70/0xa0
(XEN)    [<ffff82c4801e22bb>] sh_pagetable_dying+0x2bb/0x400
(XEN)    [<ffff82c4801a6a21>] do_hvm_op+0x451/0x1c60
(XEN)    [<ffff82c4801b387f>] fetch+0x2f/0xa0
(XEN)    [<ffff82c4801b39c2>] __get_instruction_length_from_list+0xd2/0x300
(XEN)    [<ffff82c4801a2bc8>] hvm_do_hypercall+0xe8/0x1f0
(XEN)    [<ffff82c4801b61b6>] svm_vmexit_handler+0x196/0x1740
(XEN)    [<ffff82c480117248>] csched_tick+0x1d8/0x280
(XEN)    [<ffff82c48012207d>] add_entry+0x4d/0xc0
(XEN)    [<ffff82c480117070>] csched_tick+0x0/0x280
(XEN)    [<ffff82c4801b2ff3>] pt_update_irq+0x33/0x1e0
(XEN)    [<ffff82c480122014>] execute_timer+0x54/0x70
(XEN)    [<ffff82c4801afa62>] vlapic_has_pending_irq+0x42/0x70
(XEN)    [<ffff82c4801aad06>] hvm_vcpu_has_pending_irq+0x76/0xd0
(XEN)    [<ffff82c4801b3ea1>] svm_intr_assist+0x41/0x140
(XEN)    [<ffff82c4801b3d1a>] svm_stgi_label+0x8/0x2e
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) Xen BUG at common.c:1305
(XEN) ****************************************
(XEN) Reboot in five seconds...
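
For triage, the call trace above can be reduced to a list of (symbol, offset, address) frames. The following is only a minimal Python sketch, assuming the console output has been saved as text in the format shown above; the names FRAME_RE, parse_call_trace, and SAMPLE are hypothetical helpers introduced for illustration and are not part of Xen or any Xen tooling.

```python
import re

# Matches frame lines such as:
# "(XEN)    [<ffff82c4801ce9f0>] shadow_unhook_mappings+0x70/0xa0"
FRAME_RE = re.compile(
    r"^\(XEN\)\s+\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>\S+?)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)\s*$"
)


def parse_call_trace(log_text):
    """Return (symbol, offset, address) tuples from the first Xen call trace."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if line.startswith("(XEN) Xen call trace:"):
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.match(line)
        if match is None:
            # The first non-frame line (e.g. the row of asterisks) ends the trace.
            break
        frames.append(
            (match["symbol"], int(match["offset"], 16), int(match["addr"], 16))
        )
    return frames


# Excerpt copied from the trace in this report (first three frames only).
SAMPLE = """\
(XEN) Xen BUG at common.c:1305
(XEN) Xen call trace:
(XEN)    [<ffff82c4801ce9f0>] shadow_unhook_mappings+0x70/0xa0
(XEN)    [<ffff82c4801e22bb>] sh_pagetable_dying+0x2bb/0x400
(XEN)    [<ffff82c4801a6a21>] do_hvm_op+0x451/0x1c60
(XEN) ****************************************
"""

if __name__ == "__main__":
    for symbol, offset, addr in parse_call_trace(SAMPLE):
        print(f"{symbol}+{offset:#x} at {addr:#x}")
```

The first frame is the function where the BUG fired (shadow_unhook_mappings in this report). Frames further down a Xen stack dump are read from raw stack contents and may include stale entries, so they are best treated as hints rather than an exact call chain.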

Expected results:
gcc (and dependencies) installed successfully, and previous version duplicates are effectively removed.

Additional info:
I posted a bug in the Yum queue for what appear to be the yum-related components of this bug.

Comment 1 Michael Young 2012-07-31 15:31:19 EDT
If you are crashing Xen, then that is a fault in the base dom0 system (most likely in your Xen build or the dom0 kernel), not the guest, as nothing the guest can do should crash the hosting system. However, you don't say anything about the base system, so it isn't clear this is a Fedora problem at all.
Comment 2 corasian 2012-07-31 16:09:41 EDT
Roger, the dom0 system is CentOS 5.6. However, the reason I think it may be a Fedora bug is that this behavior is not replicated when performing the same task in HVMs running CentOS or Scientific Linux. It is also replicated across different builds of Xen, from 4.0 to 4.2. Granted, I will try to test it on a more up-to-date OS install, maybe even a Fedora dom0, to see whether it is replicated there. I have also filed a bug with Xen on this issue, just to cover the bases. That being said, you may have a point: the Fedora install in my setup is the only one running a 3.0+ kernel, as neither CentOS nor Scientific Linux is there yet. So this may be a kernel-level issue rather than something specific to the distro. I'll test it out. I just find it odd that it happens only when installing/updating the package, and only with Fedora, but once everything comes back up afterward, it works just as smoothly as ever (after cleaning up the rest of the transaction).
Comment 3 Michael Young 2012-07-31 18:24:53 EDT
(In reply to comment #2)
> Roger, the dom0 system is centOS 5.6. However, the reason I think it may be
> a Fedora bug is that this behavior is not replicated when performing the
> same task in HVM's running CentOS or Scientific Linux.

That doesn't make it a Fedora bug. You are just finding a Xen or dom0 bug that is triggered by a Fedora guest but not by RHEL clone guests, which might be due to the later kernel or something else in Fedora. I agree you should try to reproduce the bug with a later dom0 kernel, as I doubt there would be much interest in a bug that only occurs with an old RHEL dom0 kernel. It might also be worth trying CentOS 5.8 in case it is a bug that has been found and fixed (there are some svm/HVM fixes listed in the changelog, though they may be for the Xen hypervisor shipped with their kernel).
Comment 4 Fedora End Of Life 2013-01-16 10:09:25 EST
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
Comment 5 Cole Robinson 2013-02-11 16:43:28 EST
Closing NOTABUG per comment #3