Bug 467698
| Summary: | xen: 32 bit guest on 64 bit host oops in xen_set_pud() |
|---|---|
| Product: | Red Hat Enterprise Linux 5 |
| Component: | kernel-xen |
| Version: | 5.4 |
| Hardware: | All |
| OS: | Linux |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | medium |
| Target Milestone: | rc |
| Reporter: | Chris Lalancette <clalance> |
| Assignee: | Chris Lalancette <clalance> |
| QA Contact: | Martin Jenner <mjenner> |
| CC: | awood, berrange, dzickus, jeremy, markmc, orion, pasik, xen-maint |
| Doc Type: | Bug Fix |
| Last Closed: | 2009-09-02 08:40:54 UTC |
| Bug Depends On: | 457879 |
| Bug Blocks: | 718066 |

Description
Chris Lalancette
2008-10-20 11:02:22 UTC
Additional notes: this is a problem in the RHEL-5 hypervisor as well, when trying to install a F10 i386 PV guest on an x86_64 RHEL-5 HV. As Jeremy pointed out, the upstream xen-unstable c/s was 17061, and the upstream xen-3.1-testing.hg c/s was 15653. I'll attach a backport to the BZ, which seems to fix the problem for me.

Chris Lalancette

Created attachment 320860
Backport of upstream xen-3.1-testing c/s 15653, to fix F-10 32-on-64 crash

Is there a public version of a working xen for 5.2? The ones here: http://fedorapeople.org/~crobinso/rhel5/install_f10/ don't work for me.

Hm, I'm not sure if you posted in the right bug, but those packages you mentioned are the preview packages for 5.3. So if they don't work, please let us know why.

Chris Lalancette

Here are the messages. I post here because of xen_set_pud. Happy to open new bug if needed.

Checking if this processor honours the WP bit even in supervisor mode...Ok.
1 multicall(s) failed: cpu 0
Pid: 0, comm: swapper Not tainted 2.6.27.4-58.fc10.i686.PAE #1
[<c06d1213>] ? printk+0xf/0x14
[<c04049d7>] xen_mc_flush+0xbb/0x187
[<c0405332>] xen_mc_issue+0x14/0x48
[<c04058d2>] xen_set_pud_hyper+0x39/0x41
[<c040590e>] xen_set_pud+0x34/0x39
[<c041f9a0>] zap_low_mappings+0x2f/0x47
[<c08609a6>] mem_init+0x2c7/0x2cf
[<c084b7e9>] start_kernel+0x246/0x2f0
[<c084b091>] i386_start_kernel+0x80/0x88
[<c08511e2>] xen_start_kernel+0x7dd/0x7e5
=======================
call 1/1: op=1 arg=[c2b96854] result=-22
------------[ cut here ]------------
kernel BUG at arch/x86/xen/multicalls.c:104!
invalid opcode: 0000 [#1] SMP
Modules linked in:
Pid: 0, comm: swapper Not tainted (2.6.27.4-58.fc10.i686.PAE #1)
EIP: e019:[<c0404a97>] EFLAGS: 00010002 CPU: 0
EIP is at xen_mc_flush+0x17b/0x187
EAX: c2b96054 EBX: 00000000 ECX: ffffffff EDX: c2b96054
ESI: 00000001 EDI: 00000001 EBP: c0846ef4 ESP: c0846ee0
DS: e021 ES: e021 FS: 00d8 GS: 0000 SS: e021
Process swapper (pid: 0, ti=c0846000 task=c0808344 task.ti=c0846000)
Stack: c2b96054 00000000 00000001 7373d001 00000000 c0846f00 c0405332 c0833000
c0846f24 c04058d2 737b4000 00000000 7373d001 00000000 c0833000 7373d001
00000000 c0846f38 c040590e c0833000 c0834000 00000000 c0846f50 c041f9a0
Call Trace:
[<c0405332>] ? xen_mc_issue+0x14/0x48
[<c04058d2>] ? xen_set_pud_hyper+0x39/0x41
[<c040590e>] ? xen_set_pud+0x34/0x39
[<c041f9a0>] ? zap_low_mappings+0x2f/0x47
[<c08609a6>] ? mem_init+0x2c7/0x2cf
[<c084b7e9>] ? start_kernel+0x246/0x2f0
[<c084b091>] ? i386_start_kernel+0x80/0x88
[<c08511e2>] ? xen_start_kernel+0x7dd/0x7e5
=======================
Code: 8b 55 ec 8b 84 da 04 0a 00 00 ff 94 da 00 0a 00 00 43 8b 45 ec 3b 98 08 0b 00 00 72 e3 85 ff c7 80 08 0b 00 00 00 00 00 00 74 04 <0f> 0b eb fe 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 57 89 d7 56 89 c6
EIP: [<c0404a97>] xen_mc_flush+0x17b/0x187 SS:ESP e021:c0846ee0
---[ end trace 4eaa2a86a8e2da22 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
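
The key line in the trace above is `call 1/1: op=1 arg=[c2b96854] result=-22`: the single queued multicall was rejected by the hypervisor. A minimal sketch of decoding such a line follows. It assumes that hypercall number 1 is `mmu_update` (as listed in Xen's public `xen.h`) and that negative results follow the usual Linux errno numbering, so -22 reads as EINVAL. The function name `decode_multicall_line` and the `HYPERCALL_NAMES` table are hypothetical helpers written for this note; they are not part of the kernel or of any Xen tool.

```python
import errno
import re

# Assumption: hypercall numbers as defined in Xen's public xen.h.
# Only the number that appears in this bug's log is listed here.
HYPERCALL_NAMES = {1: "mmu_update"}

# Matches lines such as: call 1/1: op=1 arg=[c2b96854] result=-22
LINE_RE = re.compile(
    r"call (\d+)/(\d+): op=(\d+) arg=\[([0-9a-f]+)\] result=(-?\d+)"
)


def decode_multicall_line(line):
    """Return a dict describing a failed multicall line, or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    index, total, op, arg, result = match.groups()
    result = int(result)
    return {
        "index": int(index),
        "total": int(total),
        # Fall back to the raw number for hypercalls not in the table.
        "op": HYPERCALL_NAMES.get(int(op), "unknown(%s)" % op),
        "arg": int(arg, 16),
        # Negative results are treated as negated errno values.
        "error": errno.errorcode.get(-result, "unknown") if result < 0 else None,
    }


print(decode_multicall_line("call 1/1: op=1 arg=[c2b96854] result=-22"))
```

Read this way, the guest's page table update made via `xen_set_pud_hyper` was refused as invalid by the hypervisor, which is consistent with the fix being a hypervisor-side backport rather than a guest kernel change.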
Oh, I see. Well, there are two problems:
1. Those packages are only the userspace portion, while this ends up being a hypervisor bug. The hypervisor is packaged into the kernel, so you would need updated kernel-xen packages.
2. Regardless, this patch isn't in the latest kernel-xen packages. It still needs to go through internal review and testing first.

Thanks for the testing, though.

Chris Lalancette

*** Bug 471276 has been marked as a duplicate of this bug. ***

I've uploaded a test kernel that contains this fix (along with several others) to this location: http://people.redhat.com/clalance/virttest

Could the original reporter try out the test kernels there, and report back if it fixes the problem?

Thanks,
Chris Lalancette

# rpm -ivh kernel-xen-2.6.18-128.el5virttest3.x86_64.rpm
error: Failed dependencies:
ecryptfs-utils < 44 conflicts with kernel-xen-2.6.18-128.el5virttest3.x86_64
# rpm -q ecryptfs-utils
ecryptfs-utils-41-1.el5
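
The rpm output above fails because the test kernel declares a conflict with any `ecryptfs-utils` older than version 44, and the installed package is version 41. A minimal sketch of that check follows. It assumes plain integer version numbers, which is enough for this case but is not RPM's full version comparison algorithm. The helper names `parse_conflict`, `installed_version`, and `triggers_conflict` are hypothetical and exist only for illustration.

```python
import re

# Matches rpm's message form: "<name> < <version> conflicts with <package>"
CONFLICT_RE = re.compile(r"^\s*(\S+) < (\d+) conflicts with (\S+)$")


def parse_conflict(line):
    """Return (name, version_limit, conflicting_package) or None."""
    match = CONFLICT_RE.match(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), match.group(3)


def installed_version(nvr, name):
    """Extract the version from a name-version-release string.

    Assumption: the version is purely numeric, as in ecryptfs-utils-41-1.el5.
    Returns None if the string is for another package or is not numeric.
    """
    prefix = name + "-"
    if not nvr.startswith(prefix):
        return None
    version = nvr[len(prefix):].split("-", 1)[0]
    return int(version) if version.isdigit() else None


def triggers_conflict(conflict_line, installed_nvr):
    """True if the installed package is older than the declared limit."""
    parsed = parse_conflict(conflict_line)
    if parsed is None:
        return False
    name, limit, _ = parsed
    version = installed_version(installed_nvr, name)
    return version is not None and version < limit


print(triggers_conflict(
    "ecryptfs-utils < 44 conflicts with kernel-xen-2.6.18-128.el5virttest3.x86_64",
    "ecryptfs-utils-41-1.el5",
))
```

The conflict goes away either when the old package is removed or when a version of at least 44 is installed.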
Sigh. Can you temporarily just remove ecryptfs-utils (assuming you aren't using encrypted partitions)? The newer ecryptfs-utils will be shipped as part of 5.3, but hasn't been released yet.

Chris Lalancette

Okay, removed ecryptfs-utils, didn't quite realize it was optional. Looking good for me, I'm able to start a 32-bit fedora rawhide install, which wasn't even able to boot before. Also able to install 32-bit fedora 10 guest.

Yeah, ecryptfs-utils is optional unless you are using encrypted partitions, in which case it is mandatory. But I guess you are not doing that :). In any case, that is great news; it also seemed to fix the problem in my testing. I'll get this ready to go into the next RHEL release.

Thanks for the testing,
Chris Lalancette

I am starting to see the following:

xen_net: Memory squeeze in netback driver.

and networking stops working in the guests. This may not be related to this new kernel, just that I am overloading the machine now (I am adding new guests), but thought I'd mention here before filing a new issue if necessary.

OK, yeah. There's another open bug about this (BZ 454285); one of the patches in this kernel seems to be exacerbating the problem, though, since I also saw it on one of my loaded machines. It needs to be debugged further.

Chris Lalancette

in kernel-2.6.18-140.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.

I was seeing this bug on CentOS 5.3 x86_64 dom0; I could not start i386 F10 or F11 installation using virt-install. The graphical VNC console would never show up. When running "xm console <dom>" I saw a domU kernel crash. After upgrading the x86_64 dom0 kernel+xen to -159.el5 the problem is fixed. I can now successfully install i386 Fedora 10 and Fedora 11 guests/domUs.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html