Bug 234375
Summary: | PV guests can crash at boot w/ >4GB memory | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Stephen Tweedie <sct> |
Component: | kernel-xen | Assignee: | Chris Lalancette <clalance> |
Status: | CLOSED ERRATA | QA Contact: | Martin Jenner <mjenner> |
Severity: | medium | Docs Contact: | |
Priority: | high | | |
Version: | 5.0 | CC: | clalance, ijc, xen-maint |
Target Milestone: | --- | Keywords: | OtherQA, Regression |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | RHBA-2007-0959 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2007-11-07 19:45:39 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |

Description
Stephen Tweedie
2007-03-28 19:03:25 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

It turns out that there has been a hypervisor side workaround for this issue for a while now but it was broken in xen-unstable.hg between 13392:0fd65225e4c6 (17 Jan 2007) and 15433:a5360bf18668 (28 June 2007).

The workaround is in xen/arch/x86/mm.c with the comment (line 3297 in current xen-unstable):

    /*
     * If this is an upper-half write to a PAE PTE then we assume that
     * the guest has simply got the two writes the wrong way round. We
     * zap the PRESENT bit on the assumption that the bottom half will
     * be written immediately after we return to the guest.
     */

I suspect that the RHEL5 hypervisor has the workaround but doesn't have the breakage, in which case this can be closed.

change QA contact

Crap. I finally was able to reproduce this, just not in the way specified originally. I have a 2 CPU Intel SDV here, running the RHEL 5.1 dom0 bits, i386. Then I have 1 RHEL-5.0 i386 PV guest running an FTP test that is saturating the networking. Finally, I have a 2nd RHEL-5.0 i386 PV guest just rebooting in a loop (init 6 in /etc/rc.local). Very often, that 2nd guest will fail to boot, with this in xm dmesg:

    (XEN) mm.c:3267:d6 ptwr_emulate: could not get_page_from_l1e()

After applying c/s 15433 to the HV from xen-unstable, however, I now see this:

    (XEN) mm.c:3263:d14 ptwr_emulate: fixing up invalid PAE PTE 0000000149f12025

and the domain successfully reboots. I believe we are going to need that c/s for our HV, to support older 5.0 guests.

Chris Lalancette

in 2.6.18-37.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Fujitsu tested with 5.1 beta, and it worked fine.
----------------------------------
We tested this issue with kernel-xen-2.6.18-37.el5, the result is OK. We could boot 4 domains.

This event sent from IssueTracker by mmatsuya
issue 128803

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0959.html
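
For readers who want to see the idea behind the mm.c comment quoted earlier in this report, here is a toy Python model of it. It is only a sketch under stated assumptions, not the Xen code: `emulate_pte_write` and the `frame_is_valid` callback are hypothetical names, the old entry value and the set of "valid" frames are made up, and the target value is borrowed from the log line above purely as sample data. The model treats a PAE PTE as a 64-bit integer written in two 32-bit halves. If an upper-half write produces a present entry that fails validation, it clears the PRESENT bit instead of failing, on the assumption that the lower half arrives next.

```python
PAGE_PRESENT = 0x1        # bit 0 of an x86 page-table entry
MASK32 = 0xFFFFFFFF


def emulate_pte_write(old_pte, offset, value, frame_is_valid):
    """Model a 4-byte guest write into an 8-byte PAE PTE.

    offset is 0 for the low half and 4 for the high half.
    frame_is_valid is a hypothetical stand-in for the hypervisor's
    check that the guest may map the frame named by the entry.
    Returns (new_pte, fixed_up).
    """
    if offset not in (0, 4):
        raise ValueError("offset must be 0 (low half) or 4 (high half)")
    value &= MASK32
    if offset == 0:
        new_pte = (old_pte & ~MASK32) | value
    else:
        new_pte = (old_pte & MASK32) | (value << 32)

    # Non-present entries need no validation; valid ones pass through.
    if not (new_pte & PAGE_PRESENT) or frame_is_valid(new_pte):
        return new_pte, False

    if offset == 4:
        # Upper half written first: assume the guest got the order wrong
        # and zap PRESENT until the low half is written.
        return new_pte & ~PAGE_PRESENT, True

    raise ValueError("could not validate page-table entry")


if __name__ == "__main__":
    # Made-up validity check: only these two frames belong to the guest.
    valid = lambda pte: (pte >> 12) in {0x12345, 0x149F12}

    pte = 0x0000000012345067                      # made-up old entry
    pte, fixed = emulate_pte_write(pte, 4, 0x1, valid)
    print(hex(pte), fixed)                        # high half first
    pte, fixed = emulate_pte_write(pte, 0, 0x49F12025, valid)
    print(hex(pte), fixed)                        # low half completes it
```

In this model the intermediate entry mixes the new high half with the old low half, so it names a frame the guest does not own. Without the fix-up branch the first write would be rejected, which loosely corresponds to the failing case in the first log line.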