Bug 316371
Summary: | 32-bit PAE HV hardware limitation > 4GB memory | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Bhavna Sarathy <bnagendr> |
Component: | kernel-xen | Assignee: | Bhavna Sarathy <bnagendr> |
Status: | CLOSED ERRATA | QA Contact: | Martin Jenner <mjenner> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 5.1 | CC: | bburns, bstein, ddomingo, frank.arnold, poelstra, rdoty, thomas.woller, xen-maint |
Target Milestone: | --- | Keywords: | OtherQA |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHBA-2008-0314 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2008-05-21 14:57:00 UTC | Type: | --- |
Bug Blocks: | 222082, 247190, 253746, 391221 | ||
Attachments: |
Description
Bhavna Sarathy
2007-10-03 02:31:31 UTC
Release notes for RHEL5.1:

RHEL 5.1 supports Rapid Virtualization Indexing (RVI) in 64-bit, 32-bit and 32-bit PAE kernels. There is a hardware limitation for the 32-bit PAE hypervisor wherein RVI can only translate 32-bit guest virtual addresses. If a guest is running a PAE kernel with >3840 MB memory, a wrong address translation will result, which can crash the guest. Users are advised to use the 64-bit kernel if they want to run guests with more than 4GB physical memory under RVI.

End of release notes

The patch that works in xen-unstable and 3.1.1 is having trouble working in the RHEL5.1 Xen code base.

Let's use Nested paging in the release notes as RVI is marketing jargon and technical folks may not have heard of it.

Brian, does 5.1 contain the release notes? This is still being debugged and can be moved to R5.2. But the release note is for 5.1.

added to RHEL5.1 release notes updates:

<quote>
Rapid Virtualization Indexing (RVI) is now supported on 64-bit, 32-bit, and 32-bit PAE kernels. However, RVI can only translate 32-bit guest virtual addresses on the 32-bit PAE hypervisor. As such, if a guest is running a PAE kernel with more than 3840MB of RAM, a wrong address translation error will occur. This can crash the guest. It is recommended that you use the 64-bit kernel if you intend to run guests with more than 4GB of physical RAM under RVI.
</quote>

please advise if any further revisions are required. thanks!

We definitely need to get the fix for this issue from Keir / Xen 3.2.0 into the 5.2 tree. Without this fix we can't enable NPT by default. If someone has the patch please attach it to this ticket.

Created attachment 237931 [details]
patch that overcomes the 4GB guest PAE limitation
This patch works on HV versions 3.0.4 and above. This patch doesn't work very
well with the 3.1 RHEL5.1 code base, not sure just yet if the 3.1.1 upgrade
made a difference. Comments regarding the patch are very welcome.
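
One way to keep a PAE guest from ever seeing memory above 4GB is to trim the e820-style memory map it is handed at boot. The following is only a minimal sketch of that idea, not the attached patch: `struct e820_entry`, `E820_RAM`, and `e820_clip_below_4g` are hypothetical names chosen for illustration, and the real Xen structures may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM 1u              /* usable RAM entry type (illustrative) */
#define FOUR_GB  (1ULL << 32)    /* first byte the guest must not see */

/* Hypothetical map entry; the real Xen/hvmloader layout may differ. */
struct e820_entry {
    uint64_t addr;   /* start of the region (guest physical address) */
    uint64_t size;   /* length of the region in bytes */
    uint32_t type;   /* E820_RAM, reserved, ... */
};

/*
 * Remove every entry that starts at or above 4GB and clip any entry that
 * straddles the boundary. Entries are compacted in place.
 * Returns the number of entries that remain.
 */
size_t e820_clip_below_4g(struct e820_entry *map, size_t nr)
{
    size_t out = 0;

    assert(map != NULL || nr == 0);

    for (size_t i = 0; i < nr; i++) {
        struct e820_entry e = map[i];

        if (e.addr >= FOUR_GB)
            continue;                      /* entirely above 4GB: drop */
        if (e.size > FOUR_GB - e.addr)
            e.size = FOUR_GB - e.addr;     /* straddles 4GB: clip */
        if (e.size == 0)
            continue;                      /* nothing left to report */

        map[out++] = e;
    }
    return out;
}
```

With a map like this, a guest configured with more memory than fits below 4GB simply boots with less visible RAM, which is why a silent truncation of this kind deserves at least a log message.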
The patch basically removes e820 map entries beyond the 4GB space, so guests will not see >4GB physical space. Is this acceptable and does this work with 3.1.1?

Hm. Unfortunately this makes life pretty poor for users; either they have to choose HAP, and have all of their guests truncated to 4GB, or they have to not use HAP to get larger guests. If it's a bug in the silicon, then there is nothing we can do about it; however, it would be nice if we could make life a little bit better for users. At one point I asked Tom if it was possible to make HAP a per-guest configuration option; he seemed to think it was possible, but didn't really have a use. This problem with the silicon seems to argue for that use; if you make HAP a per-guest option, then you can enable HAP by default, and then let the users choose on a guest-by-guest basis whether they want > 4GB or HAP in that guest. Also, we shouldn't silently truncate the guest map; at the very least we should have some sort of printk() saying as much.

Chris Lalancette

Created attachment 239131 [details]
PAE sanitize e820 below 4GB
The attached file is the patch for fixing the >4GB issue under Xen. It is directly applicable to the RHEL 5.1 tree, Snapshot 52. The following testing has been done:

1. HAP ON
- bigsmp linux kernel with 5000MB memory: boots well; guest saw ~3800MB physical memory (/proc/meminfo)
- bigsmp linux kernel with 3500MB memory: boots well; guest saw 3500MB physical memory
- WinXP with 5000MB physical memory: boots well; Windows saw ~3800MB physical memory
- WinXP with 3000MB physical memory: boots well; Windows XP saw 3000MB physical memory

2. HAP OFF
- bigsmp linux kernel with 5000MB memory: boots well; guest saw 5000MB physical memory
- bigsmp linux kernel with 3500MB memory: boots well; guest saw 3500MB physical memory
- WinXP with 5000MB physical memory: boots well; Windows saw ~3800MB physical memory
- WinXP with 3000MB physical memory: boots well; Windows XP saw 3000MB physical memory

In summary, it works as expected. So far, removing entries in e820 is the best solution we can think of for a 3.1 HV. Another possibility is to disable the PAE bit in the guest CPUID, but this approach had issues with the bigsmp Linux guest.

(In reply to comment #8)
> Hm. Unfortunately this makes life pretty poor for users; either they have to
> choose HAP, and have all of their guests truncated to 4GB, or they have to
> not use HAP to get larger guests. If it's a bug in the silicon, then there is
> nothing we can do about it; however, it would be nice if we could make life a
> little bit better for users. At one point I asked Tom if it was possible to
> make HAP a per-guest configuration option; he seemed to think it was possible,
> but didn't really have a use. This problem with the silicon seems to argue for
> that use; if you make HAP a per-guest option, then you can enable HAP by
> default, and then let the users choose on a guest-by-guest basis whether they
> want > 4GB or HAP in that guest. Also, we shouldn't silently truncate the guest
> map; at the very least we should have some sort of printk() saying as much.
>
> Chris Lalancette

NP is not per guest. We thought of it in the very beginning and suggested it to Keir at XS, but the per-guest idea was killed by Keir. He indicated that NP would be enabled by default (which is the case right now) since it had many advantages over shadow paging. So he felt there was no need for a per-guest configuration.

Should we also take the approach that the system will automatically change the guest memory allocation down to 4GB and notify the user that this has been done?

(In reply to comment #11)
> NP is not per guest, we thought of it in the very beginning and suggested it to
> Keir at XS. But the per guest idea was killed by Keir. He indicated that NP
> would be enabled by default (which is the case right now) since it had many
> advantages over shadow paging. So he felt there was no need for a per-guest
> configuration.

Understood, but at the time, Keir probably wasn't aware of this particular limitation. I think it makes sense to:

1) Try to do per-guest HAP upstream. That is, default to whatever is specified on the HV command-line, but let guests override it. That way customers can choose per guest between HAP and more than 4GB on 32-bit.
2) If upstream won't accept per-guest HAP, then fail domain creation with > 4GB, with an error message for the user. That way we won't get support calls saying "I gave my guest 6GB, but it only came up with 4GB".

Chris Lalancette

Keir should be aware of the silicon limitation. Since customers can use the 64-bit hypervisor to run their >4GB PAE guests, it's not as big a deal. Feel free to talk to him to see if he is aware of the limitation and if he has changed his mind. I posted the email to virtuallist as you suggested; unfortunately it's showing up mangled. Are you representing the Red Hat virt team view?

I want to sum up all the discussions both internal and external.
Keir Fraser has said that NP is enabled by default in 64-bit and 32-bit HVs but hesitates to enable NP in 32-bit PAE as it reduces functionality. AMD has decided that since this is a design limitation with 32-bit PAE and customers would naturally want to use > 4GB memory with PAE, NP can be disabled by default. Customers will have the option of enabling it. We have submitted patches to xen-unstable that do not create >4GB guests, which Bill and Chris have had a chance to review already. Keir did not want to add a descriptive printk such as "Guest creation failed while using hardware assisted paging, please ensure your guest physical memory is below 4GB, or switch over to 64-bit HV". Red Hat will have to add the verbose printk and carry the patch.

Patches in xen-unstable:
http://xenbits.xensource.com/xen-unstable.hg?rev/c7d5d229f191
http://xenbits.xensource.com/xen-unstable.hg?rev/2717128cbdd1

Bill, since you are working on rebasing to the bugfix release 3.1.2, could you incorporate these patches as well?

This is a build fix that Keir put in:
http://xenbits.xensource.com/xen-unstable.hg?rev/e2d76fb12ae2
Please evaluate whether this is a xen-unstable-only build fix or whether we will need it in R5.2.

I agree with Keir that adding a hypervisor printk is not much use, as this is invisible to the end user. Is there a way for us to determine from userspace that HAP is enabled? If so, then we should add a check in XenD for HAP and > 4 GB PAE guest. This would allow XenD to return a nice error message directly to the application, which would immediately be seen by the user.

Dan,
Agreed that a user-space print is the absolute best way to go, if there is a way to do it. If not, a hypervisor printk can be picked up by "xm dmesg", which at the very least allows support to review the logs and make recommendations without bothering Engineering.

Chris Lalancette

Bill will incorporate the patches into the rebase or submit as discussed.

Created attachment 278811 [details]
Fail attempts to add pages to guest pseudophys memory map above
4GB when running with AMD NPT on PAE host
Changeset 16279 has the fix that we would want in R5.2. Nested Paging will be
enabled by default for 64-bit. If users choose to use NP as the default then
we want this patch to prevent a guest crash.
in 2.6.18-73.el5. You can download this test kernel from http://people.redhat.com/dzickus/el5

added to RHEL5.2 release notes under "Resolved Issues":

<quote>
A wrong address translation (which can lead to a crashed guest) no longer occurs if a guest is running a PAE kernel with more than 3,840MB of RAM. As such, you no longer need to use the 64-bit kernel if you intend to run guests with more than 4GB of physical RAM under Rapid Virtualization Indexing (RVI).
</quote>

please advise if any revisions are required. thanks!

one note. unstable and the 3.2.1 base now have "per domain HAP support" (although we are having issues with >4GB guests for shadow paging). we are NOT asking for a backport, :) just fyi.

Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in RHEL5.2-Snapshot1--available now on partners.redhat.com.

Please test and confirm that your issue is fixed.

After you (Red Hat Partner) have verified that this issue has been addressed, please perform the following:
1) Change the *status* of this bug to VERIFIED.
2) Add *keyword* of PartnerVerified (leaving the existing keywords unmodified)

If this issue is not fixed, please add a comment describing the most recent symptoms of the problem you are having and change the status of the bug to ASSIGNED.

If you are receiving this message in Issue Tracker, please reply with a message to Issue Tracker about your results and I will update bugzilla for you. If you need assistance accessing ftp://partners.redhat.com, please contact your Partner Manager.

Thank you

Hi,

the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the release notes (if it needs to be). each item in the release notes contains a link to its original bug; as such, you can search through the release notes by bug number.

Cheers,
Don

Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in RHEL5.2-Snapshot3--available now on partners.redhat.com.

Please test and confirm that your issue is fixed.

Greetings Red Hat Partner,

A fix for this issue should be included in the latest packages contained in RHEL5.2-Snapshot4--available now on partners.redhat.com.

Please test and confirm that your issue is fixed.

Trying to start a guest with hap enabled and giving it more than 4GB of memory fails, as expected.
xm create fails and prints...

Error: (1, 'Internal error', 'Could not allocate memory for HVM guest.\n (16 = Device or resource busy)')

xm dmesg log states...

(XEN) p2m.c:675: Dom1 failed to populate memory beyond 4GB: remove \047hap\047 Xen boot parameter.

as per Comment#29, please advise if we need to retract the "Issue Resolved" release note for this bug (quoted in Comment#22). Deadline for RHEL5.2 release notes is close of business hours today. thanks!

Please retract, no need for release notes.

thanks Bhavna. reinstating old "Known Issue" text, appearing in RHEL5.2 under Feature Updates => Virtualization => Known Issues:

<quote>
Rapid Virtualization Indexing (RVI) is supported on 64-bit, 32-bit, and 32-bit PAE kernels. However, RVI can only translate 32-bit guest virtual addresses on the 32-bit PAE hypervisor. As such, if a guest is running a PAE kernel with more than 3840MB of RAM, a wrong address translation error will occur. This can crash the guest. It is recommended that you use the 64-bit kernel if you intend to run guests with more than 4GB of physical RAM under RVI.
</quote>

please advise if any further revisions are required. thanks!

Since we have sorted out the 4GB PAE issue and fixed the crash that's mentioned above, this should not be under "Known Issues". The 4GB PAE limit is a hardware limitation and is not a bug. If you must add release notes, then this would be more appropriate.

<quote>
Rapid Virtualization Indexing (RVI) is supported on 64-bit, 32-bit, and 32-bit PAE kernels. However, RVI can only translate 32-bit guest virtual addresses on the 32-bit PAE hypervisor. As such, if a guest is running a PAE kernel with more than 3840MB of RAM, the host will print out an error message "Dom x failed to populate memory beyond 4GB: remove hap Xen boot parameter." It is recommended that you use the 64-bit kernel if you intend to run guests with more than 4GB of physical RAM under RVI.
</quote>

thanks Bhavna, that clears it up for me. as such, i'm removing this from the release notes; i intend to push this to a kbase instead. clearing all RHEL5.2 relnotes flags.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html