Bug 428612

Summary: RHEL 5.1 regression in hugepages due to pagetable sharing patch
Product: Red Hat Enterprise Linux 5 Reporter: John Sobecki <john.sobecki>
Component: kernel Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED ERRATA QA Contact: Martin Jenner <mjenner>
Severity: high Docs Contact:
Priority: urgent    
Version: 5.1 CC: ddomingo, dzickus, jwest, lwang, lwoodman, sfolkwil
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: GSSApproved ResovleBy=02/28/2008
Fixed In Version: RHBA-2008-0314 Doc Type: Bug Fix
Last Closed: 2008-05-21 15:06:18 UTC Type: ---
Bug Depends On:    
Bug Blocks: 391221, 431522    
Attachments:
Description Flags
hugetlb page leak patch none

Description John Sobecki 2008-01-14 06:45:35 UTC
Description of problem:

We (Oracle) have had several customers upgrade to RHEL5.1 or Oracle EL 5.1
and report that hugepages on x86_64 are not properly freed after the database
SGA/instance is shut down.


Version-Release number of selected component (if applicable):
2.6.18-53.el5

How reproducible:  
100%, every time, but you need an SGA > 4 GB.

Steps to Reproduce:
1.  Create an SGA > 4 GB and have hugepages allocated for it
2.  Shut down the DB; not all of the allocated hugepages are freed
3.  Despite ipcs -a showing clean and ps -ef showing no orcl processes,
      some of the SGA pages remain allocated
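
One way to observe the leak described in the steps above is to compare the hugepage counters in /proc/meminfo before DB startup and after DB shutdown. The following is a minimal sketch, not part of the original report; the function names are hypothetical, and it assumes the standard HugePages_Total, HugePages_Free and Hugepagesize fields are present.

```python
import os
import re


def parse_hugepage_counters(meminfo_text):
    """Extract HugePages_* counters and Hugepagesize (kB) from /proc/meminfo text."""
    counters = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"^(HugePages_\w+|Hugepagesize):\s+(\d+)", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def pages_in_use(counters):
    """Hugepages currently not free, i.e. total minus free."""
    return counters["HugePages_Total"] - counters["HugePages_Free"]


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only
    if os.path.exists(path):
        with open(path) as f:
            counters = parse_hugepage_counters(f.read())
        if "HugePages_Total" in counters and "HugePages_Free" in counters:
            print("hugepages in use:", pages_in_use(counters))
        else:
            print("no hugepage counters found in", path)
```

Run once before starting the instance and once after shutdown, with no other hugepage users on the machine. If the second value is higher than the first while ipcs -a shows no segments, that would be consistent with the behaviour reported here.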
  
Actual results:
As above.

Expected results:
All allocated hugepages should be released upon DB shutdown.

Additional info:

Oracle has built a kernel without this patch:
linux-2.6-mm-shared-page-table-for-hugetlb-page.patch

This was added per changelog entry:
- [mm] shared page table for hugetlb page (Larry Woodman) [222753]

Without this patch, hugepages are properly released on DB shutdown.

Comment 1 Jeremy West 2008-01-14 14:45:12 UTC
What is the implication of this issue?  Are there any workarounds to this?

Thanks
Jeremy West

Comment 2 John Sobecki 2008-01-14 15:25:54 UTC
Implications:

Example from a customer: the SGA is 90G out of 128G.  On first DB startup, the
SGA is allocated from hugepages and works as expected.  The DB is then shut
down, but not all hugepages are released, so subsequent startups are forced to
try to allocate regular pages, and swapping results.  The system is
non-functional.
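
To put the customer example in terms of page counts, here is a rough sketch of the arithmetic. It is not from the original comment and assumes a 2 MiB hugepage size (the usual x86_64 value, reported as Hugepagesize in /proc/meminfo) and that "90G" means 90 GiB; the function name is hypothetical.

```python
def hugepages_needed(sga_bytes, hugepage_bytes=2 * 1024 * 1024):
    """Number of hugepages required to back an SGA, rounded up to whole pages."""
    return -(-sga_bytes // hugepage_bytes)  # ceiling division


sga = 90 * 1024 ** 3  # assumed: 90 GiB SGA
print(hugepages_needed(sga))
```

Under these assumptions the SGA needs 46080 hugepages. If leaked pages leave fewer than that many free after a shutdown, the next startup cannot place the whole SGA in hugepages, which matches the fallback to regular pages described above.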

Workaround:

Only workaround to release the hugepages at this time is a reboot.


Comment 4 Larry Woodman 2008-01-16 15:07:19 UTC
Can someone try this kernel ASAP:

>>>http://people.redhat.com/~lwoodman/RHEL5/


Larry


Comment 5 John Sobecki 2008-01-16 17:08:36 UTC
Hi Larry,

Machine won't boot with this kernel if I have vm.nr_hugepages set, or if I try
to set it manually after booting.  Did you want me to set up kexec and capture a
dump?  Thanks, John

Comment 6 Larry Woodman 2008-01-17 12:02:38 UTC
OK, I fixed that. Can you re-test?

>>>http://people.redhat.com/~lwoodman/RHEL5/

Comment 7 John Sobecki 2008-01-17 18:09:01 UTC
Hi,

I built an Oracle Enterprise Linux kernel using Adam Litke's patch (attached
and backported to OEL 5.1) and confirmed Adam's fix.

Thanks, John

Comment 8 John Sobecki 2008-01-17 18:09:59 UTC
Created attachment 292038
hugetlb page leak patch

Comment 13 Don Zickus 2008-01-22 18:52:15 UTC
in 2.6.18-72.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 15 Mike Gahagan 2008-01-30 20:06:21 UTC
verified with the -72, -75 kernels while running other hugepage verification tests. 

Comment 20 Don Domingo 2008-02-06 02:46:27 UTC
adding to RHEL5.2 release notes under "Kernel-Related Updates":

<quote>
When shutting down a database, all allocated hugepages are now released upon
shutdown.
</quote>

please advise if any further revisions are required. thanks!

Comment 21 Don Domingo 2008-04-02 02:17:26 UTC
Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don

Comment 23 errata-xmlrpc 2008-05-21 15:06:18 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html