Red Hat Bugzilla – Bug 248954
Oracle ASM DBWR process goes into 100% CPU spin when using hugepages on ia64
Last modified: 2013-08-05 21:43:38 EDT
Description of problem:
Using the stack ia64/oracle ASMLib/hugepages/Database 10.2, the ASM instance
will hang on startup and the ASM DBWR process will go into a 100% CPU
spin. Sysrq-P samples show the following stack:
Version-Release number of selected component (if applicable):
2.6.9-55.0.2.EL on ia64 only.
How reproducible:
100% in house and 100% at customer.
Steps to Reproduce:
1. Install oracleasm RPMs
2. Install RDBMS 10.2
3. Allocate hugepages for ASM and database SGAs
4. Attempt to startup ASM instance using ASMlib discovery string of
5. ASM instance hangs on startup, DBWR process spinning using 100% CPU
Same as above.
Removing hugepages, or setting the max locked memory ulimit to zero so that no
hugepages can be allocated for the ASM instance, is an effective workaround.
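To confirm which half of the workaround is in effect on a given host, the hugepage pool and the locked-memory limit can be inspected from user space. The following is a minimal sketch, not part of the original report: the helper names (parse_hugepages, memlock_soft_limit, workaround_active) are hypothetical, and the rule that a zero locked-memory limit prevents hugepage allocation is taken from the workaround description above rather than verified independently.

```python
import resource


def parse_hugepages(meminfo_text):
    """Extract the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("HugePages_"):
            counters[key] = int(value.split()[0])
    return counters


def memlock_soft_limit():
    """Return the soft RLIMIT_MEMLOCK for this process, in bytes."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft


def workaround_active(hugepages_total, memlock_limit):
    """True if either half of the workaround is in place (assumed rule)."""
    return hugepages_total == 0 or memlock_limit == 0


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not a Linux host, or /proc is unavailable
    total = parse_hugepages(text).get("HugePages_Total", 0)
    limit = memlock_soft_limit()
    print("HugePages_Total:", total)
    print("RLIMIT_MEMLOCK (soft, bytes):", limit)
    print("workaround active:", workaround_active(total, limit))
```

The check must run as the user that starts the ASM instance, since the locked-memory limit is per process and inherited from the login shell.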
Created attachment 159608 [details]
tested patch that resolves the hugepages spin
Is it reproducible with upstream ?
Is the patch in comment#1 upstream?
1) Haven't tested on EL5 yet
2) The patch follows a discussion with Jens Axboe. This is not how mainline
fixed it (the mainline fix is more generic, in set_page_dirty_lock), but this
approach seemed less intrusive for the 2.6.9 kernel.
Red Hat customer on this issue is: IOWA COURT INFORMATION SYSTEMS
Since mainline uses a different method to resolve the same problem, could you
also point out the link to that upstream patch?
I rechecked bio.c in 2.6.9, and it is clear that several places check the
PageCompound flag. According to the comments for bio_set_pages_dirty, the VM
doesn't handle the dirtiness of compound pages, so the patch should be right.
I will post it.
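The attached patch is not reproduced in this report, so the following is only a toy user-space model of the rule discussed above: pages flagged as compound (hugepage-backed) are skipped when pages are marked dirty ahead of I/O. The Page class and set_pages_dirty function are hypothetical stand-ins for illustration; they are not kernel code and not the patch itself.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for the kernel's struct page."""
    compound: bool = False  # models the PageCompound flag
    dirty: bool = False


def set_pages_dirty(pages):
    """Mark pages dirty ahead of I/O, skipping compound pages."""
    for page in pages:
        if page is not None and not page.compound:
            page.dirty = True  # stands in for set_page_dirty_lock(page)


pages = [Page(), Page(compound=True), None, Page()]
set_pages_dirty(pages)
print([p.dirty for p in pages if p is not None])  # [True, False, True]
```

In this model the compound page is never touched, which mirrors the observation that the VM does not track dirtiness for compound pages.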
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time. This request will be
reviewed for a future Red Hat Enterprise Linux release.
The patch has been posted; changing status to POST. If a re-post is necessary
for a future release, please let me know.
committed in stream U7 build 68.4. A test kernel with this patch is available
Can you please test and post your results here? Thanks!
Can you please report your test results here?
Oracle, this **high** severity bug is now a possible candidate for exclusion from
RHEL4.7. If you wish for this bug to be fixed in 4.7, please report your test
results here as soon as possible. Thank you.
Can you please have your test results posted to this BZ?
Thanks and Regards,
An Oracle Enterprise Linux kernel with this patch successfully passed
And the customer was given a copy of this kernel, and has not reported any
problems for months. Thanks, John
Oracle, thanks for the test results. I would highly recommend testing the latest
RHEL Snapshot available on partners.redhat.com to confirm that the kernel
currently being shipped addresses your issues. It is a minor concern, but
patches that were included in custom RPM builds (comment #12) sometimes get left
out or modified during the course of continued development. Please report any
problems you might encounter while testing RHEL4U7 Snapshot releases. Thanks!
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
Partners, I would like to thank you all for your participation in assuring the
quality of this RHEL 4.7 Update Release. My hat's off to you all. Thanks.