Bug 990708

Summary: Application fails in do_mmap_pgoff
Product: Red Hat Enterprise Linux 5
Reporter: Matthew Whitehead <mwhitehe>
Component: kernel
Assignee: Larry Woodman <lwoodman>
Status: CLOSED NOTABUG
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: urgent
Priority: unspecified
Version: 5.5
CC: aquini, cww, dgibson, fleite, hhuang, jweiner, lwang, riel
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2013-08-13 03:15:16 UTC
Type: Bug

Description Matthew Whitehead 2013-07-31 19:46:37 UTC
Description of problem: When the application calls mmap64, the subroutine call to do_mmap_pgoff() appears to fail. Because of this failure, a semaphore that is held at the time is never released, and the application hangs.

Engineering suspects a failed call to make_pages_present() on line 1157 of mm/mmap.c may be a cause, but this requires more investigation.

The customer is running the application with realtime priority 1 (the lowest of the realtime priorities) under the FIFO scheduler. The host has 32 cores, 30 of which are used for the application.
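
For reference, the scheduling setup described above (SCHED_FIFO at priority 1) can be requested from user space roughly as follows. This is only a sketch, not the customer's code: the helper names fifo_priority_range and try_set_fifo are hypothetical, and it assumes Linux with Python 3, where os.sched_setscheduler is available. Switching to a realtime policy normally needs root or CAP_SYS_NICE, so the sketch reports a refusal instead of raising.

```python
import os


def fifo_priority_range():
    """Return (min, max) static priorities the kernel accepts for SCHED_FIFO."""
    return (os.sched_get_priority_min(os.SCHED_FIFO),
            os.sched_get_priority_max(os.SCHED_FIFO))


def try_set_fifo(priority=1, pid=0):
    """Try to switch `pid` (0 = the calling process) to SCHED_FIFO.

    Returns True on success and False if the kernel refuses with EPERM,
    which is the usual result without root or CAP_SYS_NICE.
    """
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False
    return True
```

Calling try_set_fifo(fifo_priority_range()[0]) requests the lowest FIFO priority, which corresponds to the "realtime priority 1" mentioned in the report.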

They boot with the kernel parameter isolcpus=1-7,9-31, which is uncommon.


Version-Release number of selected component (if applicable): RHEL5.5 kernel 2.6.18-238.21.1.el5.


How reproducible: Four instances of this are believed to have happened to the customer.

Steps to Reproduce: Unknown.

Actual results: 


Expected results: do_mmap_pgoff() should be successful.


Additional info:

Crash dump vmcore.00914556 is being uploaded to dropbox.

Comment 1 Matthew Whitehead 2013-07-31 19:56:28 UTC
Correction: the kernel parameter is actually isolcpus=1,3-31.
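
To make the difference between the two isolcpus values concrete, the CPU lists can be expanded with a small helper. This is an illustrative sketch only: parse_cpu_list and non_isolated are hypothetical names, and the parser handles just comma-separated numbers and simple ranges, which is all that the two values in this report use.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as "1,3-31" into a sorted list of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def non_isolated(spec, total_cpus):
    """Return the CPUs in 0..total_cpus-1 that are not named in `spec`."""
    isolated = set(parse_cpu_list(spec))
    return [cpu for cpu in range(total_cpus) if cpu not in isolated]


for spec in ("1-7,9-31", "1,3-31"):
    print(spec, "isolates", len(parse_cpu_list(spec)), "CPUs; not isolated:",
          non_isolated(spec, 32))
```

Both values isolate 30 of the 32 CPUs, matching the 30 cores used for the application. They differ only in which two CPUs stay outside the isolated set: 0 and 8 for the original value, 0 and 2 for the corrected one.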

Comment 2 Matthew Whitehead 2013-08-01 13:53:51 UTC
Making bug public.

Comment 9 David Gibson 2013-08-13 03:15:16 UTC
Since we've identified this as an application / library interaction problem, I'm closing this as NOTABUG.

We can re-open if the customer finds a real RH bug in their further investigation.