This bug is a performance issue relating to mmap of the same file from multiple nodes at once. The issue affects only the initial call to mmap and not subsequent page faults, so this will only be noticeable in cases where mmap is called frequently from multiple nodes on the same file. This occurs when running BLAST, for example.
There is a workaround, which is to alter the application to always use O_NOATIME when opening the files to be mapped. This is only possible if the opening process is the file owner or is root. After applying this patch, the workaround is no longer required.
The patch changes the tests applied at mmap time such that for noatime mounts, a glock will not be taken at all. For atime mounts, only a shared glock will be taken at mmap time, although if an atime update is required, an exclusive glock will still be required at a later time to write back the new atime.
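The locking rules in the note above can be written down as a small userspace model. This is only a sketch of the described behaviour, not the gfs2 code: the enum and both function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical names: a userspace model of the rules described in the
 * technical note, not the gfs2 source. */
enum glock_mode { GLOCK_NONE, GLOCK_SHARED, GLOCK_EXCLUSIVE };

/* Lock taken at mmap() time once the patch is applied. */
static enum glock_mode glock_at_mmap(bool noatime_mount)
{
    return noatime_mount ? GLOCK_NONE : GLOCK_SHARED;
}

/* Lock needed later, when a pending atime update is written back. */
static enum glock_mode glock_for_atime_writeback(bool noatime_mount,
                                                 bool update_required)
{
    if (noatime_mount || !update_required)
        return GLOCK_NONE;
    return GLOCK_EXCLUSIVE;
}
```

In this model the exclusive lock only appears on the write-back path of an atime mount, which is why the initial mmap call no longer contends between nodes.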
Description Steve Whitehouse
2011-02-01 09:53:43 UTC
+++ This bug was initially created as a clone of Bug #672724 +++
Created attachment 475319
Program to demonstrate the problem.
Description of problem: When an application uses mmap to map in a file in a gfs2 filesystem in a read-only mode, it acquires an exclusive glock, even with noatime set on the filesystem. This has a significant impact on the performance of subsequent invocations of the application if the same file is accessed on multiple nodes.
Version-Release number of selected component (if applicable): 2.6.18-238.el5
How reproducible: Always
Steps to Reproduce:
1. Start with a file on a gfs2 filesystem that has no cached glocks
2. Run the attached application to map that file read-only
3.
Actual results: An exclusive glock will be created
Expected results: Only shared locks should be created
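The attachment itself is not reproduced in this report. A minimal sketch of what such a test program might do is shown below; the function name map_read_only is an assumption for illustration and this is not the attached program.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map `path` read-only, touch the first byte, and unmap again.
 * Returns 0 on success, -1 on any failure. */
static int map_read_only(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    /* Force one page fault so the mapping is actually used. */
    volatile unsigned char first = *(const unsigned char *)p;
    (void)first;

    return munmap(p, (size_t)st.st_size);
}
```

Calling this on a file in the gfs2 filesystem from one node, then inspecting the glocks held for that inode, should show which lock mode the mmap call requested.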
--- Additional comment from swhiteho on 2011-01-31 12:45:19 EST ---
I've tracked down what is going on here....
It is all down to the test used in the ->mmap() function, which is supposed to skip the EX lock if there are no atime updates to be performed. The reason that the EX lock is being taken is that there are a number of different ways in which the noatime state can be set: via the mount flags, via the O_NOATIME file flag and via the S_NOATIME flag (set on a per-file basis via setattr).
The code checks only for O_NOATIME (which, if set, does prevent grabbing the EX lock), but the check is repeated later on in the VFS atime code, so that the actual atime updates are done correctly. It's only the locking that isn't quite correct.
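To make the gap concrete, here is a rough userspace model of the two checks. The struct and function names are invented for illustration and do not correspond to the kernel source; the only thing taken from the analysis above is which of the three noatime sources each check looks at.

```c
#include <stdbool.h>

/* Hypothetical model of the three ways noatime can be set. */
struct atime_state {
    bool mount_noatime;   /* filesystem mounted with noatime */
    bool file_o_noatime;  /* file opened with O_NOATIME */
    bool inode_noatime;   /* S_NOATIME set on the inode */
};

/* The check as described for the affected kernel: only O_NOATIME counts,
 * so a noatime mount alone does not avoid the EX lock. */
static bool skips_ex_lock_old(const struct atime_state *s)
{
    return s->file_o_noatime;
}

/* A check covering all three sources, as the later VFS atime code does. */
static bool atime_updates_disabled(const struct atime_state *s)
{
    return s->mount_noatime || s->file_o_noatime || s->inode_noatime;
}
```

With only mount_noatime set, the first function returns false while the second returns true, which is the mismatch that leads to the unnecessary EX lock.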
So if you have access to the source code, there is a temporary workaround: open the files to be mmapped with O_NOATIME. Note that this only happens on mmap() and not on page faults, so if the files are mmap()ed just once and then used many times, only the initial mmap call will require an EX lock. After that point all the locks will be PR (for read-only access, even if the file is mapped read/write).
That should allow you to get on with your BLAST runs. I'll try and get a patch sorted out for this as soon as I can.
--- Additional comment from scooter.edu on 2011-01-31 13:48:28 EST ---
Steve,
Excellent news!! We'll change BLAST right away and let you know the impact. Since the loader uses mmap() quite heavily, we are still interested in a patched kernel. This explains some symptoms that we had early on that we weren't able to explain (so we worked around them).
--- Additional comment from scooter.edu on 2011-01-31 17:05:17 EST ---
Steve, it turns out that O_NOATIME can only be used if you are the file owner or root, which is not a good solution for shared databases :-( We'll go ahead and get the timings to make sure that this works as expected, though.
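For applications that want to try the workaround anyway, one possible approach is to request O_NOATIME and fall back to a plain open when the kernel refuses it with EPERM. The helper name open_prefer_noatime is made up for this sketch; on an unpatched kernel the fallback path still takes the EX lock at mmap time.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* O_NOATIME is a Linux extension */
#endif
#include <errno.h>
#include <fcntl.h>

/* Open `path` read-only, using O_NOATIME when the caller is allowed to.
 * Returns a file descriptor, or -1 with errno set. */
static int open_prefer_noatime(const char *path)
{
    int fd = open(path, O_RDONLY | O_NOATIME);
    if (fd < 0 && errno == EPERM) {
        /* Caller is neither the file owner nor privileged:
         * retry without the flag. */
        fd = open(path, O_RDONLY);
    }
    return fd;
}
```

The returned descriptor can then be passed to mmap() as usual.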
Comment 1 RHEL Program Management
2011-02-01 10:08:37 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.
Comment 4 RHEL Program Management
2011-02-01 19:13:28 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux. The error has been fixed and this
request has been re-proposed for the current release.
Comment 5 RHEL Program Management
2011-02-01 19:30:45 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2011-0542.html