Bug 855357

Summary: cat /proc/pid/numa_maps seems to lock processes while generating its data, which is disruptive
Product: Red Hat Enterprise Linux 5 Reporter: Simon J Mudd <sjmudd>
Component: kernel Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED WONTFIX QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 5.6 CC: jeremy, lwoodman, sjmudd
Target Milestone: rc Flags: pm-rhel: needinfo? (sjmudd)
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-02 09:22:31 EDT Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:

Description Simon J Mudd 2012-09-07 09:59:02 EDT
Description of problem:

We run database (mysqld) servers with a large amount of memory (192GB) and have found that reading /proc/pid/numa_maps interferes with the mysqld process being "monitored".

See: https://raw.github.com/jeremycole/blog-files/master/numa-maps-summary.pl
See: http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

Version-Release number of selected component (if applicable):

[ Running CentOS but reporting upstream. ]

# cat /etc/redhat-release 
CentOS release 5.6 (Final)

Linux my-hostname 2.6.18-238.el5 #1 SMP Thu Jan 13 15:51:15 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Completely. Several servers were affected by this.

Sorry I don't have a better test case but this is what we see:

Steps to Reproduce:
1. Start mysqld with innodb_buffer_pool_size = 160G (on a 192GB box)
2. Have a process doing continual connects (with a 1 second timeout) to the mysqld server
3. Run numa-maps-summary.pl < /proc/<mysqld_pid>/numa_maps  
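
The connect loop in step 2 could be sketched roughly as follows. This is only an illustration, not the client that was actually used: the function names, the 0.1-second pause between attempts, and the default port 3306 are assumptions. Because a TCP connect can complete in the kernel's listen backlog even while the server is stalled, the sketch also waits for the first byte of the server greeting before counting an attempt as successful.

```python
import socket
import time


def probe_once(host, port, timeout=1.0):
    """Connect and wait for the first greeting byte; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Spend whatever is left of the timeout waiting for the greeting.
            remaining = max(timeout - (time.monotonic() - start), 0.001)
            sock.settimeout(remaining)
            ok = len(sock.recv(1)) > 0
    except OSError:
        # Covers refused connections as well as connect/recv timeouts.
        ok = False
    return ok, time.monotonic() - start


def probe_loop(host, port=3306, timeout=1.0, attempts=600, interval=0.1):
    """Probe repeatedly, print each failed attempt, and return the failure count."""
    failures = 0
    for _ in range(attempts):
        ok, elapsed = probe_once(host, port, timeout)
        if not ok:
            failures += 1
            print(f"{time.strftime('%H:%M:%S')} no greeting after {elapsed:.3f}s")
        time.sleep(interval)
    return failures
```

Running probe_loop("127.0.0.1") in one terminal while reading the numa_maps file in another should show whether failures line up with the read.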

Actual results:

Observe how this generates connect timeouts while the numa-maps script is running, and not while it isn't.

Expected results:

Ideally, cat'ing /proc/pid/numa_maps should not block the process that the file refers to. While this may be expected kernel behaviour, reading the numa_maps information can be important for debugging memory usage, and if doing so is disruptive, that prevents it from being used.
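
For context, the kind of debugging information wanted here is a per-node page count. A minimal sketch of such a summary is shown below; it is not the numa-maps-summary.pl script linked above, and the function name is hypothetical. It relies only on the `N<node>=<pages>` fields that appear on numa_maps lines.

```python
import re
from collections import Counter

# Matches fields such as "N0=1234" (NUMA node number, then page count).
NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")


def pages_per_node(lines):
    """Sum the per-node page counts over all numa_maps lines."""
    totals = Counter()
    for line in lines:
        for field in line.split():
            match = NODE_FIELD.match(field)
            if match:
                totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)
```

It can be fed an open file object, for example `pages_per_node(open("/proc/self/numa_maps"))`. Reading the file is what triggers the page walk, so this sketch has the same side effect as cat.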

Additional info:

This is not a mysqld bug, as the issue only occurs when the proc file is being accessed. My guess is that cat'ing the numa_maps file generates the required information dynamically and locks the relevant process while doing so. This might not normally be noticed, but with the short connect timeout it is quite disruptive.
Comment 1 Larry Woodman 2012-09-19 11:08:45 EDT
I don't think cat'ing /proc/pid/numa_maps locks the process being inspected for long periods of time. It does, however, have to take mm->page_table_lock while it looks at the pages mapped into the vma for each region, so that they can't change underneath it while it's walking. I don't have much of a reproducer to see this problem happening; can you come up with something stand-alone? Also, RHEL6 looks pretty similar to the upstream kernel in this area. Are you seeing this problem with the upstream kernel as well?

Larry Woodman
Comment 2 Simon J Mudd 2012-09-20 16:44:36 EDT

My guess is that the problem is simply caused by the page walk of the ~170GB process taking longer than the 2-second connect timeout configured on the mysql client.
That said, considering that mysql connects are normally accepted in milliseconds, the change is intrusive. I'd like to be able to read the numa_maps memory layout in a way that does not have this side effect.

Let me see if I can reproduce this on a similarly configured CentOS 6.2 server.
Comment 3 Larry Woodman 2012-09-20 16:51:53 EDT
And at the same time I'll get a large (~256GB) system, write a program that maps most of the memory, and time a cat /proc/<pid>/numa_maps of that process. If the time is excessive, I'll evaluate where it's hanging out. I just don't know what can be done about it, since there are locking requirements involved.
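
A stand-alone test along these lines might look like the sketch below. It is an assumption about the shape of such a program, not the one described in this comment: the function names are hypothetical, and a real test on a ~256GB machine would more likely be written in C, since touching that much memory page by page from Python is slow.

```python
import mmap
import os
import time


def map_and_touch(size_bytes):
    """Create an anonymous mapping and write one byte per page to populate it."""
    region = mmap.mmap(-1, size_bytes)
    for offset in range(0, size_bytes, mmap.PAGESIZE):
        region[offset] = 1
    return region


def time_read(path):
    """Read a file to the end; return (elapsed_seconds, bytes_read)."""
    start = time.monotonic()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(1 << 16)
            if not chunk:
                break
            total += len(chunk)
    return time.monotonic() - start, total


def measure_self(size_bytes):
    """Populate a mapping, then time a read of this process's own numa_maps."""
    region = map_and_touch(size_bytes)
    try:
        # Assumes a Linux kernel built with NUMA support, so the file exists.
        return time_read(f"/proc/{os.getpid()}/numa_maps")
    finally:
        region.close()
```

Calling `measure_self(1 << 30)` with increasing sizes would give a rough idea of how the read time grows with the amount of mapped memory. To observe the effect on another process, keep the mapping alive in one process and call time_read on its numa_maps path from a second one.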

Comment 4 RHEL Product and Program Management 2014-03-07 08:54:39 EST
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.
Comment 5 RHEL Product and Program Management 2014-06-02 09:22:31 EDT
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).