Bug 480170

Summary: multipath test causes memory leak and eventual system deadlock
Product: Red Hat Enterprise Linux 5 Reporter: Tom Coughlan <coughlan>
Component: device-mapper-multipath Assignee: Ben Marzinski <bmarzins>
Status: CLOSED DUPLICATE QA Contact: Cluster QE <mspqa-list>
Severity: urgent Docs Contact:
Priority: high    
Version: 5.4 CC: agk, andriusb, bdonahue, benl, bino.sebastian, bmarzins, bmr, christophe.varoqui, coughlan, dchapman, dwysocha, edamato, egoggin, heinzm, junichi.nomura, kueda, laurie.barry, lmb, mbroz, mchristi, phinchman, prockai, rick.hester, syeghiay, tranlan, vijayakumar
Target Milestone: rc Keywords: OtherQA, Regression
Target Release: ---   
Hardware: ia64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 478643 Environment:
Last Closed: 2009-01-15 19:23:35 UTC Type: ---
Bug Depends On: 478643    
Bug Blocks:    

Comment 1 RHEL Program Management 2009-01-15 15:37:05 UTC
This bugzilla has Keywords: Regression.  

Since no regressions are allowed between releases, 
it is also being proposed as a blocker for this release.  

Please resolve ASAP.

Comment 2 Ben Marzinski 2009-01-15 17:18:43 UTC
There.  I changed do_pipe() to pipe() since I assume our customers don't care what internal kernel function is failing.  They just want to know what system call could cause them problems.  I also changed the instructions so that users know that they only need to change max_fds if it needs to be above 1024.  The vast majority of users don't need to do anything.

Comment 3 Ben Marzinski 2009-01-15 17:18:43 UTC
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,6 +1,6 @@
 We need a 5.3 release note for this. Here is a draft. Ben, please review and improve as needed.
 
-It has been determined that 1024 byte objects in kernel slab may be lost when a call to do_pipe() fails. The problem occurs because do_pipe() allocates pipe files, and then tries to get free file descriptors for them.  If the process is out of file descriptors, do_pipe fails, but it does not clean up properly. A fix for this problem is planned for a forthcoming 5.3 kernel. 
+It has been determined that 1024 byte objects in kernel slab may be lost when a call to pipe() fails. The problem occurs because pipe() allocates pipe files, and then tries to get free file descriptors for them.  If the process is out of file descriptors, pipe() fails, but it does not clean up properly. A fix for this problem is planned for a forthcoming 5.3 kernel. 
 
 A workaround to avoid this problem is to ensure that the process calling do_pipe has adequate file descriptors. 
 
@@ -8,9 +8,9 @@
 
 32 fds + 1 fd per path
 
-For example if you have 32 LUNs with 4 paths each, use
+if this number is greater than the default of 1024. For example if you have 255 LUNs with 8 paths each, use
 
 defaults {
     ...
-    max_fds  160
+    max_fds  2072
 }

Comment 4 Tom Coughlan 2009-01-15 19:23:35 UTC
This BZ is covered by 480048 for device-mapper-multipath and 478643 for the kernel. Closing.

*** This bug has been marked as a duplicate of bug 480048 ***

Comment 5 Tom Coughlan 2009-01-15 19:23:35 UTC
Deleted Release Notes Contents.

Old Contents:
We need a 5.3 release note for this. Here is a draft. Ben, please review and improve as needed.

It has been determined that 1024 byte objects in kernel slab may be lost when a call to pipe() fails. The problem occurs because pipe() allocates pipe files, and then tries to get free file descriptors for them.  If the process is out of file descriptors, pipe() fails, but it does not clean up properly. A fix for this problem is planned for a forthcoming 5.3 kernel. 

A workaround to avoid this problem is to ensure that the process calling do_pipe has adequate file descriptors. 

This problem has been observed with multipathd in particular. To avoid the problem with multipathd, you should set max_fds in the defaults section of multipath.conf to 

32 fds + 1 fd per path

if this number is greater than the default of 1024. For example if you have 255 LUNs with 8 paths each, use

defaults {
    ...
    max_fds  2072
}
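
The sizing rule in the release note (32 fds + 1 fd per path, set only when the result exceeds the default of 1024) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of multipath-tools: the function names required_max_fds and max_fds_setting are hypothetical, and the constants 32 and 1024 are taken from the release note text above.

```python
# Sketch of the max_fds sizing rule from the release note.
# Function names are hypothetical; the constants come from the note.

BASE_FDS = 32           # fixed overhead quoted in the note ("32 fds")
DEFAULT_MAX_FDS = 1024  # default limit quoted in the note


def required_max_fds(luns: int, paths_per_lun: int) -> int:
    """Return 32 fds + 1 fd per path."""
    return BASE_FDS + luns * paths_per_lun


def max_fds_setting(luns: int, paths_per_lun: int):
    """Return the value to put in multipath.conf, or None if the
    default of 1024 is already sufficient."""
    needed = required_max_fds(luns, paths_per_lun)
    return needed if needed > DEFAULT_MAX_FDS else None


if __name__ == "__main__":
    for luns, paths in [(32, 4), (255, 8)]:
        setting = max_fds_setting(luns, paths)
        if setting is None:
            print(f"{luns} LUNs x {paths} paths: default is sufficient")
        else:
            print(f"{luns} LUNs x {paths} paths: max_fds  {setting}")
```

With 255 LUNs and 8 paths each this gives 2072, matching the example in the note. With 32 LUNs and 4 paths each the rule gives 160, which is below 1024, so no change to multipath.conf is needed.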