Bug 623644

Summary: multipathd consumes ten times more memory than on rhel5
Product: Red Hat Enterprise Linux 6
Reporter: michal novacek <mnovacek>
Component: device-mapper-multipath
Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA
QA Contact: Gris Ge <fge>
Severity: high
Docs Contact:
Priority: high
Version: 6.0
CC: abaron, agk, bdonahue, bmarzins, christophe.varoqui, coughlan, cpelland, cplisko, dwysocha, egoggin, fge, heinzm, junichi.nomura, kueda, lmb, mbroz, michael.hagmann, prockai, rmusil, tranlan
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-35.el6
Doc Type: Bug Fix
Doc Text:
The multipathd daemon consumed excessive memory when iSCSI devices were unloaded and reloaded. This occurred because the daemon was caching unnecessary sysfs data, which caused memory leaks. With this update, multipathd no longer caches these data; it frees the data when the associated device is removed.
Last Closed: 2011-05-19 14:11:51 UTC
Bug Blocks: 672151    
Attachments:
Description Flags
requested /proc/<pid>/maps list none

Description michal novacek 2010-08-12 12:27:41 UTC
Description of problem: 
multipathd consumes too much memory compared with rhel5 (~1G on rhel6 compared to ~100M on rhel5)

Version-Release number of selected component (if applicable):
rhel6-snap9, kernel: 2.6.32-54.el6.x86_64
device-mapper-multipath-0.4.9-24.el6.x86_64

How reproducible: always

Steps to Reproduce:
1. connect 20 iscsi devices	
2. start multipathd
3. observe memory consumed
   
Actual results: about 1G memory consumed.

Expected results: according to bz583898 there should be 10M per multipathed device plus 30M overhead.

Additional info: compared on the same hw with rhel5.5, software target.
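
For reference, the expected figure quoted above can be worked out for the 20-device case. This is only a minimal sketch: the helper name expected_mb is hypothetical, and the 10M-per-device and 30M-overhead numbers are the ones cited in this report, not verified values.

```shell
#!/bin/bash
# Expected multipathd footprint in MB, using the figures quoted in this report:
# 10 MB per multipathed device plus 30 MB of fixed overhead (assumed, not verified).
expected_mb() {
    local devices=$1
    echo $(( devices * 10 + 30 ))
}

# 20 devices, as in the steps to reproduce
expected_mb 20
```

By that arithmetic, 20 devices would be expected to need about 230M, against the roughly 1G observed.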

Comment 2 Alasdair Kergon 2010-08-12 12:40:06 UTC
Please attach the evidence - detailed ps output, /proc/<pid>/maps.  Even 10M per device sounds quite wrong if this is physical memory.

Comment 4 RHEL Program Management 2010-08-12 12:58:29 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 5 michal novacek 2010-08-12 14:55:21 UTC
Created attachment 438460
requested /proc/<pid>/maps list


I still have the machine available if desired.

Comment 6 Ben Marzinski 2010-08-12 19:01:11 UTC
I do see that the virtual memory size has gone up.  However, I see only a slight change in physical memory usage.  Are you seeing the same thing, or are you seeing a large increase in physical memory usage as well?

Also, I don't think 583898 is the bugzilla you meant to refer to. That bugzilla is about iscsi failover, and doesn't mention multipath memory usage at all.
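
One way to tell virtual size apart from physical (resident) memory for a process is to read the VmSize and VmRSS fields from /proc/<pid>/status. The following is a rough sketch, not a recommended procedure; the helper name vm_summary is hypothetical, while VmSize and VmRSS are standard Linux field names.

```shell
#!/bin/bash
# Print the virtual size (VmSize) and resident size (VmRSS) lines
# from a /proc/<pid>/status file. vm_summary is a hypothetical helper name.
vm_summary() {
    local status_file=$1
    awk '/^VmSize:|^VmRSS:/ { printf "%s %s %s\n", $1, $2, $3 }' "$status_file"
}

# Report on multipathd only if it is actually running on this machine.
if pid=$(pidof -s multipathd 2>/dev/null); then
    vm_summary "/proc/$pid/status"
fi
```

A large VmSize with a small VmRSS means address space has been reserved but is mostly not backed by physical memory.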

Comment 7 michal novacek 2010-08-13 08:46:41 UTC
That is what I observe -- virtual memory consumption increase.

The bug I had in mind is from this advisory: 
http://rhn.redhat.com/errata/RHBA-2009-1486.html

Comment 10 Ben Marzinski 2010-11-15 04:08:55 UTC
*** Bug 635555 has been marked as a duplicate of this bug. ***

Comment 11 Ayal Baron 2010-11-30 07:23:31 UTC
(In reply to comment #7)
> That is what I observe -- virtual memory consumption increase.
> 
> The bug I had in mind is from this advisory: 
> http://rhn.redhat.com/errata/RHBA-2009-1486.html

This is not only a virtual memory increase.  Copying the comment from the duplicate bug:

We recently noticed that on our systems (hosts running VDSM) the multipathd
memory footprint is suspiciously high.

This is what top shows:

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND              
1625 root      RT   0 1764m 126m 3652 S  0.0  1.6  73:23.99 multipathd          

And here is pmap -x

[root@white-vdsd ~]# pmap -x 1625
1625:   /sbin/multipathd
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000400000      64      64       0 r-x--  multipathd (deleted)
0000000000610000      12      12      12 rw---  multipathd (deleted)
0000000002294000     532     532     532 rw---    [ anon ]
0000003452800000     120     120       0 r-x--  ld-2.12.so
0000003452a1e000       4       4       4 r----  ld-2.12.so
0000003452a1f000       4       4       4 rw---  ld-2.12.so
0000003452a20000       4       4       4 rw---    [ anon ]
0000003452c00000    1492    1492       0 r-x--  libc-2.12.so
0000003452d75000    2048       0       0 -----  libc-2.12.so
0000003452f75000      16      16       8 r----  libc-2.12.so
0000003452f79000       4       4       4 rw---  libc-2.12.so
0000003452f7a000      20      20      20 rw---    [ anon ]
0000003453000000       8       8       0 r-x--  libdl-2.12.so
0000003453002000    2048       0       0 -----  libdl-2.12.so
0000003453202000       4       4       4 r----  libdl-2.12.so
0000003453203000       4       4       4 rw---  libdl-2.12.so
0000003453400000      92      92       0 r-x--  libpthread-2.12.so
0000003453417000    2048       0       0 -----  libpthread-2.12.so
0000003453617000       4       4       4 r----  libpthread-2.12.so
0000003453618000       4       4       4 rw---  libpthread-2.12.so
0000003453619000      16      16      16 rw---    [ anon ]
0000003453800000     116     116       0 r-x--  libselinux.so.1
000000345381d000    2044       0       0 -----  libselinux.so.1
0000003453a1c000       4       4       4 r----  libselinux.so.1
0000003453a1d000       4       4       4 rw---  libselinux.so.1
0000003453a1e000       4       4       4 rw---    [ anon ]
0000003453c00000     280     280       0 r-x--  libmultipath.so
0000003453c46000    2044       0       0 -----  libmultipath.so
0000003453e45000      16      16      16 rw---  libmultipath.so
0000003453e49000       4       4       4 rw---    [ anon ]
0000003454000000     136     136       0 r-x--  libncurses.so.5.7
0000003454022000    2044       0       0 -----  libncurses.so.5.7
0000003454221000       4       4       4 rw---  libncurses.so.5.7
0000003454400000     524     524       0 r-x--  libm-2.12.so
0000003454483000    2044       0       0 -----  libm-2.12.so
0000003454682000       4       4       4 r----  libm-2.12.so
0000003454683000       4       4       4 rw---  libm-2.12.so
0000003454800000     116     116       0 r-x--  libtinfo.so.5.7
000000345481d000    2048       0       0 -----  libtinfo.so.5.7
0000003454a1d000      16      16      16 rw---  libtinfo.so.5.7
0000003454c00000     240     240       0 r-x--  libsepol.so.1
0000003454c3c000    2044       0       0 -----  libsepol.so.1
0000003454e3b000       4       4       4 rw---  libsepol.so.1
0000003459000000     232     232       0 r-x--  libreadline.so.6.0
000000345903a000    2048       0       0 -----  libreadline.so.6.0
000000345923a000      32      32      32 rw---  libreadline.so.6.0
0000003459242000       4       4       4 rw---    [ anon ]
000000345b000000      56      56       0 r-x--  libudev.so.0.5.1
000000345b00e000    2048       0       0 -----  libudev.so.0.5.1
000000345b20e000       4       4       4 rw---  libudev.so.0.5.1
0000003b0b600000       4       4       0 r-x--  libaio.so.1.0.1
0000003b0b601000    2044       4       0 -----  libaio.so.1.0.1
0000003b0b800000       4       4       4 rw---  libaio.so.1.0.1
00007ff830000000   56292   56292   56292 rw---    [ anon ]
00007ff8336f9000    9244       0       0 -----    [ anon ]
00007ff838000000     132      72      72 rw---    [ anon ]
00007ff838021000   65404       0       0 -----    [ anon ]
00007ff83c000000     132      72      72 rw---    [ anon ]
00007ff83c021000   65404       0       0 -----    [ anon ]
00007ff840000000     132      72      72 rw---    [ anon ]
00007ff840021000   65404       0       0 -----    [ anon ]
00007ff844000000     132      72      72 rw---    [ anon ]
00007ff844021000   65404       0       0 -----    [ anon ]
00007ff848000000     132      72      72 rw---    [ anon ]
00007ff848021000   65404       0       0 -----    [ anon ]
00007ff84c000000     132      72      72 rw---    [ anon ]
00007ff84c021000   65404       0       0 -----    [ anon ]
00007ff850000000     132      72      72 rw---    [ anon ]
00007ff850021000   65404       0       0 -----    [ anon ]
00007ff854000000     132      72      72 rw---    [ anon ]
00007ff854021000   65404       0       0 -----    [ anon ]
00007ff858000000     132      72      72 rw---    [ anon ]
00007ff858021000   65404       0       0 -----    [ anon ]
00007ff85c000000     132      72      72 rw---    [ anon ]
00007ff85c021000   65404       0       0 -----    [ anon ]
00007ff860000000     132     108     108 rw---    [ anon ]
00007ff860021000   65404       0       0 -----    [ anon ]
00007ff864000000     132      72      72 rw---    [ anon ]
00007ff864021000   65404       0       0 -----    [ anon ]
00007ff868000000     132      72      72 rw---    [ anon ]
00007ff868021000   65404       0       0 -----    [ anon ]
00007ff86c000000     132      72      72 rw---    [ anon ]
00007ff86c021000   65404       0       0 -----    [ anon ]
00007ff870000000   65536   65536   65536 rw---    [ anon ]
00007ff874000000     132     108     108 rw---    [ anon ]
00007ff874021000   65404       0       0 -----    [ anon ]
00007ff878000000     132      20      20 rw---    [ anon ]
00007ff878021000   65404       0       0 -----    [ anon ]
00007ff87c000000     132      72      72 rw---    [ anon ]
00007ff87c021000   65404       0       0 -----    [ anon ]
00007ff880000000     132      20      20 rw---    [ anon ]
00007ff880021000   65404       0       0 -----    [ anon ]
00007ff888000000     132      20      20 rw---    [ anon ]
00007ff888021000   65404       0       0 -----    [ anon ]
00007ff890000000     132      20      20 rw---    [ anon ]
00007ff890021000   65404       0       0 -----    [ anon ]
00007ff898000000     132      72      72 rw---    [ anon ]
00007ff898021000   65404       0       0 -----    [ anon ]
00007ff8a0000000     132      20      20 rw---    [ anon ]
00007ff8a0021000   65404       0       0 -----    [ anon ]
00007ff8a8000000     132       8       8 rw---    [ anon ]
00007ff8a8021000   65404       0       0 -----    [ anon ]
00007ff8b0000000     132      96      96 rw---    [ anon ]
00007ff8b0021000   65404       0       0 -----    [ anon ]
00007ff8b8000000     132      16      16 rw---    [ anon ]
00007ff8b8021000   65404       0       0 -----    [ anon ]
00007ff8be537000    1032    1032    1032 rw---    [ anon ]
00007ff8be6b6000       4       4       4 -----    [ anon ]
00007ff8be6b7000      32      32      32 rw---    [ anon ]
00007ff8be6bf000       4       4       4 -----    [ anon ]
00007ff8be6c0000      32      32      32 rw---    [ anon ]
00007ff8be6c8000       4       4       4 -----    [ anon ]
00007ff8be6c9000      32      32      32 rw---    [ anon ]
00007ff8be6d1000       4       4       4 -----    [ anon ]
00007ff8be6d2000      32      32      32 rw---    [ anon ]
00007ff8be6da000       4       4       4 -----    [ anon ]
00007ff8be6db000      32      32      32 rw---    [ anon ]
00007ff8be6e3000       4       4       4 -----    [ anon ]
00007ff8be6e4000      32      32      32 rw---    [ anon ]
00007ff8be6ec000       4       4       4 -----    [ anon ]
00007ff8be6ed000      32      32      32 rw---    [ anon ]
00007ff8be6f5000       4       4       4 -----    [ anon ]
00007ff8be6f6000      36      36      36 rw---    [ anon ]
00007ff8be6ff000       4       4       4 -----    [ anon ]
00007ff8be700000      32      32      32 rw---    [ anon ]
00007ff8be708000       4       4       4 -----    [ anon ]
00007ff8be709000      32      32      32 rw---    [ anon ]
00007ff8be711000       4       4       4 -----    [ anon ]
00007ff8be712000      32      32      32 rw---    [ anon ]
00007ff8be71a000       4       4       4 -----    [ anon ]
00007ff8be71b000      32      32      32 rw---    [ anon ]
00007ff8be723000       4       4       4 -----    [ anon ]
00007ff8be724000      32      32      32 rw---    [ anon ]
00007ff8be72c000       4       4       4 -----    [ anon ]
00007ff8be72d000      32      32      32 rw---    [ anon ]
00007ff8be735000       4       4       4 -----    [ anon ]
00007ff8be736000      28      28      28 rw---    [ anon ]
00007ff8be73d000       4       4       4 -----    [ anon ]
00007ff8be73e000      64      64      64 rw---    [ anon ]
00007ff8be74e000       4       4       4 -----    [ anon ]
00007ff8be74f000      64      64      64 rw---    [ anon ]
00007ff8be75f000       4       4       4 -----    [ anon ]
00007ff8be760000      64      64      64 rw---    [ anon ]
00007ff8be770000       4       4       4 -----    [ anon ]
00007ff8be771000      64      64      64 rw---    [ anon ]
00007ff8be781000       4       4       4 -----    [ anon ]
00007ff8be782000      28      28      28 rw---    [ anon ]
00007ff8be789000       4       4       4 -----    [ anon ]
00007ff8be78a000      28      28      28 rw---    [ anon ]
00007ff8be791000       4       4       4 -----    [ anon ]
00007ff8be792000      28      28      28 rw---    [ anon ]
00007ff8be799000       4       4       4 -----    [ anon ]
00007ff8be79a000      28      28      28 rw---    [ anon ]
00007ff8be7a1000       4       4       4 -----    [ anon ]
00007ff8be7a2000      28      28      28 rw---    [ anon ]
00007ff8be7a9000       4       4       4 -----    [ anon ]
00007ff8be7aa000      48      48      48 rw---    [ anon ]
00007ff8be7b6000       4       4       0 r-x--  libprioconst.so
00007ff8be7b7000    2044       0       0 -----  libprioconst.so
00007ff8be9b6000       4       4       4 rw---  libprioconst.so
00007ff8be9b7000       8       8       0 r-x--  libcheckdirectio.so
00007ff8be9b9000    2044       4       0 -----  libcheckdirectio.so
00007ff8bebb8000       4       4       4 rw---  libcheckdirectio.so
00007ff8bebb9000       4       4       4 -----    [ anon ]
00007ff8bebba000      92      92      92 rw---    [ anon ]
00007ff8bebd1000     140     140       0 r-x--  libdevmapper.so.1.02.#prelink#.TNCsif (deleted)
00007ff8bebf4000    2048       0       0 -----  libdevmapper.so.1.02.#prelink#.TNCsif (deleted)
00007ff8bedf4000       8       8       8 rw---  libdevmapper.so.1.02.#prelink#.TNCsif (deleted)
00007ff8bedf6000      36      36      36 rw---    [ anon ]
00007fffcff04000      84      84      84 rw---    [ stack ]
00007fffcffff000       4       4       0 r-x--    [ anon ]
ffffffffff600000       4       0       0 r-x--    [ anon ]
----------------  ------  ------  ------
total kB         1806824  130012  126360


That is on RHEL6:

[root@white-vdsd ~]# rpm -qa | grep multip
device-mapper-multipath-0.4.9-25.el6.x86_64
device-mapper-multipath-libs-0.4.9-25.el6.x86_64
[root@white-vdsd ~]#
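
In the pmap listing above, many anonymous regions come in pairs: a 132 kB rw--- region followed by a 65404 kB ----- region (65536 kB together), and the ----- half is almost never resident. A quick way to see how much of the total falls into each group is to sum the columns by mode. This is a minimal sketch; the helper name pmap_summary is hypothetical, and it assumes the `pmap -x` column layout shown in this comment.

```shell
#!/bin/bash
# Summarize `pmap -x` output read from stdin: mapped kB and resident kB,
# split into inaccessible (mode -----) regions and everything else.
# pmap_summary is a hypothetical helper name.
pmap_summary() {
    awk '
        # Data rows start with a hex address followed by numeric Kbytes and RSS.
        $1 ~ /^[0-9a-f]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            if ($5 == "-----") { res_kb += $2; res_rss += $3 }
            else               { use_kb += $2; use_rss += $3 }
        }
        END {
            printf "inaccessible: %d kB mapped, %d kB resident\n", res_kb, res_rss
            printf "accessible: %d kB mapped, %d kB resident\n", use_kb, use_rss
        }'
}

# Summarize the running multipathd, if there is one.
if pid=$(pidof -s multipathd 2>/dev/null); then
    pmap -x "$pid" | pmap_summary
fi
```

The header, separator, and total lines are skipped because their first field is not a plain hex address.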

Comment 14 Ben Marzinski 2010-12-14 04:28:44 UTC
Unfortunately, this one isn't falling into place as easily as I was hoping. I do see a memory difference between RHEL5 and RHEL6, but if I compile the RHEL5 code on RHEL6, I see exactly the same memory usage as with the RHEL6 code.  However, the memory issue you are seeing may very well still be multipathd's fault. I have not actually been able to make multipathd use anywhere near that much physical memory on RHEL6, so I can't tell whether the RHEL5 code can show this huge physical memory usage as well.  How are you doing that?

Do you see multipathd use that much physical memory right from the start?  If not, what sort of tests are you running when the memory use increases?  Is it possible to occasionally run something like

# ps -eo comm,pid,rss,vsz,pmem | grep multipathd

to track the physical memory use as it increases, to help pin down what is causing this?  Would it be possible for me to get on a system that can reproduce this issue?
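
The ps command above could be wrapped so that each run produces one timestamped record that is easy to compare later. This is only a sketch under assumptions: the helper name format_sample and the log path in the usage comment are hypothetical; the ps fields are the ones from the command above.

```shell
#!/bin/bash
# Turn `ps -eo comm,pid,rss,vsz,pmem` output (on stdin) into one record per
# matching process: "<timestamp> <pid> <rss_kB> <vsz_kB> <pmem>".
# format_sample is a hypothetical helper name.
format_sample() {
    local name=$1 ts=$2
    awk -v name="$name" -v ts="$ts" '$1 == name { print ts, $2, $3, $4, $5 }'
}

# Example use (hypothetical log path): one sample per minute until interrupted.
# while true; do
#     ps -eo comm,pid,rss,vsz,pmem | format_sample multipathd "$(date +%s)" \
#         >> /tmp/multipathd-mem.log
#     sleep 60
# done
```

Matching on the exact command name avoids also picking up the grep process itself.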

Comment 18 Ben Marzinski 2011-01-24 05:11:53 UTC
Fixed two memory leaks associated with caching sysfs data.

Comment 21 Gris Ge 2011-03-11 03:35:11 UTC
Reproduced this issue in RHEL6 GA kernel 72.

Verified that the memory leak has been fixed in kernel-2.6.32-118.el6.x86_64.

This is the test script:

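#!/bin/bash
# Repeatedly unload and reload scsi_debug devices and print multipathd's
# memory use after each round.
for X in $(seq 1 10); do
    # Flush all unused multipath device maps
    multipath -F >/dev/null 2>&1
    # Remove and re-create the simulated SCSI devices
    modprobe -r scsi_debug
    modprobe scsi_debug max_luns=1 add_host=4 num_tgts=2 vpd_use_hostno=0
    sleep 4
    # Show multipathd's VSZ and RSS once the devices are back
    ps u -p "$(pidof multipathd)"
done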
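
To judge the result of a loop like this, the RSS value from the first and last iteration can be compared. The following is a rough sketch rather than part of the original test; the helper name rss_growth is hypothetical, and it assumes one RSS value in kB per input line (for example collected with `ps -o rss= -p <pid>` on each iteration).

```shell
#!/bin/bash
# Read one RSS value (kB) per line on stdin and print
# "<first> <last> <last - first>". rss_growth is a hypothetical helper name.
rss_growth() {
    awk 'NR == 1 { first = $1 }
         { last = $1 }
         END { if (NR) print first, last, last - first }'
}

# Example with made-up sample values, not measurements from this bug:
printf '%s\n' 1000 1040 1100 | rss_growth
```

A difference that keeps growing with the number of iterations suggests a leak; a difference that levels off does not.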

Comment 22 Eva Kopalova 2011-05-02 13:52:58 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The multipathd daemon consumed excessive memory when iSCSI devices were unloaded and reloaded. This occurred because the daemon was caching unnecessary sysfs data, which caused memory leaks. With this update, multipathd no longer caches these data; it frees the data when the associated device is removed.

Comment 23 errata-xmlrpc 2011-05-19 14:11:51 UTC
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0725.html