Red Hat Bugzilla – Bug 473961
clvmd memory leak
Last modified: 2014-06-30 07:52:20 EDT
Description of problem:
frequent calls to 'lvdisplay' and 'vgdisplay' cause clvmd to allocate more memory, which is never released...
Version-Release number of selected component (if applicable):
Currently we are using lvm2-cluster-2.02.32-4.el5
Steps to Reproduce:
1. Set up a simple cluster with a logical volume (clustered).
2. open one window with 'top' to monitor the '%MEM'
3. in another terminal window run this simple loop and watch the memory grow:
while true; do vgdisplay; done
while true; do lvdisplay; done
Currently we are using Nagios, so it's the frequent NRPE calls to vgdisplay and lvdisplay that are doing it for us...
clvmd accumulates to over 1 GIG of memory after 8 days or so...
I have profiled clvmd quite extensively and can't find a memory leak in that code.
However, I did find a very small, occasional leak in libcman - the library that clvmd uses to communicate with the cluster manager.
It only occurs when messages for the client (clvmd) are queued up. Running vgdisplay in a tight loop as you suggest showed one or two 288-byte blocks leaked over a few minutes, so it is conceivable that they could add up to a gigabyte over several days, given that clvmd will probably be doing other things too.
The patch to fix this is in the git master and STABLE2 branches and we have just missed RHEL5.3 so I'll add this to the RHEL5.4 update.
Author: Christine Caulfield <email@example.com>
Date: Fri Dec 12 10:30:12 2008 +0000
cman: fix memory leak
I'm not sure what access you have on your end, but in ticket #1877308 a support representative reports that he is unable to confirm this memory leak as resolved using the (new) stable RHEL branch.
The representative also claims that the issue does not currently appear in RHEL 5.3, which is partially good news - it would mean the issue got resolved by some other means. Our company cannot upgrade to 5.3 until it leaves its beta stage, and even then we would need a long planning period for the upgrade.
I have not verified these comments myself; I'm just trusting the comments made by your support team over there.
We are impacted on this bug too.
On our clusters, some scripts check logical volume status, and cause memory usage of clvmd to grow...
The memory usage grows on both nodes, not only on the node that runs the lvdisplay commands.
The resulting cluster is not stable...
This is a blocking point for us.
Good news!
Here are the new values, as of today, the 22nd:
         VIRT  RES  SHR
psp341 114m 88m 56m
psp342 114m 88m 56m
psu339 188m 98m 56m
psu340 178m 88m 56m
psi225 177m 87m 56m
psi227 113m 87m 56m
psi225 and psi227 are new.
The values did not increase... the leak seems to have disappeared.
~~ Attention - RHEL 5.4 Beta Released! ~~
RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!
If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.
Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update the Verified field with the appropriate value.
Questions can be posted to this bug or your customer or partner representative.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.