Bug 212634
Summary: | rgmanager times out when using clustat
---|---
Product: | [Retired] Red Hat Cluster Suite
Component: | rgmanager
Version: | 4
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
Hardware: | All
OS: | Linux
Reporter: | Lenny Maiorani <lenny>
Assignee: | Lon Hohberger <lhh>
QA Contact: | Cluster QE <mspqa-list>
CC: | aberoham, cluster-maint, jplans, pdemauro, rkenna, tjaszowski
Fixed In Version: | RHBA-2007-0149
Doc Type: | Bug Fix
Last Closed: | 2007-05-10 21:19:27 UTC
Bug Blocks: | 218112

Description
Lenny Maiorani
2006-10-27 19:50:49 UTC
Created attachment 139610: dlm debug and lock info from /proc

Lenny, when it can't allocate memory, is it userspace? E.g., is there any process obviously soaking up all memory on the system?

Memory usage was normal. Not sure if it is the 'cat' complaining about memory or the /proc fs.

Can you get /proc/slabinfo from the nodes and, if possible, 'ps -auwwx'?

We do not have a way of reproducing this, but if it comes up again I will get this info.

Lon, I am seeing this now on several clusters. They are all complaining in /proc/cluster/dlm_debug from clvmd. I will attach some logs...

Created attachment 142198: /proc/cluster/dlm_debug
Created attachment 142199: /proc/meminfo
Created attachment 142200: ps -auwwx
Created attachment 142201: /proc/slabinfo

Lenny, I am pretty sure this is a bug in rgmanager which is produced by the clu_lock_verbose() function. I'll have a build ready soon.

Since the clu_lock_verbose() function does nothing useful, I'm removing it from RHCS4 (it's already been removed in RHCS5).

Created attachment 143442: Fixes subtle dlm lock leak created by rgmanager
Created attachment 143443: Source RPM with this patch + patch for 213312

Binary RPMs (*will* be removed when RHCS 4.5 becomes available):
http://people.redhat.com/lhh/rgmanager-1.9.53-1.218112hf.src.rpm
http://people.redhat.com/lhh/rgmanager-1.9.53-1.218112hf.x86_64.rpm
http://people.redhat.com/lhh/rgmanager-1.9.53-1.218112hf.i386.rpm

Fixes in CVS.

Ok, I am running with this now. Let me get some bake time on it before declaring this the fix.

Same fix(es), based on the 1.9.54 errata (exactly the same as .53, except it includes an NFS fix):
http://people.redhat.com/lhh/rgmanager-1.9.54-2.218112hf.src.rpm
http://people.redhat.com/lhh/rgmanager-1.9.54-2.218112hf.x86_64.rpm
http://people.redhat.com/lhh/rgmanager-1.9.54-2.218112hf.i386.rpm

*** Bug 230830 has been marked as a duplicate of this bug. ***

Alternatively, we will be calling it 'beta' pretty soon.

Can you specify what are "bad" values or increments to dlm_lkb in /proc/slabinfo?

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2007-0149.html
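
The question above about "bad" dlm_lkb values in /proc/slabinfo was never answered in this report, so no threshold can be quoted. One way to at least watch the trend is to sample the dlm_lkb active-object count at intervals and see whether it only ever grows. The sketch below is illustrative only: the function names are hypothetical, the "never shrinks and grows overall" test is an assumed heuristic rather than a documented limit, and it assumes the usual slabinfo 2.x layout in which the second column is the active object count.

```python
def active_objects(slabinfo_text, cache_name="dlm_lkb"):
    """Return the active object count for cache_name, or None if absent.

    Assumes slabinfo 2.x data lines of the form:
        name <active_objs> <num_objs> <objsize> ...
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 3 and fields[0] == cache_name:
            return int(fields[1])
    return None


def growth(samples):
    """Return the deltas between successive counts."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def looks_like_leak(samples):
    """Assumed heuristic, not a documented threshold:
    the count never drops between samples and is higher at the end."""
    deltas = growth(samples)
    return bool(deltas) and all(d >= 0 for d in deltas) and sum(deltas) > 0
```

To use it, read /proc/slabinfo on a node (this may require root) at a fixed interval, pass each snapshot to active_objects(), and hand the collected counts to looks_like_leak(). A count that rises and falls with service activity is probably normal; one that climbs steadily while the cluster is idle is the pattern worth reporting.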