Bug 132426 - ccsd memory leak
Status: CLOSED NEXTRELEASE
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gfs
Version: 4
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Jonathan Earl Brassow
QA Contact: GFS Bugs
Reported: 2004-09-13 05:58 EDT by Christine Caulfield
Modified: 2010-01-11 21:57 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-09-16 03:57:49 EDT
Description Christine Caulfield 2004-09-13 05:58:09 EDT
Description of problem:
Running Dean's join/leave script causes ccsd to allocate more and more
memory until it gets killed by the OOM killer.

while true 
do 
  sleep 1 
  cman_tool leave 
  sleep 1 
  cman_tool join 
done 

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run above script
2. Run top in another window and press M to watch ccsd climb up the list

Additional info:
Comment 1 Christine Caulfield 2004-09-13 06:18:29 EDT
One interesting additional piece of information: If I "cman_tool
leave" the cluster and leave ccsd running, it continues to allocate
memory (according to top). When I join the cluster again it settles
down again.
Comment 2 Jonathan Earl Brassow 2004-09-13 16:52:27 EDT
At least one memory leak is located in ccs/daemon/misc.c:get_cluster_name

The normal operation does not free the xml structures.
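A minimal sketch of the kind of fix this implies, assuming the function builds temporary query structures and returns a copy of the name. The names acquire(), release() and lookup_name() are hypothetical stand-ins, not the ccsd code and not libxml2 calls; in libxml2 the matching release functions would be calls such as xmlXPathFreeObject() and xmlXPathFreeContext(). The point is only that the success path runs through the same cleanup as the error paths.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the XML context and query result.
 * live_objects counts temporaries that have not been released yet. */
static int live_objects = 0;

static void *acquire(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_objects++;
    return p;
}

static void release(void *p)
{
    if (p) {
        live_objects--;
        free(p);
    }
}

/* Returns a heap copy of the name (caller frees it), or NULL on error.
 * Every exit, including the normal one, passes through "out". */
static char *lookup_name(const char *source)
{
    char *name = NULL;
    void *ctx = NULL;
    void *result = NULL;
    size_t len;

    ctx = acquire(64);            /* stands in for the query context */
    if (!ctx)
        goto out;

    result = acquire(32);         /* stands in for the query result */
    if (!result)
        goto out;

    len = strlen(source) + 1;
    name = malloc(len);           /* the only allocation handed to the caller */
    if (name)
        memcpy(name, source, len);

out:
    release(result);              /* safe on NULL */
    release(ctx);
    return name;
}
```

After any call to lookup_name(), live_objects should be back at zero, and the only block still allocated is the returned string, which the caller owns.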
Comment 3 Jonathan Earl Brassow 2004-09-13 17:55:27 EDT
Should be fixed; the above was all I found.
Comment 4 Christine Caulfield 2004-09-14 03:49:30 EDT
That helps.

It gets rid of the leak when the node is not a cluster member. But
when doing the loop test there is still a small leak coming from
somewhere.
Comment 5 Jonathan Earl Brassow 2004-09-14 11:08:11 EDT
Firstly, does the cluster remain quorate when you are doing "the loop test"?


Valgrind seems to indicate that this is in the ld library?

Note that many of these gripes are simply state that is held until exit, and 
are therefore not the memory leak.

==21570== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5122 from 1)
==21570== malloc/free: in use at exit: 83777 bytes in 2557 blocks.
==21570== malloc/free: 7430188 allocs, 7427631 frees, 497194386 bytes allocated.
==21570== For counts of detected errors, rerun with: -v
==21570== searching for pointers to 2557 not-freed blocks.
==21570== checked 2893752 bytes.
==21570== 
==21570== 8 bytes in 1 blocks are still reachable in loss record 1 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0xC544EF: strdup (in /lib/tls/libc-2.3.3.so)
==21570==    by 0x8052F1E: get_cluster_name (misc.c:136)
==21570==    by 0x804E328: process_connect (cnx_mgr.c:586)
==21570== 
==21570== 
==21570== 8 bytes in 1 blocks are still reachable in loss record 2 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x80501C0: process_request (cnx_mgr.c:1071)
==21570==    by 0x8049F0D: main (ccsd.c:195)
==21570== 
==21570== 
==21570== 8 bytes in 1 blocks are still reachable in loss record 3 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x8050565: process_broadcast (cnx_mgr.c:1211)
==21570==    by 0x804A04F: main (ccsd.c:201)
==21570== 
==21570== 
==21570== 12 bytes in 1 blocks are still reachable in loss record 4 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x625822: xmlHashCreate (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64C1EA: xmlXPathNewContext (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x8052D78: get_cluster_name (misc.c:109)
==21570== 
==21570== 
==21570== 20 bytes in 1 blocks are still reachable in loss record 5 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x804C606: broadcast_for_doc (cnx_mgr.c:235)
==21570==    by 0x804DEF0: process_connect (cnx_mgr.c:638)
==21570==    by 0x804FCA7: process_request (cnx_mgr.c:1088)
==21570== 
==21570== 
==21570== 20 bytes in 1 blocks are still reachable in loss record 6 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x804FC53: process_request (cnx_mgr.c:1055)
==21570==    by 0x8049F0D: main (ccsd.c:195)
==21570== 
==21570== 
==21570== 24 bytes in 1 blocks are still reachable in loss record 7 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x647245: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64C2ED: xmlXPathNewParserContext (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x657C39: xmlXPathEvalExpression (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 24 bytes in 2 blocks are still reachable in loss record 8 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x649978: xmlXPathNodeSetCreate (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64A422: xmlXPathNewNodeSet (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64E9A9: xmlXPathRoot (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 26 bytes in 1 blocks are still reachable in loss record 9 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x804A79E: parse_cli_args (ccsd.c:282)
==21570==    by 0x8049AAA: main (ccsd.c:55)
==21570== 
==21570== 
==21570== 26 bytes in 1 blocks are still reachable in loss record 10 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x804A777: parse_cli_args (ccsd.c:281)
==21570==    by 0x8049AAA: main (ccsd.c:55)
==21570== 
==21570== 
==21570== 32 bytes in 2 blocks are still reachable in loss record 11 of 36
==21570==    at 0x1B9033FD: calloc (vg_replace_malloc.c:176)
==21570==    by 0xD0C308: _dlerror_run (in /lib/libdl-2.3.3.so)
==21570==    by 0xD0BED0: dlsym (in /lib/libdl-2.3.3.so)
==21570==    by 0x1B91B61E: open64 (vg_libpthread.c:2331)
==21570== 
==21570== 
==21570== 40 bytes in 1 blocks are still reachable in loss record 12 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x6499A6: xmlXPathNodeSetCreate (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64A422: xmlXPathNewNodeSet (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64E9A9: xmlXPathRoot (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 40 bytes in 1 blocks are still reachable in loss record 13 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x64A3F8: xmlXPathNewNodeSet (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64E9A9: xmlXPathRoot (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x65643C: (within /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 40 bytes in 1 blocks are still reachable in loss record 14 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x65753E: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x657C43: xmlXPathEvalExpression (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x8052D93: get_cluster_name (misc.c:116)
==21570== 
==21570== 
==21570== 40 bytes in 1 blocks are still reachable in loss record 15 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x804E4A1: process_connect (cnx_mgr.c:532)
==21570==    by 0x804FCA7: process_request (cnx_mgr.c:1088)
==21570==    by 0x8049F0D: main (ccsd.c:195)
==21570== 
==21570== 
==21570== 40 bytes in 2 blocks are still reachable in loss record 16 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x1B94502C: clist_insert (clist.c:69)
==21570==    by 0x1B9497C0: msg_listen (message.c:552)
==21570==    by 0x8051B46: cluster_communicator (cluster_mgr.c:298)
==21570== 
==21570== 
==21570== 44 bytes in 1 blocks are still reachable in loss record 17 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x64C2C8: xmlXPathNewParserContext (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x657C39: xmlXPathEvalExpression (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x8052D93: get_cluster_name (misc.c:116)
==21570== 
==21570== 
==21570== 48 bytes in 2 blocks are still reachable in loss record 18 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x625FB1: xmlHashAddEntry3 (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x625C89: xmlHashAddEntry2 (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64AED4: xmlXPathRegisterFuncNS (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 48 bytes in 2 blocks are still reachable in loss record 19 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x66B392: xmlNewMutex (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x66A508: xmlInitGlobals (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x61B70F: xmlInitParser (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 68 bytes in 1 blocks are possibly lost in loss record 20 of 36
==21570==    at 0x1B9033FD: calloc (vg_replace_malloc.c:176)
==21570==    by 0x1B8F1E38: _dl_allocate_tls_storage (in /lib/ld-2.3.3.so)
==21570==    by 0x1B8F26A8: __GI__dl_allocate_tls (in /lib/ld-2.3.3.so)
==21570==    by 0x1B918550: pthread_create (vg_libpthread.c:1155)
==21570== 
==21570== 
==21570== 160 bytes in 8 blocks are still reachable in loss record 21 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x601C3B: xmlNewCharEncodingHandler (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x601D7D: xmlInitCharEncodingHandlers (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x61B71E: xmlInitParser (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 176 bytes in 2 blocks are still reachable in loss record 22 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x61E403: xmlNewDoc (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x69E7AB: xmlSAX2StartDocument (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x61736D: xmlParseDocument (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 196 bytes in 1 blocks are still reachable in loss record 23 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x64C195: xmlXPathNewContext (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x8052D78: get_cluster_name (misc.c:109)
==21570==    by 0x804CA0A: broadcast_for_doc (cnx_mgr.c:381)
==21570== 
==21570== 
==21570== 200 bytes in 1 blocks are still reachable in loss record 24 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x601CF5: xmlInitCharEncodingHandlers (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x61B71E: xmlInitParser (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x61AEEB: xmlSAXParseFileWithData (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 320 bytes in 1 blocks are still reachable in loss record 25 of 36
==21570==    at 0x1B9034EA: realloc (vg_replace_malloc.c:197)
==21570==    by 0x649E19: xmlXPathNodeSetAddUnique (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6549EC: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6565B5: (within /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 400 bytes in 1 blocks are still reachable in loss record 26 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x647270: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64C2ED: xmlXPathNewParserContext (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x657C39: xmlXPathEvalExpression (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 1232 bytes in 1 blocks are possibly lost in loss record 28 of 36
==21570==    at 0x1B9035B5: memalign (vg_replace_malloc.c:217)
==21570==    by 0x1B8F1DF1: _dl_allocate_tls_storage (in /lib/ld-2.3.3.so)
==21570==    by 0x1B8F26A8: __GI__dl_allocate_tls (in /lib/ld-2.3.3.so)
==21570==    by 0x1B918550: pthread_create (vg_libpthread.c:1155)
==21570== 
==21570== 
==21570== 1677 bytes in 1 blocks are still reachable in loss record 29 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x804C97A: broadcast_for_doc (cnx_mgr.c:365)
==21570==    by 0x804DEF0: process_connect (cnx_mgr.c:638)
==21570==    by 0x804FCA7: process_request (cnx_mgr.c:1088)
==21570== 
==21570== 
==21570== 2786 bytes in 427 blocks are still reachable in loss record 30 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x66D7E5: xmlStrndup (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x66D883: xmlStrdup (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x601C26: xmlNewCharEncodingHandler (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 4124 bytes in 1 blocks are possibly lost in loss record 31 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0xC700C7: opendir (in /lib/tls/libc-2.3.3.so)
==21570==    by 0x1B942708: clu_connect (global.c:63)
==21570==    by 0x8051BA3: cluster_communicator (cluster_mgr.c:310)
==21570== 
==21570== 
==21570== 4416 bytes in 92 blocks are still reachable in loss record 32 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x61F516: xmlNewNsProp (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6A0486: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6A072F: xmlSAX2StartElementNs (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 4440 bytes in 74 blocks are still reachable in loss record 33 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x61FB51: xmlNewNode (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x61FCA8: xmlNewDocNode (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6A097E: xmlSAX2StartElementNs (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 6144 bytes in 1 blocks are still reachable in loss record 34 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x625844: xmlHashCreate (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x64C1EA: xmlXPathNewContext (in /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x8052D78: get_cluster_name (misc.c:109)
==21570== 
==21570== 
==21570== 13140 bytes in 219 blocks are still reachable in loss record 35 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x69FE7E: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6A0034: (within /usr/lib/libxml2.so.2.6.8)
==21570==    by 0x6A072F: xmlSAX2StartElementNs (in /usr/lib/libxml2.so.2.6.8)
==21570== 
==21570== 
==21570== 43350 bytes in 1700 blocks are definitely lost in loss record 36 of 36
==21570==    at 0x1B902A80: malloc (vg_replace_malloc.c:131)
==21570==    by 0x1B8E86E3: _dl_map_object (in /lib/ld-2.3.3.so)
==21570==    by 0xCDA113: dl_open_worker (in /lib/tls/libc-2.3.3.so)
==21570==    by 0x1B8EFB45: _dl_catch_error (in /lib/ld-2.3.3.so)
==21570== 
==21570== LEAK SUMMARY:
==21570==    definitely lost: 43350 bytes in 1700 blocks.
==21570==    possibly lost:   5424 bytes in 3 blocks.
==21570==    still reachable: 34603 bytes in 852 blocks.
==21570==         suppressed: 400 bytes in 2 blocks.
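To illustrate the distinction drawn above between state held until exit and a real leak, here is a small sketch. The function names are hypothetical and nothing here is taken from ccsd. A block that a global pointer still refers to at exit is what valgrind labels "still reachable"; its size does not grow with the number of requests. A per-request block only becomes "definitely lost" if its last pointer is dropped without free().

```c
#include <assert.h>
#include <stdlib.h>

/* Global pointer: the block it refers to is still reachable at exit. */
static char *cached_state;

/* Allocates once and keeps the block for the life of the process.
 * Repeated calls do not allocate again, so memory use does not grow. */
static int init_cached_state(void)
{
    if (!cached_state)
        cached_state = malloc(26);
    return cached_state != NULL;
}

/* Allocates a fresh block per request and hands ownership to the caller.
 * If the caller drops the returned pointer without calling free(),
 * each call adds one block to the "definitely lost" total. */
static char *new_request_buffer(size_t n)
{
    char *buf = malloc(n);
    if (buf && n > 0)
        buf[0] = '\0';
    return buf;
}
```

In a long-running loop, only the second pattern can account for memory that keeps climbing, which is why the 1700 blocks in loss record 36 are the ones worth chasing.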
Comment 6 Christine Caulfield 2004-09-15 08:58:28 EDT
It's yer threads.

You either need to call pthread_join() on the thread when it exits, or
call pthread_detach() on it after creation so it can clean up after
itself.

I tried pthread_detach() and it gets rid of the leak for me, but I'll
let you decide which solution you prefer.
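Both options can be sketched as follows. This is a minimal illustration and not the ccsd code: worker(), run_detached() and run_joined() are hypothetical names, and the condition variable exists only so the example can tell when the detached threads have finished.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static int done_count;

/* Hypothetical worker: records that it ran, then exits. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    done_count++;
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Option 1: detach each thread after creation, so its resources are
 * reclaimed automatically when it exits. Returns the number started. */
static int run_detached(int n)
{
    int started = 0;

    pthread_mutex_lock(&lock);
    done_count = 0;
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < n; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0)
            break;
        pthread_detach(t);
        started++;
    }

    /* Wait until every started worker has reported in. */
    pthread_mutex_lock(&lock);
    while (done_count < started)
        pthread_cond_wait(&done_cv, &lock);
    pthread_mutex_unlock(&lock);
    return started;
}

/* Option 2: keep the thread joinable and call pthread_join(), which
 * releases its resources once it has exited. Returns the number joined. */
static int run_joined(int n)
{
    int joined = 0;

    for (int i = 0; i < n; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0)
            break;
        if (pthread_join(t, NULL) == 0)
            joined++;
    }
    return joined;
}
```

A thread that is neither joined nor detached keeps its per-thread storage after it exits, which would be consistent with the pthread_create entries in the valgrind output above. Detaching suits fire-and-forget workers; joining suits cases where the creator needs the result or must know the thread has finished.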
Comment 7 Jonathan Earl Brassow 2004-09-15 13:29:12 EDT
I added the pthread_detach()

I still see memory increasing when running under valgrind, but not otherwise. 
Maybe it is something in valgrind?

Anyway, I don't see the problem anymore.
Comment 8 Christine Caulfield 2004-09-16 03:57:49 EDT
Looks fine to me.
Comment 9 Kiersten (Kerri) Anderson 2004-11-16 14:10:29 EST
Updating version to the right level in the defects.  Sorry for the storm.
