Description of problem:
When one node accidentally has a newer cluster.conf version while the other nodes are suspended, the cluster repeats these steps:

Feb 16 12:53:10 bar-04 corosync[1376]: [CMAN ] Activity suspended on this node
Feb 16 12:53:10 bar-04 corosync[1376]: [CMAN ] Error reloading the configuration, will retry every second
Feb 16 12:53:11 bar-04 corosync[1376]: [CMAN ] Unable to load new config in corosync: New configuration version has to be newer than current running configuration
Feb 16 12:53:11 bar-04 corosync[1376]: [CMAN ] Can't get updated config version 18: New configuration version has to be newer than current running configuration

But corosync seems to be leaking some memory here, and after some time it ends with an OOM. watch "ps axu | grep corosync" shows slowly increasing memory use.

Version-Release number of selected component (if applicable):
cman-3.0.12-23.el6_0.4.x86_64
corosync-1.2.3-21.el6.x86_64
I'll take this one for now. It might be a cman problem.
(and this happens when ricci is not running - cluster misconfiguration)
(In reply to comment #3)
> (and this happens when ricci is not running - cluster misconfiguration)

The problem is slightly different, though. Your cluster had an inconsistent cluster.conf around: one node had version X and the others version Y. ricci not running, plus a series of manual overrides (otherwise it's simply not possible to get there), led to that inconsistent setup. The nodes with the older config will keep trying to load the new config in a loop in an attempt to recover from that situation. This reload loop seems to be leaking memory, but we will need to investigate whether that is really the issue and which component is at fault (there are 3-4 involved in that process).
Created attachment 479959 [details]
reproducer

The attached patch disables all cluster (cman/cman-preconfig/ccs_xml plugin) reload code and adds a heavy loop around the object_reload_config call. First startup is still functional; this way we basically exclude any cluster-related code from the configuration reload path and isolate the issue in corosync.

How to reproduce quickly:
- 2 rhel6 nodes
- patch cluster (with the attached patch), build, install
- edit cluster.conf on node1 to be version="1"
- on node2, version="2"
- start monitoring corosync memory usage
- cman_tool -D join on node1 (config version 1)
- wait a bit
- cman_tool -D join on node2 (config version 2)

corosync on node1 will loop almost immediately on objdb_reload_config and increase memory usage heavily within a few seconds (i.e. be ready for a killall -9 corosync).
Created attachment 480106 [details]
Proposed patch

The main problem seems to be that the old code allocates X items in a list, but deletion frees only X-1 items (everything except tmplist).
Also please note that the patch has a side effect: without it, the trigger totem_objdb_reload_notify was never called; with the patch, it is correctly called.
I just cross-checked this patch and it does indeed fix at least one of the memory leaks in the reload path. When used in conjunction with a good cman, we still experience a memory leak, but I am in the process of identifying where we leak.
On Angus's suggestion, I am running the corosync memory leak test from cts. The code here is corosync from rhel6 plus Honzaf's patch above, with no cman or any cluster component loaded at all. Single-node test. Note that as long as we don't eliminate this leak, I cannot easily verify possible leaks in cman/cluster on this same code path.

[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
1168
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
144
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
104
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
104
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
104
[root@rhel6-node2 ~]# sh mem_leak_test.sh -1
100
Created attachment 480464 [details]
Proposed patch to remove the leak of handles
With this second patch I can see that mem_leak_test goes down to 0 after 2 iterations. With cman we are still leaking 12 bytes per reload (down from 16). Note that the entire code path cman uses up to the failure involves only objdb calls and one xml call. The xml code has already been tested separately and shows no leaks.
More news: we found the remaining issue in config_xml.lcrso. The corosync bits are all good as far as I can tell. Cloning the bz for cluster/cman.
Patches committed upstream as:
41aeecc4eff296252a1ffc06f8c581ec90b9076d
894ece6a141c2d24a332a7375696615e38ca5375
Verified with corosync-1.2.3-28.el6.x86_64.

2-node cluster: start up the cluster (no ricci), increase the config version on one node and restart it; the 2nd node begins the loop. Without the patch, the memory footprint (RSS) increases by about 1M per minute. With the patch, there was no increase over about 20 minutes of looping.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-0764.html