Description of problem:
While running vgsplit test cases on my 28-node cluster, clogd segfaulted on one node, causing the test case to fail.

SCENARIO - [split_lv_from_vg_with_mirror]
Split out a lv from vg with additional mirror on south-08
free pvs for south-08: /dev/sdb1 /dev/sdb2 /dev/sdc1 /dev/sdc2 /dev/sdc3
create a linear and a mirror in the same vg (different pvs)
  Error locking on node south-15: LV seven/mirror_mlog in use: not deactivating
  Aborting. Failed to deactivate mirror log. Manual intervention required.
  Failed to create mirror log.
couldn't create logical volume

On south-15 I found clogd was no longer running, and this message in the logs:

kernel: clogd[4975]: segfault at 0000000a00000009 rip 0000000000408489 rsp 00007fff0d3704c0 error 4

I reconfigured the system to capture core dumps and I'm running the test case again.

Version-Release number of selected component (if applicable):
cmirror-1.1.33-1.el5
kmod-cmirror-0.1.20-1.el5

How reproducible:
Unknown.
I've been able to reproduce this several times and I was able to capture a core dump.

Core was generated by `clogd'.
Program terminated with signal 11, Segmentation fault.
[New process 5410]
#0  0x0000000000408489 in export_checkpoint (cp=0x200000001f) at cluster.c:403
403             len = snprintf((char *)(name.value), SA_MAX_NAME_LENGTH,

(gdb) list
398             int len;
399             char buf[32];
400
401             LOG_DBG("Sending checkpointed data to %u", cp->requester);
402
403             len = snprintf((char *)(name.value), SA_MAX_NAME_LENGTH,
404                            "bitmaps_%s_%u", SHORT_UUID(cp->uuid), cp->requester);
405             name.length = len;
406
407             len = strlen(cp->recovering_region) + 1;

(gdb) info locals
attr = {creationFlags = 1, checkpointSize = 77, retentionDuration = 9223372036854775807, maxSections = 4, maxSectionSize = 28, maxSectionIdSize = 22}
h = 9213452461992312832
section_id = {idLen = 17, id = 0x7fff2b626ff0 "recovering_region"}
section_attr = {sectionId = 0x7fff2b627130, expirationTime = 9223372036854775807}
flags = 7
name = {length = 18, value = "bitmaps_cSNnJgFk_1\0003313008", '\0' <repeats 12 times>, "\230èb", '\0' <repeats 21 times>, "\001\000\000\000\000\000\000\000àob+ÿ\177\000\000\233Î`n6\000\000\000\210èb\000\000\000\000\000\000ý\000\000\000\000\000\0009\200ë\001\000\000\000\000\001\000\000\000\000\000\000\000¤\201", '\0' <repeats 22 times>, "¿\r\000\000\000\000\000\000\000\020\000\000\000\000\000\000\020\000\000\000\000\000\000\000ê¥\034I", '\0' <repeats 12 times>, "\230©\030I", '\0' <repeats 12 times>, "\230©\030I", '\0' <repeats 37 times>, "ý\000\000\000\000\000\0009\200ë\001\000\000\000\000`q"}
rv = SA_AIS_OK
tfr = (struct clog_tfr *) 0x1b9d5f70
len = 0
buf = "recovering_region\000b+ÿ\177\000\000Į̈n6\000\000"

(gdb) info args
cp = (struct checkpoint_data *) 0x200000001f

(gdb) bt
#0  0x0000000000408489 in export_checkpoint (cp=0x200000001f) at cluster.c:403
#1  0x000000000040e950 in do_checkpoints (entry=0x1b9d9520) at cluster.c:731
#2  0x000000000041176c in do_cluster_work (data=0x0) at cluster.c:858
#3  0x0000000000422ca0 in links_issue_callbacks () at link_mon.c:134
#4  0x0000000000401f9c in main (argc=1, argv=0x7fff2b627748) at clogd.c:55
Interesting, it appears that the 'cp' pointer is the problem:

(gdb) p cp
$1 = (struct checkpoint_data *) 0x200000001f
(gdb) p *cp
Cannot access memory at address 0x200000001f

It looks like that pointer has been corrupted (freed?).
Created attachment 323936 [details]
Valgrind suppression file.

If it is easy to reproduce, please consider running clogd under valgrind. (You can use this suppression file to eliminate the many OpenAIS errors.)
Oh, kick ass!  Nice catch.

'cp' comes from the 'checkpoint_list'.  The very first entry (the list head itself) is corrupted.  So, in do_checkpoints:

	for (cp = entry->checkpoint_list; cp;)

will assign a corrupted value to cp...

The checkpoint_list pointer is being corrupted because it is overwritten from above.  It lives in the structure 'clog_cpg', which is:

struct clog_cpg {
	<snip>
	int checkpoints_needed;
	uint32_t checkpoint_requesters[10];
	struct checkpoint_data *checkpoint_list;
};

checkpoint_requesters is only 10 entries large - probably an initially chosen value that was never properly abstracted.  So, if you have more than 10 nodes in your cluster and they are all entering at once, you can write to 'checkpoint_requesters[10+]' and write into the address space of 'checkpoint_list'.

The fix is very simple.  Without it, 11+ node clusters could not be supported by cluster mirrors.
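For the record, here is a minimal standalone sketch of why the overrun lands exactly on the pointer. The struct is trimmed to the fields quoted above (this is not the full clogd source), and the offset comments assume a typical little-endian LP64 ABI such as x86_64 Linux, which matches the 64-bit addresses in the backtrace:

```c
/* Sketch, not clogd source: where checkpoint_requesters[10+] lands. */
#include <stddef.h>
#include <stdint.h>

struct checkpoint_data;   /* opaque here; defined elsewhere in clogd */

struct clog_cpg {         /* trimmed to the relevant fields */
    int checkpoints_needed;
    uint32_t checkpoint_requesters[10];    /* valid indices: 0..9 */
    struct checkpoint_data *checkpoint_list;
};

/* First byte past the legal array, i.e. where requesters[10] starts
 * (44 on LP64: 4 for the int + 10 * 4 for the array). */
static size_t requesters_end(void)
{
    return offsetof(struct clog_cpg, checkpoint_requesters)
         + 10 * sizeof(uint32_t);
}

/* Start of the pointer the overrun corrupts (48 on LP64, after
 * 4 bytes of alignment padding). */
static size_t list_start(void)
{
    return offsetof(struct clog_cpg, checkpoint_list);
}

/* On LP64, requesters[10] falls into the padding and requesters[11]
 * overwrites the low 32 bits of checkpoint_list; one or two stray
 * node-id writes can thus produce a garbage pointer of the shape
 * seen in the core dump (0x200000001f). */
```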
Fix checked into RHEL5 (waiting for flags before checking into rhel5.3).

commit 1198fd8a342ad252273585aadc72f03f16e706d9
Author: Jonathan Brassow <jbrassow>
Date:   Tue Nov 18 12:20:44 2008 -0600

    clogd: Fix for bug 471448 - clogd segfault on clusters > 10 nodes
commit 4873624ba58cb2a922b4ae05112c99369e4cb48d
Author: Jonathan Brassow <jbrassow>
Date:   Tue Nov 18 12:20:44 2008 -0600

    clogd: Fix for bug 471448 - clogd segfault on clusters > 10 nodes

    clogd is segfaulting due to a corrupted checkpoint_list pointer.

    The checkpoint_list pointer is being corrupted because it is
    overwritten from above.  It lives in the structure 'clog_cpg',
    which is:

    struct clog_cpg {
            <snip>
            int checkpoints_needed;
            uint32_t checkpoint_requesters[10];
            struct checkpoint_data *checkpoint_list;
    };

    checkpoint_requesters is only 10 large - probably an initial chosen
    value that was never properly abstracted.  So, if you have more than
    11 nodes in your cluster, and they are all entering at once, you can
    write to 'checkpoint_requesters[10+]' and write into the address
    space of 'checkpoint_list'.
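The patch body itself is not quoted in this report; purely as an illustration of the defensive shape such a fix could take, here is a sketch that sizes the array by an explicit bound and refuses out-of-range writes. MAX_CHECKPOINT_REQUESTERS and add_requester() are hypothetical names for this sketch (not clogd's), and it assumes checkpoints_needed counts pending entries, as the field name suggests:

```c
/* Illustrative sketch only -- not the actual clogd patch.
 * MAX_CHECKPOINT_REQUESTERS and add_requester() are invented names. */
#include <stddef.h>
#include <stdint.h>

#define MAX_CHECKPOINT_REQUESTERS 32   /* must cover the largest supported cluster */

struct checkpoint_data;

struct clog_cpg {
    int checkpoints_needed;
    uint32_t checkpoint_requesters[MAX_CHECKPOINT_REQUESTERS];
    struct checkpoint_data *checkpoint_list;
};

/* Record a checkpoint requester, failing loudly at the array bound
 * instead of silently clobbering checkpoint_list. */
static int add_requester(struct clog_cpg *entry, uint32_t nodeid)
{
    if (entry->checkpoints_needed < 0 ||
        entry->checkpoints_needed >= MAX_CHECKPOINT_REQUESTERS)
        return -1;   /* caller must surface the overflow */

    entry->checkpoint_requesters[entry->checkpoints_needed++] = nodeid;
    return 0;
}
```

The key design point is the same either way the real fix was written: the array bound has to be tied to (or checked against) the maximum cluster size, rather than being a hard-coded 10 with unchecked writes.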
An advisory has been issued which should address the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0158.html