Description of problem:

on x86_64:

[root@rhel6-node2 corosync]# corosync -f
corosync: totemsrp.c:3073: memb_ring_id_create_or_load: Assertion `res == sizeof (unsigned long long)' failed.
Aborted

On i386 it starts, but any access results in 100% CPU spinning.

Version-Release number of selected component (if applicable):
corosync-1.2.3-21.el6

How reproducible:
As shown above.

Additional info:
Nodes 1 and 2 are latest RHEL6.1. Enabling or disabling SELinux makes no difference (SELinux is currently disabled in the above assertion). iptables is off. The issue is triggered either via corosync standalone startup or via cman.
Some more information. I found a bunch of files in /var/lib/corosync, including a ring_$somedata. After removing that file (it was 0 bytes), corosync starts again. The 100% cpu spinning is a different problem that I am investigating now.
Very easy to reproduce too:

start corosync
ls -als /var/lib/corosync/ring* (take a note of the file name)
stop corosync
rm -rf /var/lib/corosync/*
touch /var/lib/corosync/ringid_ (as above file name)
chmod 700 /var/lib/corosync/ringid_ (the file is created with mode 700 by corosync)
chown root:root ....

Now it should match the same file as above, but with size 0 instead of 4/8.

corosync -f
corosync: totemsrp.c:3106: memb_ring_id_create_or_load: Assertion `res == sizeof (unsigned long long)' failed.
Aborted

This is independent of the architecture.

Suggested fix is to always unlink the file at startup time and recreate it as needed, instead of relying on an existing one.
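To illustrate the idea of not trusting an existing ring ID file, here is a minimal sketch in C. It is not the corosync source and not the upstream patch. The function names ring_seq_store and ring_seq_load_or_create are hypothetical, and treating any file whose size is not exactly sizeof (unsigned long long) as invalid is an assumption made for this example. Instead of asserting on a short read, the sketch discards the bad file and recreates it with a sequence number of 0.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not corosync code): replace the file at `path`
 * with a fresh one that holds exactly one unsigned long long. */
int ring_seq_store(const char *path, unsigned long long seq)
{
    int fd;
    ssize_t res;

    assert(path != NULL);

    /* Remove whatever is there (empty, truncated or oversized). */
    if (unlink(path) == -1 && errno != ENOENT) {
        return -1;
    }
    /* Mode 0700 mirrors the mode mentioned in the report. */
    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0700);
    if (fd == -1) {
        return -1;
    }
    res = write(fd, &seq, sizeof(seq));
    if (close(fd) == -1 || res != (ssize_t)sizeof(seq)) {
        return -1;
    }
    return 0;
}

/* Hypothetical helper (not corosync code): load the stored value, or
 * recreate the file with 0 if it is missing or has the wrong size.
 * Returns 0 on success, -1 on an I/O error. Never aborts on bad data. */
int ring_seq_load_or_create(const char *path, unsigned long long *seq)
{
    struct stat st;
    ssize_t res = -1;
    int fd;

    assert(path != NULL && seq != NULL);

    fd = open(path, O_RDONLY);
    if (fd != -1) {
        /* Assumption: only a file of exactly the expected size is valid. */
        if (fstat(fd, &st) == 0 && st.st_size == (off_t)sizeof(*seq)) {
            res = read(fd, seq, sizeof(*seq));
        }
        close(fd);
        if (res == (ssize_t)sizeof(*seq)) {
            return 0;
        }
    } else if (errno != ENOENT) {
        return -1;
    }

    /* Missing or invalid file: start over instead of asserting. */
    *seq = 0;
    return ring_seq_store(path, 0);
}
```

With this approach, the 0-byte file from the steps above would be replaced on the next start rather than triggering an abort. Whether resetting the value to 0 is acceptable for the membership protocol is a separate question that this sketch does not address.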
One more side note: I have no idea _how_ I got a 0-length file there, but it was there.
Created attachment 480217 [details] upstream submitted patch to resolve this issue
Verified with corosync-1.2.3-28.el6.x86_64. corosync starts correctly when the old file is there, when a new zero-length file is there, or when a big file (50M) is there instead.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-0764.html