| Summary: | when existing ring file is zero bytes, corosync aborts | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Fabio Massimo Di Nitto <fdinitto> |
| Component: | corosync | Assignee: | Steven Dake <sdake> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.1 | CC: | cluster-maint, djansa, jfriesse, jkortus, jwest |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | 6.1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | corosync-1.2.3-27.el5 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 675206 | Environment: | |
| Last Closed: | 2011-05-19 14:24:18 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | | | |
| Bug Blocks: | 675206, 696734 | | |
Description
Fabio Massimo Di Nitto
2011-02-04 09:58:50 UTC
Some more information. I found a bunch of files in /var/lib/corosync, including a ring_$somedata. After removing that file (it was 0 bytes), corosync starts again. The 100% CPU spinning is a different problem that I am investigating now.

Very easy to reproduce too:

1. Start corosync.
2. ls -als /var/lib/corosync/ring* (take a note of the file name)
3. Stop corosync.
4. rm -rf /var/lib/corosync/*
5. touch /var/lib/corosync/ringid_ (same file name as noted above)
6. chmod 700 /var/lib/corosync/ringid_ (corosync creates the file with mode 700)
7. chown root:root .... (the file should now match the one noted above, but with size 0 instead of 4/8)
8. corosync -f

Result:

corosync: totemsrp.c:3106: memb_ring_id_create_or_load: Assertion `res == sizeof (unsigned long long)' failed.
Aborted

This happens independent of the architecture.

Suggested fix is to always unlink the file at startup time and recreate it as needed, instead of relying on existing ones.

One more side note: I have no idea _how_ I got a 0-length file there, but it was there.

Created attachment 480217
upstream submitted patch to resolve this issue
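
The suggested fix above (do not trust an existing ring file that has the wrong size, and recreate it instead of asserting) can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the attached upstream patch and not corosync's actual code: the function name `ring_seq_load_or_create` and its parameters are hypothetical, and the only details taken from this report are the expected size (`sizeof (unsigned long long)`) and the 700 file mode.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a corosync function): load a ring sequence
 * number from `path`. If the file is missing, empty, or any size other
 * than sizeof(unsigned long long), it is unlinked and recreated with
 * `initial_seq` rather than triggering an assertion.
 * Returns 0 on success, -1 on failure.
 */
int ring_seq_load_or_create(const char *path,
                            unsigned long long initial_seq,
                            unsigned long long *seq_out)
{
    unsigned long long seq = 0;
    struct stat st;
    ssize_t res;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd >= 0) {
        /* Accept the stored value only if the file has exactly the expected size. */
        if (fstat(fd, &st) == 0 && st.st_size == (off_t)sizeof(seq)) {
            res = read(fd, &seq, sizeof(seq));
            if (res == (ssize_t)sizeof(seq)) {
                close(fd);
                *seq_out = seq;
                return 0;
            }
        }
        close(fd);
    } else if (errno != ENOENT) {
        return -1;
    }

    /* The file is absent or unusable: remove it and start fresh. */
    if (unlink(path) != 0 && errno != ENOENT) {
        return -1;
    }
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0700);
    if (fd < 0) {
        return -1;
    }
    res = write(fd, &initial_seq, sizeof(initial_seq));
    if (res != (ssize_t)sizeof(initial_seq)) {
        /* Do not leave a truncated file behind for the next start. */
        close(fd);
        unlink(path);
        return -1;
    }
    if (close(fd) != 0) {
        return -1;
    }
    *seq_out = initial_seq;
    return 0;
}
```

With this shape, a zero-byte file such as the one produced by the reproducer, or an oversized file, is replaced by a fresh one holding `initial_seq`, while a well-formed file is read back unchanged.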
Verified with corosync-1.2.3-28.el6.x86_64. corosync starts correctly when the old file is present, when a new zero-byte file is present, or when a large file (50M) is there instead.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0764.html