Bug 735287

Summary: corosync crashes sometimes on takeover
Product: Red Hat Enterprise Linux 6 Reporter: Robert <spam>
Component: corosync Assignee: Steven Dake <sdake>
Status: CLOSED DUPLICATE QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.1 CC: cluster-maint, spam
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-09-02 17:04:22 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
output of corosync-fplay none
one of the crash dumps (matches the corosync-fplay) none

Description Robert 2011-09-02 07:32:51 UTC
Created attachment 521159
output of corosync-fplay

Description of problem:

I'm testing a cluster setup with RHEL 6.1 and have a problem that stops me from deploying the cluster with RHEL in production. Sometimes, when I reboot a node with active services, corosync crashes on the node that should do the takeover. It can work for several reboots, and then suddenly I run into the problem. The 2 nodes are running as VMware instances on different hosts, across 2 data centers (> 10 km distance).

Version-Release number of selected component (if applicable):

corosync-1.2.3-36.el6.x86_64
corosynclib-1.2.3-36.el6.x86_64


How reproducible:

Reboot a node with active services many times; the crash happens only sometimes.

Actual results:

corosync crashes and pacemaker uses 100% CPU

Expected results:

The cluster should switch over correctly.

Additional info:

see attachments

Comment 2 Robert 2011-09-02 08:03:37 UTC
Created attachment 521165
one of the crash dumps (matches the corosync-fplay)

Comment 3 Steven Dake 2011-09-02 17:04:22 UTC

*** This bug has been marked as a duplicate of bug 671575 ***

Comment 4 Steven Dake 2011-09-02 17:06:53 UTC
Robert,

Generally, I believe your deployment would need an architecture review before being processed by sales. Also, bugs should come in from support, not directly from customers.

Best wishes
-steve