Bug 682770

Summary: RFE: support automatic ring recovery
Product: Red Hat Enterprise Linux 6
Reporter: Florian Haas <florian>
Component: corosync
Assignee: Steven Dake <sdake>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.3
CC: cluster-maint, fdinitto
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-03-07 14:52:20 UTC
Type: ---

Description Florian Haas 2011-03-07 14:44:26 UTC
Description of problem:
Corosync presently does not support automatic recovery of failed rings. As soon as a ring goes to the FAULTY state, administrative intervention (with corosync-cfgtool -r) is required to restore the link.
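
Until automatic recovery exists, the manual step described above could be wrapped in a small watchdog. The following is a minimal sketch, not part of Corosync: the helper names `has_faulty_ring` and `recover_rings` are hypothetical, and it assumes only that the output of corosync-cfgtool -s contains the word FAULTY for a failed ring (the exact wording of the status line is not relied upon).

```python
import subprocess


def has_faulty_ring(status_text):
    """Return True if any line of `corosync-cfgtool -s` output mentions FAULTY.

    Assumption: a failed ring is reported with the upper-case word FAULTY;
    nothing else about the output format is assumed.
    """
    return any("FAULTY" in line for line in status_text.splitlines())


def recover_rings(run=subprocess.run):
    """Re-enable rings with `corosync-cfgtool -r` if any ring is FAULTY.

    `run` is injectable so the logic can be exercised without a cluster.
    Returns True if a reset was issued, False otherwise.
    """
    status = run(["corosync-cfgtool", "-s"],
                 capture_output=True, text=True, check=True)
    if not has_faulty_ring(status.stdout):
        return False
    # This is the same administrative intervention described above.
    run(["corosync-cfgtool", "-r"], check=True)
    return True
```

Calling `recover_rings()` periodically (for example from cron) would approximate the requested behaviour, with the caveat that it resets rings blindly and cannot tell whether the underlying link has actually been repaired.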

Version-Release number of selected component (if applicable):
1.2.3

How reproducible:
100%

Steps to Reproduce:
1. Break connectivity on a Corosync link (remove network cable, insert an iptables rule dropping corosync packets, etc.)
2. After about 15 seconds, run corosync-cfgtool -s. Observe that the link status is now FAULTY.
3. Restore the link (replug network cable, remove iptables rule, etc.)
  
Actual results:
Link remains FAULTY.

Expected results:
Link goes back to a healthy state.
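
The reproduction steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the interface name `eth1` and UDP port `5405` are placeholders for whatever the affected ring actually uses, the function name `reproduce` is hypothetical, and the script only checks whether the word FAULTY appears in the corosync-cfgtool -s output. It needs root privileges to modify iptables.

```python
import subprocess
import time


def drop_rule(interface, port):
    """Build an iptables rule spec dropping inbound UDP traffic on one link.

    Assumption: the ring under test receives corosync traffic on this
    interface and UDP port; both values are placeholders to adjust.
    """
    return ["INPUT", "-i", interface, "-p", "udp",
            "--dport", str(port), "-j", "DROP"]


def ring_status(run):
    """Return the raw text printed by `corosync-cfgtool -s`."""
    result = run(["corosync-cfgtool", "-s"],
                 capture_output=True, text=True, check=True)
    return result.stdout


def reproduce(interface="eth1", port=5405, wait_seconds=15,
              run=subprocess.run, sleep=time.sleep):
    """Run steps 1-3 and return (faulty_while_broken, faulty_after_restore)."""
    rule = drop_rule(interface, port)
    run(["iptables", "-I"] + rule, check=True)      # step 1: break the link
    try:
        sleep(wait_seconds)                         # step 2: wait, then check
        faulty_while_broken = "FAULTY" in ring_status(run)
    finally:
        run(["iptables", "-D"] + rule, check=True)  # step 3: restore the link
    sleep(wait_seconds)
    faulty_after_restore = "FAULTY" in ring_status(run)
    return faulty_while_broken, faulty_after_restore
```

A result of `(True, True)` corresponds to the actual results reported here; `(True, False)` would correspond to the expected results.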

Comment 1 Florian Haas 2011-03-07 14:45:22 UTC
Email thread on the openais mailing list: http://marc.info/?l=openais&m=129928521205527&w=2

Comment 2 Steven Dake 2011-03-07 14:52:20 UTC

*** This bug has been marked as a duplicate of bug 504022 ***