Bug 505258

Summary: cman_tool leave remove does not reduce quorum
Product: Red Hat Enterprise Linux 5
Reporter: Christine Caulfield <ccaulfie>
Component: cman
Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Priority: urgent
Version: 5.3
CC: cfeist, cluster-maint, edamato, grimme, sghosh, slords, syeghiay
Target Milestone: rc
Keywords: ZStream
Hardware: All
OS: Linux
Fixed In Version: cman-2.0.106-1.el5
Doc Type: Bug Fix
Clones: 515446
Last Closed: 2009-09-02 11:06:30 UTC
Bug Blocks: 506768, 515446

Description Christine Caulfield 2009-06-11 08:51:04 UTC
Description of problem:

If a node is removed from the cluster using the "cman_tool leave remove" command, the quorum is not recalculated to keep the cluster quorate.

Version-Release number of selected component (if applicable):
5.3

How reproducible:
Easily

Steps to Reproduce:
1. Start up a cluster. I used 3 nodes
2. Check expected votes is 3 ('cman_tool status')
3. Remove one node with 'cman_tool leave remove'
  
Actual results:

'cman_tool status' shows that expected_votes and quorum have not been reduced and the cluster is now inquorate.

Expected results:
Quorum to be reduced to 2 and the cluster remains quorate


Additional info:
There seems to be a missing conditional in cman/commands.c, so quorum is recalculated the 'normal' way AFTER it has been reduced for the removed node, which would undo the reduction.

Comment 1 Christine Caulfield 2009-06-11 10:05:42 UTC
Checked in for 5.4:

commit 935a60f838d37c848405d7df17404c3adad78392
Author: Christine Caulfield <ccaulfie>
Date:   Tue Jan 20 14:14:26 2009 +0000

    cman: send fewer messages for each state transition.

Comment 5 Nate Straz 2009-07-24 17:50:04 UTC
I'm not clear how this is supposed to work.  

> Expected results:
> Quorum to be reduced to 2 and the cluster remains quorate

In a three node cluster, quorum should have been 2 to start with.  Did you mean expected votes here?

I'm trying this out with a four node cluster where one node has been removed as directed.


[root@z3 ~]# cman_tool status
...
Nodes: 3
Expected votes: 4
Total votes: 3
Quorum: 3

With three nodes left in the cluster, should quorum have dropped to 2?
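
For anyone checking this by script rather than by eye, here is a minimal sketch that reads the four fields quoted above and tests whether the votes present reach the quorum threshold. The helper name parse_status is hypothetical, the sample text is just the excerpt from this comment, and the parser only handles lines whose value is a plain integer; real 'cman_tool status' output has more lines than shown here.

```python
def parse_status(text):
    """Return the integer-valued 'Key: value' fields from status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip lines without a colon or with a non-numeric value.
        if sep and value.isdigit():
            fields[key.strip()] = int(value)
    return fields


# Excerpt copied from the comment above (not full cman_tool output).
SAMPLE = """\
Nodes: 3
Expected votes: 4
Total votes: 3
Quorum: 3
"""

status = parse_status(SAMPLE)
# Quorate as long as the votes present reach the quorum threshold.
quorate = status["Total votes"] >= status["Quorum"]
print(status, quorate)
```

With the values above, total votes (3) equals quorum (3), so the check reports the cluster as quorate.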

Comment 6 Christine Caulfield 2009-07-28 07:02:37 UTC
No.

Cman only adjusts expected votes and quorum when it has to, to maintain quorate state. So going from 4 to 3 nodes doesn't change expected votes or quorum because losing a node in a 4-node cluster wouldn't cause the cluster to lose quorum in the first place.

If you take another node out of that cluster (which without 'leave remove' would leave it inquorate) you should see:

Nodes: 2
Expected votes: 2
Total votes: 2
Quorum: 2
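
The behaviour described in this comment can be modelled in a few lines. This is only a sketch under stated assumptions, not the code in cman/commands.c: it assumes quorum is a simple majority of expected votes (which matches the status outputs quoted in this report), that every node has one vote, and that expected votes drop to the remaining total only when the cluster would otherwise lose quorum. The function names are hypothetical.

```python
def quorum_for(expected_votes):
    # Assumption: quorum is a simple majority of expected votes.
    return expected_votes // 2 + 1


def leave_remove(expected_votes, total_votes, leaving_votes=1):
    """Model one 'cman_tool leave remove'.

    Returns (expected_votes, total_votes, quorum) after the node has left.
    """
    total_votes -= leaving_votes
    if total_votes < quorum_for(expected_votes):
        # Adjust only when the cluster would otherwise become inquorate.
        expected_votes = total_votes
    return expected_votes, total_votes, quorum_for(expected_votes)


# Four-node cluster, one vote per node.
first = leave_remove(4, 4)
print(first)   # (4, 3, 3): still quorate, so nothing is adjusted

second = leave_remove(first[0], first[1])
print(second)  # (2, 2, 2): expected votes and quorum reduced
```

The two printed tuples correspond to the status output in comment 5 (expected votes 4, total votes 3, quorum 3) and the output predicted above (expected votes 2, total votes 2, quorum 2).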

Comment 7 Christine Caulfield 2009-07-28 07:50:12 UTC
While I was testing this I spotted that 'cman_tool leave remove' still doesn't work if there are no services running (e.g. fenced). So if you are testing using something like

# cman_tool join
# cman_tool leave remove

Then you will still see it fail.

There is a patch in STABLE3 to fix this. I'll commit it for 5.5

Comment 11 errata-xmlrpc 2009-09-02 11:06:30 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1341.html