Bug 606989 - cman expected vote does not drop when removing node from cluster.
Summary: cman expected vote does not drop when removing node from cluster.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: cluster
Version: 6.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Duplicates: 616381
Depends On:
Blocks: 599016
Reported: 2010-06-22 21:07 UTC by Dean Jansa
Modified: 2010-11-10 19:59 UTC (History)

Fixed In Version: cluster-3.0.12-15.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-10 19:59:27 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to fix (1.07 KB, patch), 2010-06-23 09:30 UTC, Christine Caulfield
Patch to recalculate quorum (308 bytes, patch), 2010-07-14 12:48 UTC, Christine Caulfield
Recalculate quorum on quorum device vote changes (1.28 KB, patch), 2010-07-21 13:20 UTC, Lon Hohberger

Description Dean Jansa 2010-06-22 21:07:57 UTC
Description of problem:

While following the instructions for removing a node from a cluster without a cluster reboot, I noticed that the expected votes (and quorum) do not drop as nodes are removed.


Version-Release number of selected component (if applicable):

cman-3.0.12-6.el6.x86_64
RHEL6.0-20100615.0-Server

How reproducible:
Every time


# Starting with a 5 node cluster:

[root@marathon-05 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    284   2010-06-22 15:52:43  marathon-01
   2   M    284   2010-06-22 15:52:43  marathon-02
   3   M    284   2010-06-22 15:52:43  marathon-03
   4   M    284   2010-06-22 15:52:43  marathon-04
   5   M    284   2010-06-22 15:52:43  marathon-05

[root@marathon-05 ~]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 20778
Cluster Member: Yes
Cluster Generation: 284
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 5
Node votes: 1
Quorum: 3  
Active subsystems: 7
Flags: 
Ports Bound: 0  
Node name: marathon-05
Node ID: 5
Multicast addresses: 239.192.81.123 
Node addresses: 10.15.89.75 


# Remove marathon-01 from cluster.conf

[root@marathon-05 ~]# vi /etc/cluster/cluster.conf 

# Distribute cluster.conf to remaining nodes

[root@marathon-05 ~]# for m in marathon-0{1,2,3,4}
> do
> qacp /etc/cluster/cluster.conf root@${m}:/etc/cluster/cluster.conf
> done
/etc/cluster/cluster.conf       -> marathon-01:/etc/cluster/cluster.conf
/etc/cluster/cluster.conf       -> marathon-02:/etc/cluster/cluster.conf
/etc/cluster/cluster.conf       -> marathon-03:/etc/cluster/cluster.conf
/etc/cluster/cluster.conf       -> marathon-04:/etc/cluster/cluster.conf



# Remove marathon-01 from cluster

[root@marathon-01 ~]# service cman stop
Stopping cluster: 
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Waiting for corosync to shutdown:                       [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]



# Have cman re-read the config file

[root@marathon-05 ~]# cman_tool version -r0


# All nodes now show:
[root@marathon-05 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   2   M    284   2010-06-22 15:52:43  marathon-02
   3   M    284   2010-06-22 15:52:43  marathon-03
   4   M    284   2010-06-22 15:52:43  marathon-04
   5   M    284   2010-06-22 15:52:43  marathon-05


But the expected votes have not dropped (nor has quorum):

[root@marathon-05 ~]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 20778
Cluster Member: Yes
Cluster Generation: 288
Membership state: Cluster-Member
Nodes: 4
Expected votes: 5
Total votes: 4
Node votes: 1
Quorum: 3
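
The symptom above can be checked mechanically. The following is a minimal sketch, not part of cman, that parses `cman_tool status` output into key/value pairs and flags the case where expected votes exceed total votes. The helper names `parse_status` and `votes_look_stale` are hypothetical, and the check is only meaningful once every node still listed in cluster.conf has joined, since a node that is merely down produces the same numbers.

```python
# Sample taken from the status output in this report (after the reload).
SAMPLE_STATUS = """\
Nodes: 4
Expected votes: 5
Total votes: 4
Node votes: 1
Quorum: 3
"""


def parse_status(text):
    """Parse 'Key: value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def votes_look_stale(fields):
    """Return True when expected votes exceed the votes actually present."""
    return int(fields["Expected votes"]) > int(fields["Total votes"])


fields = parse_status(SAMPLE_STATUS)
print(votes_look_stale(fields))
```

For the sample above, the expected votes (5) exceed the total votes (4), so the check reports the stale state described in this bug.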

Comment 2 Dean Jansa 2010-06-22 21:23:08 UTC
If I restart the cluster, expected votes drop as expected:

[root@marathon-05 ~]# service cman stop
Stopping cluster: 
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Waiting for corosync to shutdown:                       [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]
[root@marathon-05 ~]# service cman start
Starting cluster: 
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]
[root@marathon-05 ~]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 20778
Cluster Member: Yes
Cluster Generation: 312
Membership state: Cluster-Member
Nodes: 4
Expected votes: 4
Total votes: 4
Node votes: 1
Quorum: 3

Comment 3 Christine Caulfield 2010-06-23 09:30:42 UTC
Created attachment 426207
Patch to fix

You're right, there is some code missing from the reload routine. Here it is.

Comment 4 Christine Caulfield 2010-06-23 09:34:15 UTC
This patch is now in STABLE3 git:

commit e95deaf87607f483f4066e2cbc105ffa725ddd05
Author: Christine Caulfield <ccaulfie>
Date:   Wed Jun 23 10:28:33 2010 +0100

    cman: Recalculate expected_votes on a config reload.

Comment 9 Dean Jansa 2010-07-13 15:35:50 UTC
[root@marathon-01 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     32   2010-07-13 10:00:45  marathon-01
   2   M     32   2010-07-13 10:00:45  marathon-02
   3   M     36   2010-07-13 10:00:45  marathon-03
   5   M     40   2010-07-13 10:00:46  marathon-05

# Remove marathon-05 from cluster.conf
[root@marathon-01 ~]# vi /etc/cluster/cluster.conf 

# Distribute cluster.conf
[root@marathon-01 ~]# for m in marathon-0{2,3}
> do
> qacp /etc/cluster/cluster.conf root@${m}:/etc/cluster/cluster.conf
> done
/etc/cluster/cluster.conf       -> marathon-02:/etc/cluster/cluster.conf
/etc/cluster/cluster.conf       -> marathon-03:/etc/cluster/cluster.conf

# Remove marathon-05 from cluster
[root@marathon-05 ~]# service cman stop
Stopping cluster: 
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Waiting for corosync to shutdown:                       [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]

# Have cman re-read the config file
[root@marathon-01 ~]# cman_tool version -r0

*******  I had to run cman_tool version -r0 on ALL nodes, otherwise nodes which didn't run the cman_tool version -r0 did not update. *******

# Verify only 3 nodes
[root@marathon-01 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     32   2010-07-13 10:00:45  marathon-01
   2   M     32   2010-07-13 10:00:45  marathon-02
   3   M     36   2010-07-13 10:00:45  marathon-03

# Verify nodes, votes and quorum drop
[root@marathon-01 ~]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 20778
Cluster Member: Yes
Cluster Generation: 44
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Node votes: 1
Quorum: 3

Comment 11 Dean Jansa 2010-07-13 15:54:11 UTC
FailsQA --

Quorum count is not dropped, as shown in comment 9.  It should be 2.
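
The expected value of 2 follows from a simple-majority rule. The sketch below assumes quorum is computed as half the expected votes (rounded down) plus one; that formula is an assumption consistent with every status output in this report, not a quote from the cman source, and `majority_quorum` is a hypothetical name.

```python
def majority_quorum(expected_votes: int) -> int:
    """Smallest vote count that is strictly more than half of expected_votes."""
    return expected_votes // 2 + 1


# Cluster sizes that appear in this report, one vote per node.
for votes in (5, 4, 3):
    print(votes, majority_quorum(votes))
```

Under this rule, 5 and 4 expected votes both give a quorum of 3, which is why quorum only visibly changes when the cluster shrinks to 3 nodes.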

Comment 17 Christine Caulfield 2010-07-14 12:48:49 UTC
Created attachment 431755
Patch to recalculate quorum

This additional (untested) patch will tell cman to recalculate quorum when the configuration is reloaded.

Comment 18 Lon Hohberger 2010-07-14 15:54:41 UTC
I think the patch didn't work the way Dean expected.

With recalculate_quorum(1,0) (instead of 0,0 as in the patch), the patch seems to work fine.  The question is whether this is something we -want-, or whether it might somehow be dangerous for users (such that we want them to manually decrease expected votes).

Here is the result of removing 1 node from a 4 node cluster in the config with recalculate_quorum(1,0):

[root@marathon-01 ~]# cman_tool status
Version: 6.2.0
Config Version: 3
Cluster Name: marathon
Cluster Id: 20778
Cluster Member: Yes
Cluster Generation: 632
Membership state: Cluster-Member
Nodes: 3
Expected votes: 4
Total votes: 3
Node votes: 1
Quorum: 3  
Active subsystems: 7
Flags: 
Ports Bound: 0  
Node name: marathon-01
Node ID: 1
Multicast addresses: 239.192.81.123 
Node addresses: 10.15.89.71 
[root@marathon-01 ~]# cman_tool version -r0
[root@marathon-01 ~]# cman_tool status
Version: 6.2.0
Config Version: 4
Cluster Name: marathon
Cluster Id: 20778
Cluster Member: Yes
Cluster Generation: 632
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Node votes: 1
Quorum: 2  
Active subsystems: 7
Flags: 
Ports Bound: 0  
Node name: marathon-01
Node ID: 1
Multicast addresses: 239.192.81.123 
Node addresses: 10.15.89.71
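
The difference between the two calls can be illustrated with a toy model. This sketch assumes the first argument of recalculate_quorum acts as an "allow decrease" switch and that quorum is a simple majority of the larger of expected and total votes; neither assumption is stated in this thread, and `toy_recalculate` is a hypothetical stand-in, not the cman function.

```python
def toy_recalculate(current_quorum, expected_votes, total_votes, allow_decrease):
    """Toy quorum recalculation with an optional no-decrease clamp."""
    # Simple majority of whichever vote count is larger (assumed rule).
    new_quorum = max(expected_votes, total_votes) // 2 + 1
    if not allow_decrease:
        # Without permission to decrease, quorum can only stay or grow.
        new_quorum = max(current_quorum, new_quorum)
    return new_quorum


# 4 nodes -> 3 nodes: quorum was 3, expected and total votes are now 3.
print(toy_recalculate(3, 3, 3, allow_decrease=False))
print(toy_recalculate(3, 3, 3, allow_decrease=True))
```

In this model the first call keeps quorum at 3, matching the FailsQA result in comment 9, and the second lowers it to 2, matching the status output above.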

Comment 23 Fabio Massimo Di Nitto 2010-07-21 08:17:33 UTC
*** Bug 616381 has been marked as a duplicate of this bug. ***

Comment 25 Lon Hohberger 2010-07-21 13:20:35 UTC
Created attachment 433409
Recalculate quorum on quorum device vote changes

This patch allows cman to recalculate quorum when the quorum device votes change, if and only if the quorum device is currently a participating member.
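
The condition described here can be sketched as a toy model. Everything in it is hypothetical: the `ToyCluster` class, its field names, and the assumption that a participating quorum device adds its votes to expected votes are illustrative only and are not taken from the attached patch.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    node_votes: int    # sum of votes of the configured nodes
    qdev_votes: int    # votes configured for the quorum device
    qdev_member: bool  # is the quorum device currently participating?
    quorum: int = 0

    def expected_votes(self) -> int:
        # Assumption: the device's votes count only while it participates.
        return self.node_votes + (self.qdev_votes if self.qdev_member else 0)

    def recalc(self) -> None:
        # Assumed simple-majority rule.
        self.quorum = self.expected_votes() // 2 + 1

    def reload_qdev_votes(self, new_votes: int) -> bool:
        """Apply a new device vote count; recalculate only if it matters."""
        changed = new_votes != self.qdev_votes
        self.qdev_votes = new_votes
        if changed and self.qdev_member:
            self.recalc()
            return True
        return False


cluster = ToyCluster(node_votes=3, qdev_votes=1, qdev_member=True)
cluster.recalc()
print(cluster.quorum, cluster.reload_qdev_votes(3), cluster.quorum)
```

When the device is not participating, `reload_qdev_votes` stores the new value but leaves quorum untouched, which mirrors the "if and only if" wording of the comment.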

Comment 26 Fabio Massimo Di Nitto 2010-07-21 13:33:07 UTC
The new patch seems to do the job for me.

Comment 30 Steven Dake 2010-07-22 19:38:45 UTC
*** Bug 616095 has been marked as a duplicate of this bug. ***

Comment 33 Dean Jansa 2010-07-29 19:51:26 UTC
Verified RHEL6.0-20100728.2 tree.
pruner passes, 5 node -> 3 node and back.

Comment 34 releng-rhel@redhat.com 2010-11-10 19:59:27 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.

