Bug 1121769 - Need support for cluster options, RRP & quorum options in pcs for RHEL6
Summary: Need support for cluster options, RRP & quorum options in pcs for RHEL6
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: pcs
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-07-21 20:19 UTC by Chris Feist
Modified: 2015-07-22 06:15 UTC
CC List: 5 users

Fixed In Version: pcs-0.9.138-1.el6
Doc Type: Bug Fix
Doc Text:
* This update adds support for configuring the Redundant Ring Protocol (RRP) and for setting corosync options, so the user can now set up a cluster with RRP and custom corosync settings. (BZ#1121769)
Clone Of:
Environment:
Last Closed: 2015-07-22 06:15:33 UTC


Attachments
proposed fix (64.81 KB, patch)
2014-12-10 14:16 UTC, Tomas Jelinek
no flags


Links
System ID: Red Hat Product Errata RHBA-2015:1446
Priority: normal
Status: SHIPPED_LIVE
Summary: pcs bug fix and enhancement update
Last Updated: 2015-07-20 18:43:57 UTC

Description Chris Feist 2014-07-21 20:19:09 UTC
Need the ability to configure a cluster with options similar to RHEL 7.

These options need to work in RHEL6:
--token/--join/--consensus/--miss_count_const/--fail_recv_const

Need the ability to configure RRP (or a warning message if it is not available).

Need quorum options configuration:
--wait_for_all/--auto_tie_breaker/--last_man_standing/--last_man_standing_window
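
For illustration, a setup command along these lines (hypothetical cluster and node names; the exact invocations used for verification appear in comment 3) would combine the requested totem and RRP options:

  pcs cluster setup --start --name exampleCluster node1 node2 node3 \
      --transport udp --addr0 192.168.1.0 --addr1 192.168.2.0 \
      --token 1001 --join 51 --consensus 1201 \
      --miss_count_const 6 --fail_recv_const 2501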

Comment 1 Christine Caulfield 2014-07-22 08:00:40 UTC
There's no point in adding the new quorum options to RHEL6, as they don't exist in cman, which RHEL6 clustering still uses.

Comment 2 Tomas Jelinek 2014-12-10 14:16:17 UTC
Created attachment 966830 [details]
proposed fix

Options not supported by cman:
--wait_for_all --auto_tie_breaker --last_man_standing --last_man_standing_window --ipv6 --token_coefficient
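
As exercised in comment 3, the fix supports configuring RRP either by giving ring networks with --addr0/--addr1 or by listing each node together with its ring 1 address; a minimal sketch of the latter form, with hypothetical names and addresses:

  pcs cluster setup --name exampleCluster node1,10.0.1.1 node2,10.0.1.2 node3,10.0.1.3 --transport udpu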

Comment 3 Tomas Jelinek 2015-01-27 14:00:29 UTC
Before Fix:
[root@rh66-node1 ~]# rpm -q pcs
pcs-0.9.123-9.el6.x86_64

[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1 rh66-node2 rh66-node3 --transport udp --addr0 192.168.122.0 --addr1 192.168.123.0 --token 1001 --join 51 --consensus 1201 --miss_count_const 6 --fail_recv_const 2501
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
rh66-node2: Starting Cluster...
rh66-node3: Starting Cluster...
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="11" name="myCluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="rh66-node1" nodeid="1">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node2" nodeid="2">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node3" nodeid="3">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
[root@rh66-node1:~]# corosync-objctl totem
totem.transport=udp
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.token=10000
totem.join=60
totem.fail_recv_const=2500
totem.consensus=12000
totem.rrp_mode=none
totem.secauth=1
totem.key=myCluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=192.168.122.61
totem.interface.mcastaddr=239.192.113.118
totem.interface.mcastport=5405
[root@rh66-node1:~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.122.61) 
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.122.62) 
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.3.ip=r(0) ip(192.168.122.63) 
runtime.totem.pg.mrp.srp.members.3.join_count=1
runtime.totem.pg.mrp.srp.members.3.status=joined
[root@rh66-node1:~]# pcs cluster destroy --all
rh66-node1: Successfully destroyed cluster
rh66-node2: Successfully destroyed cluster
rh66-node3: Successfully destroyed cluster


[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1,192.168.123.61 rh66-node2,192.168.123.62 rh66-node3,192.168.123.63 --transport udpu
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... Cannot find node name in cluster.conf
Unable to get the configuration
Cannot find node name in cluster.conf
cman_tool: corosync daemon didn't start Check cluster logs for details
[FAILED]
Stopping cluster: 
   Leaving fence domain... [  OK  ]
   Stopping gfs_controld... [  OK  ]
   Stopping dlm_controld... [  OK  ]
   Stopping fenced... [  OK  ]
   Stopping cman... [  OK  ]
   Unloading kernel modules... [  OK  ]
   Unmounting configfs... [  OK  ]
rh66-node2: Starting Cluster...
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... Cannot find node name in cluster.conf
Unable to get the configuration
Cannot find node name in cluster.conf
cman_tool: corosync daemon didn't start Check cluster logs for details
[FAILED]
Stopping cluster: 
   Leaving fence domain... [  OK  ]
   Stopping gfs_controld... [  OK  ]
   Stopping dlm_controld... [  OK  ]
   Stopping fenced... [  OK  ]
   Stopping cman... [  OK  ]
   Unloading kernel modules... [  OK  ]
   Unmounting configfs... [  OK  ]
rh66-node3: Starting Cluster...
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... Cannot find node name in cluster.conf
Unable to get the configuration
Cannot find node name in cluster.conf
cman_tool: corosync daemon didn't start Check cluster logs for details
[FAILED]
Stopping cluster: 
   Leaving fence domain... [  OK  ]
   Stopping gfs_controld... [  OK  ]
   Stopping dlm_controld... [  OK  ]
   Stopping fenced... [  OK  ]
   Stopping cman... [  OK  ]
   Unloading kernel modules... [  OK  ]
   Unmounting configfs... [  OK  ]
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="11" name="myCluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="rh66-node1,192.168.123.61" nodeid="1">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node1,192.168.123.61"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node2,192.168.123.62" nodeid="2">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node2,192.168.123.62"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node3,192.168.123.63" nodeid="3">
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node3,192.168.123.63"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>



After Fix:
[root@rh66-node1:~]# rpm -q pcs
pcs-0.9.138-1.el6.x86_64

[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1 rh66-node2 rh66-node3 --transport udp --addr0 192.168.122.0 --addr1 192.168.123.0 --token 1001 --join 51 --consensus 1201 --miss_count_const 6 --fail_recv_const 2501
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
rh66-node3: Starting Cluster...
rh66-node2: Starting Cluster...
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="18" name="myCluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="rh66-node1" nodeid="1">
      <altname name="192.168.123.0"/>
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node2" nodeid="2">
      <altname name="192.168.123.0"/>
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node3" nodeid="3">
      <altname name="192.168.123.0"/>
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman broadcast="no" transport="udp">
    <multicast addr="239.255.1.1"/>
    <altmulticast addr="239.255.2.1"/>
  </cman>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
  <totem consensus="1201" fail_recv_const="2501" join="51" miss_count_const="6" rrp_mode="passive" token="1001"/>
</cluster>
[root@rh66-node1:~]# corosync-objctl totem
totem.consensus=1201
totem.fail_recv_const=2501
totem.join=51
totem.miss_count_const=6
totem.rrp_mode=passive
totem.token=1001
totem.transport=udp
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.rrp_problem_count_threshold=3
totem.secauth=1
totem.key=myCluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=192.168.122.61
totem.interface.mcastaddr=239.255.1.1
totem.interface.mcastport=5405
totem.interface.ringnumber=1
totem.interface.bindnetaddr=192.168.123.0
totem.interface.mcastaddr=239.255.2.1
totem.interface.mcastport=5405
[root@rh66-node1:~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.122.61) r(1) ip(192.168.123.61) 
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.122.62) r(1) ip(192.168.123.62) 
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.3.ip=r(0) ip(192.168.122.63) r(1) ip(192.168.123.63) 
runtime.totem.pg.mrp.srp.members.3.join_count=1
runtime.totem.pg.mrp.srp.members.3.status=joined


[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1,192.168.123.61 rh66-node2,192.168.123.62 rh66-node3,192.168.123.63 --transport udpu
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
rh66-node3: Starting Cluster...
rh66-node2: Starting Cluster...
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="16" name="myCluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="rh66-node1" nodeid="1">
      <altname name="192.168.123.61"/>
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node2" nodeid="2">
      <altname name="192.168.123.62"/>
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh66-node3" nodeid="3">
      <altname name="192.168.123.63"/>
      <fence>
        <method name="pcmk-method">
          <device name="pcmk-redirect" port="rh66-node3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman broadcast="no" transport="udpu"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
  <totem rrp_mode="passive"/>
</cluster>
[root@rh66-node1:~]# corosync-objctl totem
totem.rrp_mode=passive
totem.transport=udpu
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.token=10000
totem.join=60
totem.fail_recv_const=2500
totem.consensus=12000
totem.rrp_problem_count_threshold=3
totem.secauth=1
totem.key=myCluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=192.168.122.61
totem.interface.mcastaddr=239.192.113.118
totem.interface.mcastport=5405
totem.interface.member.memberaddr=rh66-node1
totem.interface.member.memberaddr=rh66-node2
totem.interface.member.memberaddr=rh66-node3
totem.interface.ringnumber=1
totem.interface.bindnetaddr=192.168.123.61
totem.interface.mcastaddr=239.192.113.119
totem.interface.mcastport=5405
totem.interface.member.memberaddr=192.168.123.61
totem.interface.member.memberaddr=192.168.123.62
totem.interface.member.memberaddr=192.168.123.63
[root@rh66-node1:~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.122.61) r(1) ip(192.168.123.61) 
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.122.62) r(1) ip(192.168.123.62) 
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.3.ip=r(0) ip(192.168.122.63) r(1) ip(192.168.123.63) 
runtime.totem.pg.mrp.srp.members.3.join_count=1
runtime.totem.pg.mrp.srp.members.3.status=joined

Comment 8 errata-xmlrpc 2015-07-22 06:15:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1446.html

