Bug 1121769
| Summary: | Need support for cluster options, RRP & quorum options in pcs for RHEL6 |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | pcs |
| Version: | 6.7 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | medium |
| Target Milestone: | rc |
| Reporter: | Chris Feist <cfeist> |
| Assignee: | Tomas Jelinek <tojeline> |
| QA Contact: | cluster-qe <cluster-qe> |
| CC: | ccaulfie, cluster-maint, malcolm.j.cowe, rsteiger, tojeline |
| Fixed In Version: | pcs-0.9.138-1.el6 |
| Doc Type: | Bug Fix |
| Doc Text: | This update adds support for configuring the Redundant Ring Protocol (RRP) and setting Corosync options. The user can now configure a cluster with RRP and set up corosync options. (BZ#1121769) |
| Type: | Bug |
| Last Closed: | 2015-07-22 06:15:33 UTC |
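For context on the Doc Text above: the options covered by this fix are passed on the pcs cluster setup command line. A minimal annotated sketch, using only the flags, hostnames and networks that appear in the verification transcripts below:

# Two-ring (RRP) cluster over multicast: ring 0 uses the nodes' primary addresses
# on 192.168.122.x, ring 1 runs on the 192.168.123.0 network (--addr1);
# the totem timers are the values used in the verification below
pcs cluster setup --start --name myCluster rh66-node1 rh66-node2 rh66-node3 \
    --transport udp --addr0 192.168.122.0 --addr1 192.168.123.0 \
    --token 1001 --join 51 --consensus 1201

# Two-ring cluster over unicast (udpu): each node is given as name,ring1-address
pcs cluster setup --start --name myCluster \
    rh66-node1,192.168.123.61 rh66-node2,192.168.123.62 rh66-node3,192.168.123.63 \
    --transport udpu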
Description Chris Feist 2014-07-21 20:19:09 UTC

There's no point in adding the new quorum options to RHEL6 as they don't exist in cman - which RHEL6 clustering still uses.

Created attachment 966830 [details]
proposed fix
Options not supported by cman:
--wait_for_all --auto_tie_breaker --last_man_standing --last_man_standing_window --ipv6 --token_coefficient
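For background only (not from this report): these flags correspond to corosync 2.x votequorum and totem settings, which is why the cman/corosync 1.x stack on RHEL 6 has nothing to map them to. A rough sketch of where they would live in a corosync 2.x corosync.conf:

totem {
    version: 2
    ip_version: ipv6                  # what --ipv6 selects
    token_coefficient: 650            # per-node addition to the token timeout, in ms
}
quorum {
    provider: corosync_votequorum
    wait_for_all: 1                   # --wait_for_all
    auto_tie_breaker: 1               # --auto_tie_breaker
    last_man_standing: 1              # --last_man_standing
    last_man_standing_window: 10000   # --last_man_standing_window, in ms
}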
Before Fix:
[root@rh66-node1 ~]# rpm -q pcs
pcs-0.9.123-9.el6.x86_64
[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1 rh66-node2 rh66-node3 --transport udp --addr0 192.168.122.0 --addr1 192.168.123.0 --token 1001 --join 51 --consensus 1201 --miss_count_const 6 --fail_recv_const 2501
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
rh66-node2: Starting Cluster...
rh66-node3: Starting Cluster...
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="11" name="myCluster">
<fence_daemon/>
<clusternodes>
<clusternode name="rh66-node1" nodeid="1">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node1"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node2" nodeid="2">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node2"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node3" nodeid="3">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node3"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
[root@rh66-node1:~]# corosync-objctl totem
totem.transport=udp
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.token=10000
totem.join=60
totem.fail_recv_const=2500
totem.consensus=12000
totem.rrp_mode=none
totem.secauth=1
totem.key=myCluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=192.168.122.61
totem.interface.mcastaddr=239.192.113.118
totem.interface.mcastport=5405
[root@rh66-node1:~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.122.61)
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.122.62)
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.3.ip=r(0) ip(192.168.122.63)
runtime.totem.pg.mrp.srp.members.3.join_count=1
runtime.totem.pg.mrp.srp.members.3.status=joined
[root@rh66-node1:~]# pcs cluster destroy --all
rh66-node1: Successfully destroyed cluster
rh66-node2: Successfully destroyed cluster
rh66-node3: Successfully destroyed cluster
[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1,192.168.123.61 rh66-node2,192.168.123.62 rh66-node3,192.168.123.63 --transport udpu
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... Cannot find node name in cluster.conf
Unable to get the configuration
Cannot find node name in cluster.conf
cman_tool: corosync daemon didn't start Check cluster logs for details
[FAILED]
Stopping cluster:
Leaving fence domain... [ OK ]
Stopping gfs_controld... [ OK ]
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]
rh66-node2: Starting Cluster...
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... Cannot find node name in cluster.conf
Unable to get the configuration
Cannot find node name in cluster.conf
cman_tool: corosync daemon didn't start Check cluster logs for details
[FAILED]
Stopping cluster:
Leaving fence domain... [ OK ]
Stopping gfs_controld... [ OK ]
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]
rh66-node3: Starting Cluster...
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... Cannot find node name in cluster.conf
Unable to get the configuration
Cannot find node name in cluster.conf
cman_tool: corosync daemon didn't start Check cluster logs for details
[FAILED]
Stopping cluster:
Leaving fence domain... [ OK ]
Stopping gfs_controld... [ OK ]
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="11" name="myCluster">
<fence_daemon/>
<clusternodes>
<clusternode name="rh66-node1,192.168.123.61" nodeid="1">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node1,192.168.123.61"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node2,192.168.123.62" nodeid="2">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node2,192.168.123.62"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node3,192.168.123.63" nodeid="3">
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node3,192.168.123.63"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
After Fix:
[root@rh66-node1:~]# rpm -q pcs
pcs-0.9.138-1.el6.x86_64
[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1 rh66-node2 rh66-node3 --transport udp --addr0 192.168.122.0 --addr1 192.168.123.0 --token 1001 --join 51 --consensus 1201 --miss_count_const 6 --fail_recv_const 2501
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
rh66-node3: Starting Cluster...
rh66-node2: Starting Cluster...
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="18" name="myCluster">
<fence_daemon/>
<clusternodes>
<clusternode name="rh66-node1" nodeid="1">
<altname name="192.168.123.0"/>
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node1"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node2" nodeid="2">
<altname name="192.168.123.0"/>
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node2"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node3" nodeid="3">
<altname name="192.168.123.0"/>
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node3"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman broadcast="no" transport="udp">
<multicast addr="239.255.1.1"/>
<altmulticast addr="239.255.2.1"/>
</cman>
<fencedevices>
<fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
<totem consensus="1201" fail_recv_const="2501" join="51" miss_count_const="6" rrp_mode="passive" token="1001"/>
</cluster>
[root@rh66-node1:~]# corosync-objctl totem
totem.consensus=1201
totem.fail_recv_const=2501
totem.join=51
totem.miss_count_const=6
totem.rrp_mode=passive
totem.token=1001
totem.transport=udp
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.rrp_problem_count_threshold=3
totem.secauth=1
totem.key=myCluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=192.168.122.61
totem.interface.mcastaddr=239.255.1.1
totem.interface.mcastport=5405
totem.interface.ringnumber=1
totem.interface.bindnetaddr=192.168.123.0
totem.interface.mcastaddr=239.255.2.1
totem.interface.mcastport=5405
[root@rh66-node1:~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.122.61) r(1) ip(192.168.123.61)
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.122.62) r(1) ip(192.168.123.62)
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.3.ip=r(0) ip(192.168.122.63) r(1) ip(192.168.123.63)
runtime.totem.pg.mrp.srp.members.3.join_count=1
runtime.totem.pg.mrp.srp.members.3.status=joined
[root@rh66-node1:~]# pcs cluster setup --start --name myCluster rh66-node1,192.168.123.61 rh66-node2,192.168.123.62 rh66-node3,192.168.123.63 --transport udpu
rh66-node1: Updated cluster.conf...
rh66-node2: Updated cluster.conf...
rh66-node3: Updated cluster.conf...
Starting cluster on nodes: rh66-node1, rh66-node2, rh66-node3...
rh66-node1: Starting Cluster...
rh66-node3: Starting Cluster...
rh66-node2: Starting Cluster...
[root@rh66-node1:~]# cat /etc/cluster/cluster.conf
<cluster config_version="16" name="myCluster">
<fence_daemon/>
<clusternodes>
<clusternode name="rh66-node1" nodeid="1">
<altname name="192.168.123.61"/>
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node1"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node2" nodeid="2">
<altname name="192.168.123.62"/>
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node2"/>
</method>
</fence>
</clusternode>
<clusternode name="rh66-node3" nodeid="3">
<altname name="192.168.123.63"/>
<fence>
<method name="pcmk-method">
<device name="pcmk-redirect" port="rh66-node3"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman broadcast="no" transport="udpu"/>
<fencedevices>
<fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
<totem rrp_mode="passive"/>
</cluster>
[root@rh66-node1:~]# corosync-objctl totem
totem.rrp_mode=passive
totem.transport=udpu
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.token=10000
totem.join=60
totem.fail_recv_const=2500
totem.consensus=12000
totem.rrp_problem_count_threshold=3
totem.secauth=1
totem.key=myCluster
totem.interface.ringnumber=0
totem.interface.bindnetaddr=192.168.122.61
totem.interface.mcastaddr=239.192.113.118
totem.interface.mcastport=5405
totem.interface.member.memberaddr=rh66-node1
totem.interface.member.memberaddr=rh66-node2
totem.interface.member.memberaddr=rh66-node3
totem.interface.ringnumber=1
totem.interface.bindnetaddr=192.168.123.61
totem.interface.mcastaddr=239.192.113.119
totem.interface.mcastport=5405
totem.interface.member.memberaddr=192.168.123.61
totem.interface.member.memberaddr=192.168.123.62
totem.interface.member.memberaddr=192.168.123.63
[root@rh66-node1:~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.122.61) r(1) ip(192.168.123.61)
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.122.62) r(1) ip(192.168.123.62)
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.totem.pg.mrp.srp.members.3.ip=r(0) ip(192.168.122.63) r(1) ip(192.168.123.63)
runtime.totem.pg.mrp.srp.members.3.join_count=1
runtime.totem.pg.mrp.srp.members.3.status=joined
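Not part of the original verification, but two standard RHEL 6 commands can be used to confirm that both rings stay healthy after an RRP setup (output varies by cluster, so it is omitted here):

# Print the status of each configured totem ring on the local node
corosync-cfgtool -s
# Summarize cman membership, node addresses and quorum state
cman_tool status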
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1446.html