Bug 1029129
| Summary: | pcs cluster standby nodename doesn't work | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Tuomo Soini <tis> |
| Component: | pcs | Assignee: | Chris Feist <cfeist> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.4 | CC: | cluster-maint, djansa, fdinitto, jherrman, jruemker, mathieu.peltier, nyewale, redhat-bugzilla, robert.scheck, rsteiger, sbradley, xrobau |
| Target Milestone: | rc | Keywords: | Reopened, ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | pcs-0.9.101-1.el6 | Doc Type: | Bug Fix |
| Doc Text: | Prior to this update, the pcs utility was using an incorrect location to search for cluster node names, and the "pcs cluster standby" command therefore could not find the specified cluster node. As a consequence, it was not possible to put cluster nodes in standby mode. With this update, pcs properly searches for node names in the /etc/cluster/cluster.conf file, and putting cluster nodes in standby mode works correctly. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-10-14 07:21:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1032159, 1032161 | | |
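The Doc Text above says the fix makes pcs look for node names in /etc/cluster/cluster.conf. For orientation, here is a minimal sketch of that lookup; this is an illustration only, not the actual pcs code, and the function name and lack of error handling are ours. CMAN's cluster.conf is XML with `<clusternode name="..."/>` entries under `<clusternodes>`:

```python
# Illustration only -- not the actual pcs implementation.
# CMAN's /etc/cluster/cluster.conf is XML; node names live in
# <cluster><clusternodes><clusternode name="..."/>...</clusternodes></cluster>.
import xml.etree.ElementTree as ET

def get_nodes_from_cluster_conf(path="/etc/cluster/cluster.conf"):
    """Return the node names declared in a CMAN cluster.conf."""
    root = ET.parse(path).getroot()  # root element is <cluster>
    return [
        cn.get("name")
        for cn in root.findall("clusternodes/clusternode")
        if cn.get("name")
    ]
```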
Description Tuomo Soini 2013-11-11 17:46:41 UTC

Version: 0.9.90, so the latest update.

We are experiencing exactly the same issue. We configured our setup as described at http://floriancrouzat.net/2013/04/rhel-6-4-pacemaker-1-1-8-adding-cman-support-and-getting-rid-of-the-plugin/. Since updating from pcs-0.9.26-10.el6_4.1 to pcs-0.9.90-1.0.1.el6, "pcs cluster standby $(uname -n)" fails as stated above. An strace(1) shows this for me:

```
open("/etc/corosync/corosync.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
```

Why the heck does it need corosync.conf(5) again? Isn't ccs(1) doing the job anymore?

Cross-filed case 00979407 on the Red Hat customer portal.

Definitely a bug; it should be looking for the current pacemaker nodes. Fixed upstream with this patch: https://github.com/feist/pcs/commit/8b888080c37ddea88b92dfd95aadd78b9db68b55

As a workaround, you can run 'crm_standby -v on -N <nodename>' to put a node in standby, or 'crm_standby -D -N <nodename>' to take it out of standby.

Confirmed. Backporting the patch to a running system does the job for us:
```diff
--- /usr/lib/python2.6/site-packages/pcs/cluster.py.orig
+++ /usr/lib/python2.6/site-packages/pcs/cluster.py
@@ -360,7 +360,7 @@
         usage.cluster(["unstandby"])
         sys.exit(1)
 
-    nodes = utils.getNodesFromCorosyncConf()
+    nodes = utils.getNodesFromPacemaker()
 
     if "--all" not in utils.pcs_options:
         nodeFound = False
```
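For anyone backporting by hand: the replacement function asks the running cluster for its node list instead of reading a config file. The sketch below only approximates that idea and is not the real pcs utils.getNodesFromPacemaker() implementation; it assumes cibadmin (from pacemaker-cli) is available and queries the `<nodes>` section of the CIB:

```python
# Approximation only -- not the real utils.getNodesFromPacemaker().
# Queries the live CIB for configured nodes via cibadmin (pacemaker-cli).
import subprocess
import xml.etree.ElementTree as ET

def get_nodes_from_pacemaker():
    """Return node names (uname attributes) from the cluster's CIB."""
    proc = subprocess.Popen(
        ["cibadmin", "--query", "--scope", "nodes"],
        stdout=subprocess.PIPE,
    )
    xml_out, _ = proc.communicate()
    # cibadmin prints e.g. <nodes><node id="1" uname="ask-02"/>...</nodes>
    root = ET.fromstring(xml_out)
    return [n.get("uname") for n in root.findall("node") if n.get("uname")]
```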
Any chance to get this in before RHEL 6.6 GA, as GSS says? AFAIK this is now going to be supported with RHEL 6.5 GA, and all the documentation out there talks about "pcs cluster (un)standby <nodename>". Breaking this is IMHO not good.
There are lots of other places where getNodesFromCorosyncConf is incorrectly referenced. As that file does not exist on RHEL clusters, all of these are incorrect:

```
[root@localhost pcs]# grep getNodesFromCorosyncConf *
cluster.py: sync_nodes(utils.getNodesFromCorosyncConf(),utils.getCorosyncConf())
cluster.py: auth_nodes(utils.getNodesFromCorosyncConf())
cluster.py: nodes = utils.getNodesFromCorosyncConf()
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: nodes = utils.getNodesFromCorosyncConf()
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: for my_node in utils.getNodesFromCorosyncConf():
cluster.py: for my_node in utils.getNodesFromCorosyncConf():
cluster.py: for node in utils.getNodesFromCorosyncConf():
status.py: corosync_nodes = utils.getNodesFromCorosyncConf()
status.py: all_nodes = utils.getNodesFromCorosyncConf()
utils.py:def getNodesFromCorosyncConf():
utils.py: for c_node in getNodesFromCorosyncConf():
utils.py: for c_node in getNodesFromCorosyncConf():
utils.py: c_nodes = getNodesFromCorosyncConf()
```

I also get an error message concerning corosync.conf when running "pcs cluster status":

```
# pcs cluster status
...
PCSD Status:
Error: no nodes found in corosync.conf
```

Before Fix:

```
[root@ask-02 test]# rpm -q pcs
pcs-0.9.90-2.el6.noarch
[root@ask-02 test]# pcs cluster standby ask-02
Error: node 'ask-02' does not appear to exist in configuration
[root@ask-02 test]# pcs config | grep Node
Corosync Nodes:
Pacemaker Nodes:
Node: ask-02
Node: ask-03
```

After Fix:

```
[root@ask-02 test]# rpm -q pcs
pcs-0.9.101-1.el6.noarch
[root@ask-02 test]# pcs status | grep ask-02
Online: [ ask-02 ask-03 ]
[root@ask-02 test]# pcs cluster standby ask-02
[root@ask-02 test]# pcs status | grep ask-02
Node ask-02: standby
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1526.html
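A closing note on the grep list above: the underlying hazard is that node lookup was hardwired to one stack's config file. A stack-aware dispatcher would avoid this whole class of bug. This is a rough sketch with our own hypothetical helper names, not pcs code; get_nodes_from_cluster_conf and get_nodes_from_pacemaker refer to the sketches earlier in this report, and get_nodes_from_corosync_conf is an assumed helper that would parse corosync.conf:

```python
# Hypothetical dispatcher, not pcs code: choose the node-name source by
# which cluster stack's configuration actually exists on this host.
import os

def get_cluster_node_names():
    if os.path.exists("/etc/cluster/cluster.conf"):
        # RHEL 6 CMAN stack: corosync.conf does not exist here.
        return get_nodes_from_cluster_conf()
    if os.path.exists("/etc/corosync/corosync.conf"):
        # corosync/pacemaker stack.
        return get_nodes_from_corosync_conf()  # assumed helper
    # No static config found: fall back to asking the live CIB.
    return get_nodes_from_pacemaker()
```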