Bug 1029129
| Summary: | pcs cluster standby nodename doesn't work | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Tuomo Soini <tis> |
| Component: | pcs | Assignee: | Chris Feist <cfeist> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.4 | CC: | cluster-maint, djansa, fdinitto, jherrman, jruemker, mathieu.peltier, nyewale, redhat-bugzilla, robert.scheck, rsteiger, sbradley, xrobau |
| Target Milestone: | rc | Keywords: | Reopened, ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | pcs-0.9.101-1.el6 | Doc Type: | Bug Fix |
| Doc Text: | Prior to this update, the pcs utility was using an incorrect location to search for cluster node names, and the "pcs cluster standby" command therefore could not find the specified cluster node. As a consequence, it was not possible to put cluster nodes in standby mode. With this update, pcs properly searches for node names in the /etc/cluster/cluster.conf file, and putting cluster nodes in standby mode works correctly. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-10-14 07:21:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1032159, 1032161 | | |
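The Doc Text above says the fix makes pcs look for node names in /etc/cluster/cluster.conf. For orientation, here is a minimal sketch of that lookup; this is an illustration only, not the actual pcs code, and the function name and lack of error handling are ours. CMAN's cluster.conf is XML with `<clusternode name="..."/>` entries under `<clusternodes>`:

```python
# Illustration only -- not the actual pcs implementation.
# CMAN's /etc/cluster/cluster.conf is XML; node names live in
# <cluster><clusternodes><clusternode name="..."/>...</clusternodes></cluster>.
import xml.etree.ElementTree as ET

def get_nodes_from_cluster_conf(path="/etc/cluster/cluster.conf"):
    """Return the node names declared in a CMAN cluster.conf."""
    root = ET.parse(path).getroot()  # root element is <cluster>
    return [
        cn.get("name")
        for cn in root.findall("clusternodes/clusternode")
        if cn.get("name")
    ]
```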
Description Tuomo Soini 2013-11-11 17:46:41 UTC

Version: 0.9.90, so the latest update.

We are experiencing exactly the same issue. We configured our setup as described at http://floriancrouzat.net/2013/04/rhel-6-4-pacemaker-1-1-8-adding-cman-support-and-getting-rid-of-the-plugin/. Since updating from pcs-0.9.26-10.el6_4.1 to pcs-0.9.90-1.0.1.el6, "pcs cluster standby $(uname -n)" fails as stated above. An strace(1) shows this for me:

```
open("/etc/corosync/corosync.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
```

Why the heck does it need corosync.conf(5) again? Isn't ccs(1) doing the job anymore?

Cross-filed case 00979407 on the Red Hat customer portal.

Definitely a bug; it should be looking for the current pacemaker nodes. Fixed upstream with this patch: https://github.com/feist/pcs/commit/8b888080c37ddea88b92dfd95aadd78b9db68b55

As a workaround, you can run 'crm_standby -v on -N <nodename>' to put a node in standby, or 'crm_standby -D -N <nodename>' to take it out of standby.

Confirmed. Backporting the patch to a running system does the job for us:
```diff
--- /usr/lib/python2.6/site-packages/pcs/cluster.py.orig
+++ /usr/lib/python2.6/site-packages/pcs/cluster.py
@@ -360,7 +360,7 @@
         usage.cluster(["unstandby"])
         sys.exit(1)
 
-    nodes = utils.getNodesFromCorosyncConf()
+    nodes = utils.getNodesFromPacemaker()
 
     if "--all" not in utils.pcs_options:
         nodeFound = False
```
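For anyone backporting by hand: the replacement function asks the running cluster for its node list instead of reading a config file. The sketch below only approximates that idea and is not the real pcs utils.getNodesFromPacemaker() implementation; it assumes cibadmin (from pacemaker-cli) is available and queries the `<nodes>` section of the CIB:

```python
# Approximation only -- not the real utils.getNodesFromPacemaker().
# Queries the live CIB for configured nodes via cibadmin (pacemaker-cli).
import subprocess
import xml.etree.ElementTree as ET

def get_nodes_from_pacemaker():
    """Return node names (uname attributes) from the cluster's CIB."""
    proc = subprocess.Popen(
        ["cibadmin", "--query", "--scope", "nodes"],
        stdout=subprocess.PIPE,
    )
    xml_out, _ = proc.communicate()
    # cibadmin prints e.g. <nodes><node id="1" uname="ask-02"/>...</nodes>
    root = ET.fromstring(xml_out)
    return [n.get("uname") for n in root.findall("node") if n.get("uname")]
```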
Any chance to get this in before RHEL 6.6 GA, as GSS says? AFAIK this is now going to be supported with RHEL 6.5 GA, and all the documentation out there talks about "pcs cluster (un)standby <nodename>". Breaking this is IMHO not good.
There are lots of other places where getNodesFromCorosyncConf is incorrectly referenced. As that file does not exist on RHEL clusters, all of these are incorrect:

```
[root@localhost pcs]# grep getNodesFromCorosyncConf *
cluster.py: sync_nodes(utils.getNodesFromCorosyncConf(),utils.getCorosyncConf())
cluster.py: auth_nodes(utils.getNodesFromCorosyncConf())
cluster.py: nodes = utils.getNodesFromCorosyncConf()
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: nodes = utils.getNodesFromCorosyncConf()
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: for node in utils.getNodesFromCorosyncConf():
cluster.py: for my_node in utils.getNodesFromCorosyncConf():
cluster.py: for my_node in utils.getNodesFromCorosyncConf():
cluster.py: for node in utils.getNodesFromCorosyncConf():
status.py: corosync_nodes = utils.getNodesFromCorosyncConf()
status.py: all_nodes = utils.getNodesFromCorosyncConf()
utils.py:def getNodesFromCorosyncConf():
utils.py: for c_node in getNodesFromCorosyncConf():
utils.py: for c_node in getNodesFromCorosyncConf():
utils.py: c_nodes = getNodesFromCorosyncConf()
```

I also get an error message concerning corosync.conf when running "pcs cluster status":

```
# pcs cluster status
...
PCSD Status:
Error: no nodes found in corosync.conf
```

Before Fix:

```
[root@ask-02 test]# rpm -q pcs
pcs-0.9.90-2.el6.noarch
[root@ask-02 test]# pcs cluster standby ask-02
Error: node 'ask-02' does not appear to exist in configuration
[root@ask-02 test]# pcs config | grep Node
Corosync Nodes:
Pacemaker Nodes:
Node: ask-02
Node: ask-03
```

After Fix:

```
[root@ask-02 test]# rpm -q pcs
pcs-0.9.101-1.el6.noarch
[root@ask-02 test]# pcs status | grep ask-02
Online: [ ask-02 ask-03 ]
[root@ask-02 test]# pcs cluster standby ask-02
[root@ask-02 test]# pcs status | grep ask-02
Node ask-02: standby
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1526.html
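A closing note on the grep list above: the underlying hazard is that node lookup was hardwired to one stack's config file. A stack-aware dispatcher would avoid this whole class of bug. This is a rough sketch with our own hypothetical helper names, not pcs code; get_nodes_from_cluster_conf and get_nodes_from_pacemaker refer to the sketches earlier in this report, and get_nodes_from_corosync_conf is an assumed helper that would parse corosync.conf:

```python
# Hypothetical dispatcher, not pcs code: choose the node-name source by
# which cluster stack's configuration actually exists on this host.
import os

def get_cluster_node_names():
    if os.path.exists("/etc/cluster/cluster.conf"):
        # RHEL 6 CMAN stack: corosync.conf does not exist here.
        return get_nodes_from_cluster_conf()
    if os.path.exists("/etc/corosync/corosync.conf"):
        # corosync/pacemaker stack.
        return get_nodes_from_corosync_conf()  # assumed helper
    # No static config found: fall back to asking the live CIB.
    return get_nodes_from_pacemaker()
```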