Bug 1830552
Summary: pcs status on remotes is not working on rhel8.2 any longer
Product: Red Hat Enterprise Linux 8
Reporter: Michele Baldessari <michele>
Component: pcs
Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: medium
Priority: high
Version: 8.2
CC: cfeist, cluster-maint, idevat, mlisik, mmazoure, mpospisi, nhostako, omular, tojeline
Target Milestone: rc
Keywords: Regression, ZStream
Target Release: 8.3
Hardware: Unspecified
OS: Unspecified
Fixed In Version: pcs-0.10.6-1.el8
Doc Type: Bug Fix
Doc Text:
Cause:
User runs 'pcs status' on a remote node.
Consequence:
Pcs exits with an error complaining that corosync.conf is missing. This is wrong, as corosync.conf is expected to be missing on remote nodes.
Fix:
If corosync.conf is missing, read the cluster name from the CIB instead of corosync.conf. Gracefully skip obtaining and displaying information which depends on the presence of corosync.conf.
Result:
The 'pcs status' command works on remote nodes.
Cloned to: 1832914 (view as bug list)
Last Closed: 2020-11-04 02:28:16 UTC
Type: Bug
Bug Blocks: 1832914
Description
Michele Baldessari  2020-05-02 17:33:30 UTC
This regression was introduced when moving the status command to the new pcs architecture. corosync.conf is needed there for two reasons:

1) to get the cluster name. Here, pcs should check whether corosync.conf exists; if it is missing, it should get the cluster name from the CIB instead.

2) to list the nodes from corosync.conf in order to check whether pcsd on them is reachable. This was not working before either; since there is no list of nodes when corosync.conf is missing, this step should simply be skipped.

# pcs status --full
Cluster name: rhel82
Cluster Summary:
  * Stack: corosync
  * Current DC: rh82-node2 (2) (version 2.0.3-5.el8-4b1f869f0f) - partition with quorum
  * Last updated: Mon May 4 09:51:45 2020
  * Last change: Mon May 4 09:36:51 2020 by root via cibadmin on rh82-node2
  * 3 nodes configured
  * 5 resource instances configured

Node List:
  * Online: [ rh82-node2 (2) rh82-node3 (3) ]
  * RemoteOnline: [ rh82-node1 ]

Full List of Resources:
  * xvm (stonith:fence_xvm): Started rh82-node3
  * d1 (ocf::pacemaker:Dummy): Started rh82-node1
  * d2 (ocf::pacemaker:Dummy): Started rh82-node3
  * d3 (ocf::pacemaker:Dummy): Started rh82-node2
  * rh82-node1 (ocf::pacemaker:remote): Started rh82-node2

Migration Summary:

PCSD Status:
Error: Unable to read /etc/corosync/corosync.conf: No such file or directory

Created attachment 1685156 [details]
proposed fix + tests

Test:
* add a remote node to a cluster: pcs cluster node add-remote ...
* run 'pcs status' on the remote node
* details in comment 0 and comment 1

Test:

[root@r8-node-01 rpms]# rpm -q pcs
pcs-0.10.6-1.el8.x86_64

[root@r8-node-02 ~]# rpm -q pcs
pcs-0.10.6-1.el8.x86_64

[root@r8-node-02 ~]# pcs status nodes
Pacemaker Nodes:
 Online: r8-node-01
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online: r8-node-02
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline:

[root@r8-node-02 ~]# pcs status --full
Cluster name: HAcluster
Cluster Summary:
  * Stack: corosync
  * Current DC: r8-node-01 (1) (version 2.0.3-5.el8-4b1f869f0f) - partition with quorum
  * Last updated: Thu Jun 11 16:35:22 2020
  * Last change: Thu Jun 11 16:34:33 2020 by root via cibadmin on r8-node-01
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ r8-node-01 (1) ]
  * RemoteOnline: [ r8-node-02 ]

Full List of Resources:
  * fence-r8-node-01 (stonith:fence_xvm): Started r8-node-01
  * fence-r8-node-02 (stonith:fence_xvm): Started r8-node-01
  * r8-node-02 (ocf::pacemaker:remote): Started r8-node-01

Migration Summary:

Tickets:

Daemon Status:
  corosync: inactive/disabled
  pacemaker: inactive/disabled
  pacemaker_remote: active/enabled
  pcsd: active/disabled

BEFORE_FIX
==========

[root@virt-044 ~]# rpm -q pcs
pcs-0.10.4-6.el8.x86_64

[root@virt-044 ~]# pcs cluster node add-remote virt-043
No addresses specified for host 'virt-043', using 'virt-043'
Sending 'pacemaker authkey' to 'virt-043'
virt-043: successful distribution of the file 'pacemaker authkey'
Requesting 'pacemaker_remote enable', 'pacemaker_remote start' on 'virt-043'
virt-043: successful run of 'pacemaker_remote enable'
virt-043: successful run of 'pacemaker_remote start'

[root@virt-044 sts-rhel8.3]# pcs status --full
Cluster name: STSRHTS10850
Cluster Summary:
  * Stack: corosync
  * Current DC: virt-044 (1) (version 2.0.3-5.el8_2.1-4b1f869f0f) - partition with quorum
  * Last updated: Fri Jul 24 09:35:34 2020
  * Last change: Fri Jul 24 09:35:27 2020 by root via cibadmin on virt-044
  * 3 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ virt-044 (1) virt-048 (2) ]
  * RemoteOnline: [ virt-043 ]

Full List of Resources:
  * fence-virt-044 (stonith:fence_xvm): Started virt-048
  * fence-virt-048 (stonith:fence_xvm): Started virt-048
  * virt-043 (ocf::pacemaker:remote): Started virt-044

Migration Summary:

Tickets:

PCSD Status:
  virt-044: Online
  virt-048: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

# Check the remote node
[root@virt-043 ~]# pcs cluster corosync
Error: Unable to read /etc/corosync/corosync.conf: No such file or directory
[root@virt-043 ~]# pcs status
Error: Unable to read /etc/corosync/corosync.conf: No such file or directory

> Status could not be displayed on the remote node because corosync.conf was unavailable

AFTER_FIX
=========

[root@virt-158 ~]# rpm -q pcs
pcs-0.10.6-3.el8.x86_64

[root@virt-158 ~]# pcs cluster node add-remote virt-160
No addresses specified for host 'virt-160', using 'virt-160'
Sending 'pacemaker authkey' to 'virt-160'
virt-160: successful distribution of the file 'pacemaker authkey'
Requesting 'pacemaker_remote enable', 'pacemaker_remote start' on 'virt-160'
virt-160: successful run of 'pacemaker_remote enable'
virt-160: successful run of 'pacemaker_remote start'

[root@virt-158 ~]# pcs status --full
Cluster name: STSRHTS32139
Cluster Summary:
  * Stack: corosync
  * Current DC: virt-159 (2) (version 2.0.4-3.el8-2deceaa3ae) - partition with quorum
  * Last updated: Fri Jul 24 08:47:27 2020
  * Last change: Fri Jul 24 08:46:30 2020 by root via cibadmin on virt-158
  * 3 nodes configured
  * 4 resource instances configured

Node List:
  * Online: [ virt-158 (1) virt-159 (2) ]
  * RemoteOnline: [ virt-160 ]

Full List of Resources:
  * fence-virt-158 (stonith:fence_xvm): Started virt-159
  * fence-virt-159 (stonith:fence_xvm): Started virt-158
  * fence-virt-160 (stonith:fence_xvm): Started virt-159
  * virt-160 (ocf::pacemaker:remote): Started virt-158

Migration Summary:

Tickets:

PCSD Status:
  virt-158: Online
  virt-159: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

# Check the remote node
[root@virt-160 ~]# pcs cluster corosync
Error: Unable to read /etc/corosync/corosync.conf: No such file or directory
[root@virt-160 ~]# pcs status --full
Cluster name: STSRHTS32139
Cluster Summary:
  * Stack: corosync
  * Current DC: virt-159 (2) (version 2.0.4-3.el8-2deceaa3ae) - partition with quorum
  * Last updated: Fri Jul 24 08:49:03 2020
  * Last change: Fri Jul 24 08:46:30 2020 by root via cibadmin on virt-158
  * 3 nodes configured
  * 4 resource instances configured

Node List:
  * Online: [ virt-158 (1) virt-159 (2) ]
  * RemoteOnline: [ virt-160 ]

Full List of Resources:
  * fence-virt-158 (stonith:fence_xvm): Started virt-159
  * fence-virt-159 (stonith:fence_xvm): Started virt-158
  * fence-virt-160 (stonith:fence_xvm): Started virt-159
  * virt-160 (ocf::pacemaker:remote): Started virt-158

Migration Summary:

Tickets:

Daemon Status:
  corosync: inactive/disabled
  pacemaker: inactive/disabled
  pacemaker_remote: active/enabled
  pcsd: active/enabled

> Instead of corosync.conf, the cluster name is taken from the CIB; PCSD status is skipped
> Status is available even though corosync.conf is not present on the remote node

Marking verified in pcs-0.10.6-3.el8.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4617
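For illustration, the fallback described in the Doc Text (read the cluster name from the CIB when corosync.conf is absent) can be sketched as a small shell function. This is a hypothetical sketch, not pcs's actual (Python) implementation; the function name get_cluster_name is made up, and crm_attribute, shipped with pacemaker, is used here to read the cluster-name property from the CIB.

```shell
#!/bin/sh
# Hypothetical sketch of the fallback: prefer corosync.conf for the
# cluster name, fall back to the cluster-name property stored in the CIB.
get_cluster_name() {
    # Path can be overridden for testing; the default matches the
    # error messages quoted in this bug.
    conf="${1:-/etc/corosync/corosync.conf}"
    if [ -f "$conf" ]; then
        # cluster_name lives in the totem section of corosync.conf
        awk '/cluster_name:/ { print $2; exit }' "$conf"
    else
        # No corosync.conf (e.g. on a remote node): ask pacemaker for
        # the cluster-name property kept in the CIB instead.
        crm_attribute --query --name cluster-name --quiet
    fi
}
```

On a remote node, where /etc/corosync/corosync.conf does not exist, the function falls through to the CIB query, which matches the behavior verified in the AFTER_FIX transcript above.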