Bug 1780137
Summary:          Adding quorum device requires restart to clear WaitForAll flag

Product:          Red Hat Enterprise Linux 8
Component:        corosync
Version:          8.4
Target Release:   8.0
Hardware:         Unspecified
OS:               Unspecified

Status:           CLOSED ERRATA
Reporter:         Josef Zimek <pzimek>
Assignee:         Jan Friesse <jfriesse>
QA Contact:       cluster-qe <cluster-qe>
CC:               ccaulfie, cluster-maint, cluster-qe, jfriesse, mnovacek, ondrej-redhat-developer, phagara
Severity:         unspecified
Priority:         unspecified
Target Milestone: rc
Flags:            pm-rhel: mirror+

Fixed In Version: corosync-3.0.3-4.el8
Doc Type:         If docs needed, set a value
Clone Of:         1780134
Bug Blocks:       1780134
Last Closed:      2020-11-04 03:25:51 UTC
Type:             Bug
Created attachment 1654293 [details]
votequorum: Ignore the icmap_get_* return value

Express intention to ignore the icmap_get_* return value and rely on the default behavior of not changing the output parameter on error.

Signed-off-by: Jan Friesse <jfriesse>
For QE: the bug reproducer is described in comment 1. I've tested with just setting two_node: 1 in corosync.conf.

corosync.conf:
...
quorum {
    provider: corosync_votequorum
    two_node: 1
...

# corosync-quorumtool
...
Flags:            2Node WaitForAll
...

Change corosync.conf so it doesn't contain two_node:
...
quorum {
    provider: corosync_votequorum
...

# corosync-cfgtool -R
# corosync-quorumtool
...
Flags:
...

Add two_node back:
...
quorum {
    provider: corosync_votequorum
    two_node: 1
...

# corosync-quorumtool
...
Flags:            2Node WaitForAll
...

Common part
-----------
Following the quorum device addition using the RHEL 8 workflow [1]. A quorum device was added to a two-node cluster [2].

Start with a two-node cluster:

> [root@virt-245 ~]# grep two_node /etc/corosync/corosync.conf
    two_node: 1
> [root@virt-245 ~]# pcs quorum status | grep Flags
Flags:            2Node Quorate WaitForAll
> [root@virt-245 ~]# pcs quorum device add model net host=virt-020
...

# two_node is gone from corosync.conf even after sync.
> [root@virt-245 ~]# pcs cluster sync corosync
virt-245: Succeeded
virt-246: Succeeded
> [root@virt-245 ~]# grep two_node /etc/corosync/corosync.conf

Before the fix (corosync-3.0.3-2.el8.x86_64)
------------------------------------------
# WaitForAll flag is still present after the quorum device was added
> [root@virt-245 ~]# pcs quorum status | grep Flags
Flags:            Quorate WaitForAll Qdevice    <<<<<<<<<<<<

<cluster stop and start>

# WaitForAll flag is gone
> [root@virt-245 ~]# pcs quorum status | grep Flags
Flags:            Quorate Qdevice

# Removing the quorum device reintroduces 2Node but not WaitForAll
> [root@virt-245 ~]# pcs quorum device remove
...
> [root@virt-245 ~]# grep two_node /etc/corosync/corosync.conf
    two_node: 1
> [root@virt-245 ~]# pcs quorum status | grep Flags
Flags:            2Node Quorate WaitForAll    <<<<<<<<<<<<

After the fix (corosync-3.0.3-4.el8.x86_64)
-----------------------------------------
# WaitForAll flag is gone after the quorum device is added
> [root@virt-245 ~]# pcs quorum status | grep Flags
Flags:            Quorate Qdevice

# Removing the quorum device reintroduces the 2Node and WaitForAll flags
> [root@virt-245 ~]# pcs quorum device remove
...
> [root@virt-245 ~]# grep two_node /etc/corosync/corosync.conf
    two_node: 1
> [root@virt-245 ~]# pcs quorum status | grep Flags
Flags:            2Node Quorate WaitForAll

-----
[1]: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/configuring_and_managing_high_availability_clusters/index
[2]:
[root@virt-245 ~]# pcs quorum status
Quorum information
------------------
Date:             Thu Sep 17 11:17:30 2020
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          1
Ring ID:          1.2c
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           1
Flags:            2Node Quorate WaitForAll

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1         NR virt-245 (local)
         2          1         NR virt-246

[root@virt-245 ~]# pcs status
Cluster name: STSRHTS19388
Cluster Summary:
  * Stack: corosync
  * Current DC: virt-246 (version 2.0.3-5.el8_2.1-4b1f869f0f) - partition with quorum
  * Last updated: Thu Sep 17 11:17:37 2020
  * Last change:  Thu Sep 17 09:51:35 2020 by root via cibadmin on virt-245
  * 2 nodes configured
  * 4 resource instances configured

Node List:
  * Online: [ virt-245 virt-246 ]

Full List of Resources:
  * fence-virt-245  (stonith:fence_xvm):     Started virt-245
  * fence-virt-246  (stonith:fence_xvm):     Started virt-246
  * dummy           (ocf::pacemaker:Dummy):  Started virt-245
  * fence-virt-020  (stonith:fence_xvm):     Started virt-246

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (corosync bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4736
Created attachment 1654290 [details]
votequorum: Reflect runtime change of 2Node to WFA

When 2Node mode is set, WFA is also set unless WFA is configured explicitly. This behavior was not reflected on runtime change, so a restarted corosync behaved differently (WFA not set). Also, when a cluster is reduced from 3 nodes to 2 nodes at runtime, WFA was not set, which may result in two quorate partitions.

The solution is to set WFA depending on 2Node whenever WFA is not explicitly configured.

Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>