Bug 1679792
| Field | Value |
|---|---|
| Summary | Inconsistent quorum when wait_for_all is set |
| Product | Red Hat Enterprise Linux 7 |
| Component | corosync |
| Version | 7.6 |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Miroslav Lisik <mlisik> |
| Assignee | Jan Friesse <jfriesse> |
| QA Contact | cluster-qe <cluster-qe> |
| CC | ccaulfie, cluster-maint, phagara |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | corosync-2.4.5-5.el7 |
| Doc Type | If docs needed, set a value |
| Clones | 1816653 (view as bug list) |
| Bug Blocks | 1816653 |
| Type | Bug |
| Last Closed | 2020-09-29 19:55:11 UTC |
Forgot to add a comment. The issue is really easily reproducible (that's good :) ).

It's not yet clear how to fix this. We've discussed it with chrissie/fabio, and the conclusion is that there is no reason why the quorum and votequorum output should differ. Right now two possible solutions are known:
- the current cluster becomes non-quorate until the new node appears
- "standard" calculations are used

The first solution seems more natural (I like Chrissie's comment: it's "wait_for_all", not "wait_for_some"), but it may have some problems, most notably with last man standing. It must also be properly tested which configuration changes make corosync wait for all nodes again.

Moving to 7.8. The solution should be quite easy, but we have to find all the corner cases. The bug has been there since 7.0, so it shouldn't be a big deal. Also, because this affects corosync 3.0 (RHEL 8) as well, we must clone this BZ to 8.1/8.2 once the fix is ready.

qa_ack+, reproducer in description

The problem described in the description is solved by the patch in bug 1780134, but the same issue appears in other situations as well, as described in the upstream comment https://github.com/corosync/corosync/pull/542#issuecomment-597207397.

For QA: https://github.com/corosync/corosync/pull/542#issuecomment-597207397 contains the tested scenarios.

Created attachment 1673072 [details]
votequorum: set wfa status only on startup
votequorum: set wfa status only on startup
Previously, reloading the configuration with wait_for_all enabled
resulted in wait_for_all_status being set, which set cluster_is_quorate
to 0 but didn't inform the quorum service, so the votequorum and quorum
information could get out of sync.

An example is a 1-node cluster that is extended to 3 nodes. The quorum
service reports the cluster as quorate (incorrect) and votequorum as
not quorate (correct). Similar behavior happens when extending a
cluster in general, but some configurations are less incorrect (3->4).

The solution discussed was to inform the quorum service, but that would
mean every reload would cause a loss of quorum until all nodes were
seen again. Such behaviour is consistent but seems a bit too strict.

The proposed solution sets wait_for_all_status only on startup and
doesn't touch it during reload.

This solution fulfills the requirement that "the cluster will be
quorate for the first time only after all nodes have been visible at
least once at the same time", because a node clears wait_for_all_status
only after it sees all other nodes or joins a cluster which is quorate.
It also solves the problem of extending the cluster, because when the
cluster becomes unquorate (1->3) wait_for_all_status is set.

The added assert is only to ensure that I haven't missed any case where
a quorate cluster may become unquorate.
Signed-off-by: Jan Friesse <jfriesse>
Reviewed-by: Christine Caulfield <ccaulfie>
(cherry picked from commit ca320beac25f82c0c555799e647a47975a333c28)
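As a quick aside (not part of the original report), the two views that this patch keeps in sync can be compared directly with standard corosync tools; a minimal sketch, assuming corosync 2.x command-line tools are available on the node:

```
# Sketch only: trigger a configuration reload (the same path a node addition
# takes) and compare the two sections printed by corosync-quorumtool.
# "Quorate:" in the top section comes from the quorum service, while
# "Quorum:" and "Flags:" in the votequorum section come from votequorum;
# before this fix the two could disagree after a reload.
corosync-cfgtool -R
corosync-quorumtool -s | grep -E "Quorate:|Quorum:|Flags:"
```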
before (rhel-7.8, corosync-2.4.5-4.el7)
=======================================
[root@virt-173 ~]# rpm -q corosync
corosync-2.4.5-4.el7.x86_64
[root@virt-173 ~]# pcs status
Cluster name: STSRHTS28883
Stack: corosync
Current DC: virt-173 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
Last updated: Tue May 26 13:50:13 2020
Last change: Tue May 26 13:48:25 2020 by root via cibadmin on virt-173
2 nodes configured
7 resources configured
Online: [ virt-173 virt-175 ]
Full list of resources:
fence-virt-173 (stonith:fence_xvm): Started virt-173
fence-virt-175 (stonith:fence_xvm): Started virt-175
fence-virt-178 (stonith:fence_xvm): Started virt-173
dummy-1 (ocf::pacemaker:Dummy): Started virt-175
dummy-2 (ocf::pacemaker:Dummy): Started virt-173
dummy-3 (ocf::pacemaker:Dummy): Started virt-175
dummy-4 (ocf::pacemaker:Dummy): Started virt-173
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@virt-173 ~]# pcs quorum status
Quorum information
------------------
Date: Tue May 26 13:50:17 2020
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/26
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 1
Flags: 2Node Quorate WaitForAll
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-173 (local)
2 1 NR virt-175
[root@virt-173 ~]# pcs cluster node add virt-178
Disabling SBD service...
virt-178: sbd disabled
Sending remote node configuration files to 'virt-178'
virt-178: successful distribution of the file 'pacemaker_remote authkey'
virt-173: Corosync updated
virt-175: Corosync updated
Setting up corosync...
virt-178: Succeeded
Synchronizing pcsd certificates on nodes virt-178...
virt-178: Success
Restarting pcsd on the nodes in order to reload the certificates...
virt-178: Success
[root@virt-173 ~]# pcs quorum status
Quorum information
------------------
Date: Tue May 26 13:52:25 2020
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/26
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 2
Quorum: 2 Activity blocked
Flags: WaitForAll
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-173 (local)
2 1 NR virt-175
[root@virt-173 ~]# pcs resource
dummy-1 (ocf::pacemaker:Dummy): Started virt-175
dummy-2 (ocf::pacemaker:Dummy): Started virt-173
dummy-3 (ocf::pacemaker:Dummy): Started virt-175
dummy-4 (ocf::pacemaker:Dummy): Started virt-173
Result: quorum and votequorum are out of sync (the quorum service reports Quorate: Yes, while votequorum reports activity blocked, i.e. not quorate).
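The same mismatch can also be seen from pacemaker's side (illustrative commands only, reusing the grep pattern from the description; pacemaker takes its quorum state from the quorum service, which is why the resources above kept running):

```
# Pacemaker's view, driven by the quorum service ...
pcs status | grep "partition with"
# ... versus the combined quorum/votequorum view, where only the votequorum
# section shows "Activity blocked" before the fix.
pcs quorum status | grep -E "Quorate:|Quorum:|Flags:"
```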
after (rhel-7.9, corosync-2.4.5-5.el7)
======================================
[root@virt-053 ~]# rpm -q corosync
corosync-2.4.5-5.el7.x86_64
[root@virt-053 ~]# pcs status
Cluster name: STSRHTS12710
Stack: corosync
Current DC: virt-060 (version 1.1.22-1.el7-63d2d79005) - partition with quorum
Last updated: Tue May 26 13:58:35 2020
Last change: Tue May 26 13:58:20 2020 by root via cibadmin on virt-053
2 nodes configured
7 resource instances configured
Online: [ virt-053 virt-060 ]
Full list of resources:
fence-virt-053 (stonith:fence_xvm): Started virt-053
fence-virt-060 (stonith:fence_xvm): Started virt-060
fence-virt-070 (stonith:fence_xvm): Started virt-053
dummy-1 (ocf::pacemaker:Dummy): Started virt-060
dummy-2 (ocf::pacemaker:Dummy): Started virt-053
dummy-3 (ocf::pacemaker:Dummy): Started virt-060
dummy-4 (ocf::pacemaker:Dummy): Started virt-053
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@virt-053 ~]# pcs quorum status
Quorum information
------------------
Date: Tue May 26 13:58:43 2020
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/35
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 1
Flags: 2Node Quorate WaitForAll
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-053 (local)
2 1 NR virt-060
[root@virt-053 ~]# pcs cluster node add virt-070
Disabling SBD service...
virt-070: sbd disabled
Sending remote node configuration files to 'virt-070'
virt-070: successful distribution of the file 'pacemaker_remote authkey'
virt-053: Corosync updated
virt-060: Corosync updated
Setting up corosync...
virt-070: Succeeded
Synchronizing pcsd certificates on nodes virt-070...
virt-070: Success
Restarting pcsd on the nodes in order to reload the certificates...
virt-070: Success
[root@virt-053 ~]# pcs quorum status
Quorum information
------------------
Date: Tue May 26 13:59:40 2020
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/35
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 2
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-053 (local)
2 1 NR virt-060
[root@virt-053 ~]# pcs resource
dummy-1 (ocf::pacemaker:Dummy): Started virt-060
dummy-2 (ocf::pacemaker:Dummy): Started virt-053
dummy-3 (ocf::pacemaker:Dummy): Started virt-060
dummy-4 (ocf::pacemaker:Dummy): Started virt-053
Result: both quorum and votequorum report the same state (quorate).
marking verified in corosync-2.4.5-5.el7
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (corosync bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3924
Description of problem:
Quorum is inconsistent after adding a third node to the cluster. The reported information does not match reality.

Version-Release number of selected component (if applicable):
# rpm -q corosync pacemaker
corosync-2.4.3-4.el7.x86_64
pacemaker-1.1.19-8.el7_6.4.x86_64

How reproducible: always

Steps to Reproduce:

1. Set up a 2-node cluster with a few dummy resources.

[root@virt-025 ~]# pcs cluster auth -u hacluster -p password virt-025 virt-026
virt-025: Authorized
virt-026: Authorized
[root@virt-025 ~]# pcs cluster setup --name HAcluster virt-025 virt-026 --start
Destroying cluster on nodes: virt-025, virt-026...
virt-026: Stopping Cluster (pacemaker)...
virt-025: Stopping Cluster (pacemaker)...
virt-026: Successfully destroyed cluster
virt-025: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'virt-025', 'virt-026'
virt-025: successful distribution of the file 'pacemaker_remote authkey'
virt-026: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
virt-025: Succeeded
virt-026: Succeeded
Starting cluster on nodes: virt-025, virt-026...
virt-026: Starting Cluster (corosync)...
virt-025: Starting Cluster (corosync)...
virt-026: Starting Cluster (pacemaker)...
virt-025: Starting Cluster (pacemaker)...
Synchronizing pcsd certificates on nodes virt-025, virt-026...
virt-025: Success
virt-026: Success
Restarting pcsd on the nodes in order to reload the certificates...
virt-025: Success
virt-026: Success
[root@virt-025 ~]# pcs stonith create fence-virt-025 fence_xvm pcmk_host_check="static-list" pcmk_host_list="virt-025" pcmk_host_map="virt-025:virt-025.cluster-qe.lab.eng.brq.redhat.com"
[root@virt-025 ~]# pcs stonith create fence-virt-026 fence_xvm pcmk_host_check="static-list" pcmk_host_list="virt-026" pcmk_host_map="virt-026:virt-026.cluster-qe.lab.eng.brq.redhat.com"
[root@virt-025 ~]# for i in $(seq 1 4); do pcs resource create "d-$i" ocf:pacemaker:Dummy; done
[root@virt-025 ~]# pcs resource
d-1 (ocf::pacemaker:Dummy): Started virt-025
d-2 (ocf::pacemaker:Dummy): Started virt-026
d-3 (ocf::pacemaker:Dummy): Started virt-025
d-4 (ocf::pacemaker:Dummy): Started virt-026
[root@virt-025 ~]# pcs quorum status
Quorum information
------------------
Date: Thu Feb 21 21:34:53 2019
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/1240
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 1
Flags: 2Node Quorate WaitForAll
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-025 (local)
2 1 NR virt-026

2. Add a third node to the cluster.

[root@virt-025 ~]# pcs cluster auth -u hacluster -p password virt-032
virt-032: Authorized
[root@virt-025 ~]# pcs cluster node add virt-032
Disabling SBD service...
virt-032: sbd disabled
Sending remote node configuration files to 'virt-032'
virt-032: successful distribution of the file 'pacemaker_remote authkey'
virt-025: Corosync updated
virt-026: Corosync updated
Setting up corosync...
virt-032: Succeeded
Synchronizing pcsd certificates on nodes virt-032...
virt-032: Success
Restarting pcsd on the nodes in order to reload the certificates...
virt-032: Success

3. Check the quorum information and the state of resources:

[root@virt-025 ~]# pcs quorum status
Quorum information
------------------
Date: Thu Feb 21 21:38:24 2019
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/1240
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 2
Quorum: 2 Activity blocked
Flags: WaitForAll
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-025 (local)
2 1 NR virt-026
[root@virt-025 ~]# pcs resource
d-1 (ocf::pacemaker:Dummy): Started virt-025
d-2 (ocf::pacemaker:Dummy): Started virt-026
d-3 (ocf::pacemaker:Dummy): Started virt-025
d-4 (ocf::pacemaker:Dummy): Started virt-026
[root@virt-025 ~]# pcs quorum status | grep -E "Quorate:|Quorum:|Flags:"
Quorate: Yes
Quorum: 2 Activity blocked
Flags: WaitForAll

Actual results:
Quorum information is not consistent with the state of resources.

Expected results:
Quorum information should be consistent with the state of resources.

Additional info:
Quorum is consistent when a node is added to a 2-node cluster with wait_for_all=0 (turned off).

[root@virt-025 ~]# pcs cluster stop --all
...
[root@virt-025 ~]# pcs quorum update wait_for_all=0
...
[root@virt-025 ~]# pcs cluster start --all
...
[root@virt-025 ~]# pcs quorum
Options:
wait_for_all: 0
[root@virt-025 ~]# pcs quorum status
Quorum information
------------------
Date: Thu Feb 21 21:44:30 2019
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/1248
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 1
Flags: 2Node Quorate
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-025 (local)
2 1 NR virt-026
[root@virt-025 ~]# pcs cluster node add virt-032
...
[root@virt-025 ~]# pcs quorum status
Quorum information
------------------
Date: Thu Feb 21 21:45:46 2019
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 1
Ring ID: 1/1248
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 2
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Qdevice Name
1 1 NR virt-025 (local)
2 1 NR virt-026
[root@virt-025 ~]# pcs resource
d-1 (ocf::pacemaker:Dummy): Started virt-025
d-2 (ocf::pacemaker:Dummy): Started virt-026
d-3 (ocf::pacemaker:Dummy): Started virt-025
d-4 (ocf::pacemaker:Dummy): Started virt-026
[root@virt-025 ~]# pcs quorum status | grep -E "Quorate:|Quorum:|Flags:"
Quorate: Yes
Quorum: 2
Flags: Quorate
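For reference, the wait_for_all behaviour toggled above lives in the quorum section of /etc/corosync/corosync.conf, which pcs quorum update rewrites (with the cluster stopped, as shown). The snippet below is a sketch of a typical 2-node quorum section, not a copy of the test cluster's file; note that two_node: 1 enables wait_for_all implicitly unless it is explicitly set to 0, which matches the "2Node ... WaitForAll" flags seen earlier.

```
# Assumed quorum section for the 2-node setup (illustrative, not quoted
# from this report); pcs quorum update wait_for_all=0 adds the override.
quorum {
    provider: corosync_votequorum
    two_node: 1
    # wait_for_all is implied by two_node; set explicitly to disable:
    # wait_for_all: 0
}
```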