Bug 1367813
| Summary: | Shutting down N-1 nodes at once causes cluster with lms qdevice to lose quorum | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Roman Bednář <rbednar> |
| Component: | corosync | Assignee: | Jan Friesse <jfriesse> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.3 | CC: | ccaulfie, cluster-maint, jfriesse, jkortus, mjuricek, rbednar |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | corosync-2.4.0-4.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-04 06:50:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 614122 | | |
| Attachments: | | | |
Reassigning to Chrissie because LMS is her field.

@Martin: Is the same problem also happening with ffsplit?

QE guys, debug logs would be helpful. Qnetd logs to syslog, but debug output has to be enabled: open /etc/sysconfig/corosync-qnetd and change the line
COROSYNC_QNETD_OPTIONS=""
to
COROSYNC_QNETD_OPTIONS="-dd"
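For the new option to take effect, the daemon has to be restarted and its output can then be followed; a minimal sketch, assuming corosync-qnetd runs as a systemd service and logs to the journal/syslog:

systemctl restart corosync-qnetd     # pick up the new COROSYNC_QNETD_OPTIONS
journalctl -u corosync-qnetd -f      # follow the debug output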
Qdevice logging depends on the corosync.conf configuration, but syslog is generally enabled, so please use the following
logging {
...
logger_subsys {
subsys: QDEVICE
debug: on
}
...
}
configuration in /etc/corosync/corosync.conf before reporting problems.
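For context, the qdevice part of the configuration discussed in this report lives in the quorum section of corosync.conf. A minimal sketch for a 3-node cluster using the lms algorithm might look roughly as follows; the qnetd-host address is a placeholder and the exact values are assumptions, not taken from the reporter's actual configuration:

quorum {
    provider: corosync_votequorum
    device {
        # with LMS the qdevice typically carries (number of nodes - 1) votes,
        # i.e. 2 votes in a 3-node cluster as seen in the status output below
        votes: 2
        model: net
        net {
            host: qnetd-host
            algorithm: lms
            tie_breaker: lowest
        }
    }
}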
Also a comment on both "reports": I'm unable to reproduce either of them (trying killall -9 corosync or a sysrq trigger). Would you mind sharing corosync.conf (together with debug logs)?

As discussed with Martin, I was able to reproduce the issue. It is really necessary to crash the node rather than just stopping corosync and/or qdevice (a rough sketch of this is included below, after the patch notes). Basically what happens:
- Node 1 dies but the disconnect cannot be sent
- Node 2 finds out node 1 is dead and starts forming a new membership, sending the membership change to qnetd
- The qnetd LMS algo still sees Node 1 as alive and Node 2 as split but not the leader -> sends NACK to Node 2
- Eventually qnetd finds out Node 1 died

The solution used in ffsplit is that qnetd_algo_lms_client_disconnect is handled and the current status is re-evaluated. This is probably not a good choice for lms, because lms keeps its vote (if it has one) until a change (to overcome the problem of an accidental disconnect from qnetd).

Created attachment 1194051 [details]
Proposed patch
Solves the situation when the tie-breaker node dies in a 2-node cluster. Because
the code contains two bugs, the other node got a NACK instead of an ACK.
- The algo timer is not a stack, so calling abort and schedule in the timer
  callback without setting reschedule is a no-op.
- It's necessary to check not only what the current node thinks about the
  membership, but also what the other nodes think. If the views diverge -> wait.
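As noted above, the failure only shows up when a node is crashed hard, so that corosync-qdevice never gets a chance to send a disconnect to qnetd. A rough way to simulate that on a test node (a sketch only, assuming root access; do not run this on a machine you care about):

echo 1 > /proc/sys/kernel/sysrq      # make sure sysrq is enabled
echo c > /proc/sysrq-trigger         # hard-crash the node immediately

# on a surviving node, watch the membership and quorum state
corosync-quorumtool -s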
Just a note, I'm still unable to reproduce the first bug reported by Roman. Roman, can you please paste the logs as Martin did?

Martin,
thanks for the logs. Next time, please make sure to set
logging {
...
logger_subsys {
subsys: QDEVICE
debug: on
}
...
}
(please note subsys: QDEVICE, not subsys: VOTEQ). Anyway, I kind of believe that the proposed patch also solves this problem. Would you mind testing a scratch build?
Sounds great, thanks for testing!

Created attachment 1195934 [details]
Patch with slightly better English comments
ACK to the patch, thanks for spotting that.
I've fixed the English in the comments somewhat but the logic seems fine to me.
Chrissie, thanks for the review. The patch is now upstream as b0c850f308d44ddcdf1a1f881c1e1142ad489385.

Created attachment 1196271 [details]
Man: Fix corosync-qdevice-net-certutil link
Created attachment 1196272 [details]
man: mention qdevice incompatibilities in votequorum.5
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2463.html
Description of problem:
See subject. Also, when the nodes are shut down one by one with a delay, this issue does not occur and the cluster retains quorum as expected, even with only one node online.

Version-Release number of selected component (if applicable):
corosync-qnetd-2.4.0-3.el7.x86_64
corosynclib-2.4.0-3.el7.x86_64
corosync-qdevice-2.4.0-3.el7.x86_64
corosync-2.4.0-3.el7.x86_64
pcs-0.9.152-6.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1) have a 3 node cluster with qdevice set up on a separate node, using the lms algorithm
2) kill 2 nodes at once; quorum is 3 at this point and it seems that we lose the vote from qdevice, causing the cluster to lose quorum
3) cluster and qdevice status:

# pcs qdevice status net
QNetd address: *:5403
TLS: Supported (client certificate required)
Connected clients: 1
Connected clusters: 1
Cluster "STSRHTS10485":
    Algorithm: LMS
    Tie-breaker: Node with lowest node ID
    Node ID 1:
        Client address: ::ffff:192.168.0.137:47712
        Configured node list: 1, 2, 3
        Membership node list: 1
        Vote: NACK (NACK)

# pcs quorum status
Quorum information
------------------
Date: Wed Aug 17 16:09:55 2016
Quorum provider: corosync_votequorum
Nodes: 1
Node ID: 1
Ring ID: 1/192
Quorate: No

Votequorum information
----------------------
Expected votes: 5
Highest expected: 5
Total votes: 1
Quorum: 3 Activity blocked
Flags: Qdevice

Membership information
----------------------
    Nodeid  Votes  Qdevice   Name
         1      1  A,NV,NMW  virt-136 (local)
         0      0            Qdevice (votes 2)

# pcs status
Cluster name: STSRHTS10485
Stack: corosync
Current DC: virt-136 (version 1.1.15-9.el7-e174ec8) - partition WITHOUT quorum
Last updated: Wed Aug 17 15:49:02 2016
Last change: Tue Aug 16 17:41:30 2016 by root via crm_node on virt-136

3 nodes and 12 resources configured

Node virt-139: UNCLEAN (offline)
Node virt-140: UNCLEAN (offline)
Online: [ virt-136 ]

Full list of resources:

fence-virt-136 (stonith:fence_xvm): Started virt-136
fence-virt-139 (stonith:fence_xvm): Started virt-139 (UNCLEAN)
fence-virt-140 (stonith:fence_xvm): Started virt-140 (UNCLEAN)
fence-virt-141 (stonith:fence_xvm): Started virt-139 (UNCLEAN)
Clone Set: dlm-clone [dlm]
    dlm (ocf::pacemaker:controld): Started virt-139 (UNCLEAN)
    dlm (ocf::pacemaker:controld): Started virt-140 (UNCLEAN)
    Started: [ virt-136 ]
Clone Set: clvmd-clone [clvmd]
    clvmd (ocf::heartbeat:clvm): Started virt-139 (UNCLEAN)
    clvmd (ocf::heartbeat:clvm): Started virt-140 (UNCLEAN)
    Started: [ virt-136 ]
IP (ocf::heartbeat:IPaddr2): Started virt-136
Webserver (ocf::heartbeat:apache): Started virt-136

Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled

# pcs quorum device status
Qdevice information
-------------------
Model: Net
Node ID: 1
Configured node list:
    0 Node ID = 1
    1 Node ID = 2
    2 Node ID = 3
Membership node list: 1

Qdevice-net information
----------------------
Cluster name: STSRHTS10485
QNetd host: 192.168.0.136:5403
Algorithm: LMS
Tie-breaker: Node with lowest node ID
State: Connected

======================================================

Actual results:
quorum lost

Expected results:
Quorum should be retained, since we still have a qdevice connection from/to the remaining node. This is basically the purpose of qdevice in such a setup, and it's an advantage over a 'standard' LMS cluster.
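For reference, step 1 of the reproducer (qdevice with the LMS algorithm on a separate node) can be set up roughly as follows with pcs; this is a sketch only, with a placeholder qnetd-host name, assuming pcs 0.9.152 or later and an already running 3-node cluster:

# on the qdevice host (not a cluster member): set up and start qnetd
pcs qdevice setup model net --enable --start

# on one of the cluster nodes: attach the cluster to qnetd using the LMS algorithm
pcs quorum device add model net host=qnetd-host algorithm=lms

# verify the qdevice state from the cluster side
pcs quorum device status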