Bug 1702727
Summary: | sbd doesn't detect non-responsive corosync-daemon | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Klaus Wenninger <kwenning> |
Component: | sbd | Assignee: | Klaus Wenninger <kwenning> |
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 8.0 | CC: | cfeist, cluster-maint, jfriesse, kgaillot, mlisik, mnovacek |
Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
Target Release: | 8.1 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | sbd-1.4.0-10.el8 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-11-05 20:46:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Klaus Wenninger
2019-04-24 15:04:40 UTC
I have verified that sbd fencing works correctly when corosync dies in sbd-1.4.0-15.el8.x86_64.

---

Before the fix: sbd-1.4.0-8.el8.x86_64

> virt-148$ killall -STOP corosync
> virt-148$ sleep 60 && crm_mon 1
Cluster name: STSRHTS27505
Stack: corosync
Current DC: virt-151.ipv6 (version 2.0.2-3.el8-744a30d655) - partition with quorum
Last updated: Fri Sep 6 16:13:17 2019
Last change: Fri Sep 6 15:50:11 2019 by root via cibadmin on virt-148

3 nodes configured
6 resources configured

Online: [ virt-148 virt-150 virt-151.ipv6 ]

Full list of resources:

 Clone Set: locking-clone [locking]
     Started: [ virt-148 virt-150 virt-151.ipv6 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
  sbd: active/enabled

> virt-150$ crm_mon -1
Stack: corosync
Current DC: virt-151.ipv6 (version 2.0.2-3.el8-744a30d655) - partition with quorum
Last updated: Fri Sep 6 16:13:49 2019
Last change: Fri Sep 6 15:50:11 2019 by root via cibadmin on virt-148

3 nodes configured
6 resources configured

Node virt-148: UNCLEAN (offline)
Online: [ virt-150 virt-151.ipv6 ]

Active resources:

 Clone Set: locking-clone [locking]
     Resource Group: locking:1
         dlm (ocf::pacemaker:controld): Started virt-148 (UNCLEAN)
         lvmlockd (ocf::heartbeat:lvmlockd): Started virt-148 (UNCLEAN)
     Started: [ virt-150 virt-151.ipv6 ]

Failed Fencing Actions:
* reboot of virt-148 failed: delegate=, client=stonith-api.8855, origin=virt-150,
    last-failed='Fri Sep 6 16:13:49 2019'

---

Fixed version: sbd-1.4.0-15.el8.x86_64

> virt-148$ date
Fri Sep 6 15:47:49 CEST 2019
> virt-148$ killall -STOP corosync
> virt-150$ tail -f /var/log/cluster/corosync.log
...
Sep 06 15:48:24 [25933] virt-150 corosync info [KNET ] link: host: 1 link: 0 is down
Sep 06 15:48:24 [25933] virt-150 corosync info [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Sep 06 15:48:24 [25933] virt-150 corosync warning [KNET ] host: host: 1 has no active links
Sep 06 15:48:24 [25933] virt-150 corosync notice [TOTEM ] Token has not been received in 1237 ms
Sep 06 15:48:25 [25933] virt-150 corosync notice [TOTEM ] A processor failed, forming new configuration.
Sep 06 15:48:27 [25933] virt-150 corosync notice [TOTEM ] A new membership (2:28) was formed. Members left: 1
Sep 06 15:48:27 [25933] virt-150 corosync notice [TOTEM ] Failed to receive the leave message. failed: 1
Sep 06 15:48:27 [25933] virt-150 corosync warning [CPG ] downlist left_list: 1 received
Sep 06 15:48:27 [25933] virt-150 corosync warning [CPG ] downlist left_list: 1 received
Sep 06 15:48:27 [25933] virt-150 corosync notice [QUORUM] Members[2]: 2 3
Sep 06 15:48:27 [25933] virt-150 corosync notice [MAIN ] Completed service synchronization, ready to provide service.
Sep 06 15:50:08 [25933] virt-150 corosync info [KNET ] rx: host: 1 link: 0 is up
Sep 06 15:50:08 [25933] virt-150 corosync info [KNET ] host: host: 1 (passive) best link: 0 (pri: 1)
Sep 06 15:50:08 [25933] virt-150 corosync notice [TOTEM ] A new membership (1:32) was formed. Members joined: 1
Sep 06 15:50:08 [25933] virt-150 corosync warning [CPG ] downlist left_list: 0 received
Sep 06 15:50:08 [25933] virt-150 corosync warning [CPG ] downlist left_list: 0 received
Sep 06 15:50:08 [25933] virt-150 corosync warning [CPG ] downlist left_list: 0 received
Sep 06 15:50:08 [25933] virt-150 corosync notice [QUORUM] Members[3]: 1 2 3
Sep 06 15:50:08 [25933] virt-150 corosync notice [MAIN ] Completed service synchronization, ready to provide service.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3344
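
For anyone repeating the verification, the following is a rough sketch of how the corosync log excerpt above could be checked mechanically: it extracts the TOTEM membership changes and reports how long a node stayed out of the membership. The helper names (`membership_events`, `outage_seconds`) and the `year` parameter are hypothetical and not part of sbd, corosync or pacemaker; the year has to be supplied because the log timestamps do not carry one. The regular expressions are only matched against the line format shown in this report and may need adjusting for other corosync versions or logging configurations.

```python
import re
from datetime import datetime

# A TOTEM membership line as it appears in the excerpt, e.g.
# "Sep 06 15:48:27 [25933] virt-150 corosync notice [TOTEM ] A new membership (2:28) was formed. Members left: 1"
MEMBERSHIP_RE = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*\[TOTEM\s*\] A new membership .*? was formed\.(.*)$"
)
# "joined: 1" / "left: 1 2" fragments that follow "was formed."
CHANGE_RE = re.compile(r"(joined|left): ((?:\d+ ?)+)")


def membership_events(lines, year=2019):
    """Return (timestamp, 'joined'|'left', [node ids]) tuples in log order.

    The year is an assumption supplied by the caller; the log lines omit it.
    """
    events = []
    for line in lines:
        match = MEMBERSHIP_RE.match(line.strip())
        if not match:
            continue
        stamp, tail = match.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        for kind, nodes in CHANGE_RE.findall(tail):
            events.append((when, kind, [int(n) for n in nodes.split()]))
    return events


def outage_seconds(events, node_id):
    """Seconds between a node leaving the membership and rejoining it, or None."""
    left_at = None
    for when, kind, nodes in events:
        if node_id not in nodes:
            continue
        if kind == "left":
            left_at = when
        elif kind == "joined" and left_at is not None:
            return (when - left_at).total_seconds()
    return None


# Three lines copied from the excerpt in this report.
SAMPLE = [
    "Sep 06 15:48:27 [25933] virt-150 corosync notice [TOTEM ] A new membership (2:28) was formed. Members left: 1",
    "Sep 06 15:48:27 [25933] virt-150 corosync notice [QUORUM] Members[2]: 2 3",
    "Sep 06 15:50:08 [25933] virt-150 corosync notice [TOTEM ] A new membership (1:32) was formed. Members joined: 1",
]

print(outage_seconds(membership_events(SAMPLE), node_id=1))
```

Note that the interval only covers the time from the membership change to the rejoin as seen by virt-150. It does not by itself show when the watchdog fired on virt-148.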