Bug 1710422
Summary: | Default the concurrent-fencing cluster property to true | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Shane Bradley <sbradley> | |
Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> | |
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | |
Severity: | medium | Docs Contact: | Steven J. Levine <slevine> | |
Priority: | high | |||
Version: | 7.6 | CC: | cluster-maint, kgaillot, marjones, phagara, pkomarov, sbradley | |
Target Milestone: | pre-dev-freeze | |||
Target Release: | 7.8 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | pacemaker-1.1.21-1.el7 | Doc Type: | Enhancement | |
Doc Text: |
.Default value of Pacemaker `concurrent-fencing` cluster property now set to `true`
Pacemaker now defaults the `concurrent-fencing` cluster property to `true`. If multiple nodes need to be fenced at the same time and they use different configured fence devices, Pacemaker will execute the fencing simultaneously rather than serialized as before. This can greatly speed up recovery in a large cluster when multiple nodes must be fenced.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1715426 | Environment: | |
Last Closed: | 2020-03-31 19:41:51 UTC | Type: | Bug | |
Bug Blocks: | 1715426 |
Comment 3
Ken Gaillot
2019-05-16 15:48:19 UTC
QA: It is not necessary to reproduce the customer's issue, just verify that fencing is now parallelized by default. If more than one node needs to be fenced at the same time (e.g. kill power to 2 nodes in a 5-node cluster), by default pacemaker would wait until one fencing completed before executing the next. Now, as long as the fence targets have different fence devices (e.g. individual IPMI), pacemaker will execute the fencing in parallel. This is controlled by the concurrent-fencing cluster property, which previously defaulted to false, and now defaults to true.

Added title and made slight edit to release note description.

> [root@f09-h29-b04-5039ms ~]# rpm -q pacemaker
> pacemaker-1.1.21-4.el7.x86_64

concurrent-fencing now defaults to true:

> [root@f09-h29-b04-5039ms ~]# pcs property --all | grep concurrent-fencing
> concurrent-fencing: true

aftermath of killing 15 nodes in a 32-node cluster at once (using `halt -f`):

> [root@f09-h29-b04-5039ms ~]# crm_mon -m -1
> Stack: corosync
> Current DC: f09-h20-b03-5039ms (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
> Last updated: Fri Jan 31 19:40:11 2020
> Last change: Fri Jan 31 19:12:58 2020 by root via cibadmin on f09-h29-b04-5039ms
>
> 32 nodes configured
> 32 resources configured
>
> Node f09-h17-b03-5039ms: pending
> Node f09-h17-b06-5039ms: pending
> Node f09-h17-b07-5039ms: pending
> Node f09-h20-b01-5039ms: pending
> Node f09-h20-b02-5039ms: pending
> Node f09-h20-b05-5039ms: pending
> Node f09-h20-b06-5039ms: pending
> Node f09-h20-b07-5039ms: pending
> Node f09-h23-b01-5039ms: pending
> Node f09-h23-b02-5039ms: pending
> Node f09-h23-b05-5039ms: pending
> Node f09-h23-b07-5039ms: pending
> Node f09-h26-b01-5039ms: pending
> Node f09-h26-b03-5039ms: pending
> Node f09-h29-b08-5039ms: pending
> Online: [ f09-h17-b02-5039ms f09-h17-b04-5039ms f09-h17-b05-5039ms f09-h17-b08-5039ms f09-h20-b03-5039ms f09-h20-b04-5039ms f09-h20-b08-5039ms f09-h23-b03-5039ms f09-h23-b04-5039ms f09-h23-b06-5039ms f09-h23-b08-5039ms f09-h26-b02-5039ms f09-h26-b04-5039ms f09-h29-b04-5039ms f09-h29-b05-5039ms f09-h29-b06-5039ms f09-h29-b07-5039ms ]
>
> Active resources:
>
> fence-f09-h29-b04-5039ms (stonith:fence_ipmilan): Started f09-h17-b02-5039ms
> fence-f09-h29-b05-5039ms (stonith:fence_ipmilan): Started f09-h17-b04-5039ms
> fence-f09-h29-b06-5039ms (stonith:fence_ipmilan): Started f09-h17-b05-5039ms
> fence-f09-h29-b07-5039ms (stonith:fence_ipmilan): Started f09-h17-b08-5039ms
> fence-f09-h29-b08-5039ms (stonith:fence_ipmilan): Started f09-h20-b03-5039ms
> fence-f09-h17-b02-5039ms (stonith:fence_ipmilan): Started f09-h20-b04-5039ms
> fence-f09-h17-b03-5039ms (stonith:fence_ipmilan): Started f09-h20-b08-5039ms
> fence-f09-h17-b04-5039ms (stonith:fence_ipmilan): Started f09-h23-b03-5039ms
> fence-f09-h17-b05-5039ms (stonith:fence_ipmilan): Started f09-h23-b04-5039ms
> fence-f09-h17-b06-5039ms (stonith:fence_ipmilan): Started f09-h23-b06-5039ms
> fence-f09-h17-b07-5039ms (stonith:fence_ipmilan): Started f09-h23-b08-5039ms
> fence-f09-h17-b08-5039ms (stonith:fence_ipmilan): Started f09-h26-b02-5039ms
> fence-f09-h20-b01-5039ms (stonith:fence_ipmilan): Started f09-h26-b04-5039ms
> fence-f09-h20-b02-5039ms (stonith:fence_ipmilan): Started f09-h29-b05-5039ms
> fence-f09-h20-b03-5039ms (stonith:fence_ipmilan): Started f09-h29-b06-5039ms
> fence-f09-h20-b04-5039ms (stonith:fence_ipmilan): Started f09-h29-b04-5039ms
> fence-f09-h20-b05-5039ms (stonith:fence_ipmilan): Started f09-h29-b07-5039ms
> fence-f09-h20-b06-5039ms (stonith:fence_ipmilan): Started f09-h17-b02-5039ms
> fence-f09-h20-b07-5039ms (stonith:fence_ipmilan): Started f09-h17-b05-5039ms
> fence-f09-h20-b08-5039ms (stonith:fence_ipmilan): Started f09-h17-b04-5039ms
> fence-f09-h23-b01-5039ms (stonith:fence_ipmilan): Started f09-h17-b08-5039ms
> fence-f09-h23-b02-5039ms (stonith:fence_ipmilan): Started f09-h20-b03-5039ms
> fence-f09-h23-b03-5039ms (stonith:fence_ipmilan): Started f09-h20-b04-5039ms
> fence-f09-h23-b04-5039ms (stonith:fence_ipmilan): Started f09-h20-b08-5039ms
> fence-f09-h23-b05-5039ms (stonith:fence_ipmilan): Started f09-h23-b03-5039ms
> fence-f09-h23-b06-5039ms (stonith:fence_ipmilan): Started f09-h23-b04-5039ms
> fence-f09-h23-b07-5039ms (stonith:fence_ipmilan): Started f09-h23-b06-5039ms
> fence-f09-h23-b08-5039ms (stonith:fence_ipmilan): Started f09-h23-b08-5039ms
> fence-f09-h26-b01-5039ms (stonith:fence_ipmilan): Started f09-h26-b02-5039ms
> fence-f09-h26-b02-5039ms (stonith:fence_ipmilan): Started f09-h26-b04-5039ms
> fence-f09-h26-b03-5039ms (stonith:fence_ipmilan): Started f09-h29-b04-5039ms
> fence-f09-h26-b04-5039ms (stonith:fence_ipmilan): Started f09-h29-b07-5039ms
>
> Fencing History:
> * reboot of f09-h17-b07-5039ms successful: delegate=f09-h23-b08-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:29 2020'
> * reboot of f09-h23-b02-5039ms successful: delegate=f09-h20-b03-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:29 2020'
> * reboot of f09-h23-b01-5039ms successful: delegate=f09-h23-b04-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:29 2020'
> * reboot of f09-h29-b08-5039ms successful: delegate=f09-h20-b03-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:29 2020'
> * reboot of f09-h20-b01-5039ms successful: delegate=f09-h26-b02-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:29 2020'
> * reboot of f09-h17-b06-5039ms successful: delegate=f09-h26-b04-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h26-b01-5039ms successful: delegate=f09-h26-b04-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h23-b07-5039ms successful: delegate=f09-h26-b04-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h20-b06-5039ms successful: delegate=f09-h17-b02-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h26-b03-5039ms successful: delegate=f09-h23-b04-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h17-b03-5039ms successful: delegate=f09-h20-b03-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h20-b07-5039ms successful: delegate=f09-h17-b05-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h20-b05-5039ms successful: delegate=f09-h20-b03-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:28 2020'
> * reboot of f09-h20-b02-5039ms successful: delegate=f09-h29-b05-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:27 2020'
> * reboot of f09-h23-b05-5039ms successful: delegate=f09-h26-b04-5039ms, client=crmd.16434, origin=f09-h20-b03-5039ms,
>     last-successful='Fri Jan 31 19:35:27 2020'

result: all nodes fenced within a 3-second interval

disabling concurrent fencing:

> [root@f09-h29-b04-5039ms ~]# pcs property set concurrent-fencing=false
> [root@f09-h29-b04-5039ms ~]# pcs property --all | grep concurrent-fencing
> concurrent-fencing: false

and again killing 15 out of 32 nodes at once:

> [root@f09-h17-b04-5039ms ~]# crm_mon -m -1
> Stack: corosync
> Current DC: f09-h17-b04-5039ms (version 1.1.21-4.el7-f14e36fd43) - partition with quorum
> Last updated: Fri Jan 31 19:55:13 2020
> Last change: Fri Jan 31 19:51:21 2020 by root via cibadmin on f09-h29-b04-5039ms
>
> 32 nodes configured
> 32 resources configured
>
> Node f09-h23-b02-5039ms: UNCLEAN (offline)
> Node f09-h23-b03-5039ms: UNCLEAN (offline)
> Node f09-h23-b04-5039ms: UNCLEAN (offline)
> Node f09-h23-b07-5039ms: UNCLEAN (offline)
> Node f09-h26-b03-5039ms: UNCLEAN (offline)
> Node f09-h29-b04-5039ms: UNCLEAN (offline)
> Node f09-h29-b05-5039ms: UNCLEAN (offline)
> Node f09-h29-b08-5039ms: UNCLEAN (offline)
> Online: [ f09-h17-b02-5039ms f09-h17-b03-5039ms f09-h17-b04-5039ms f09-h17-b05-5039ms f09-h17-b07-5039ms f09-h17-b08-5039ms f09-h20-b01-5039ms f09-h20-b07-5039ms f09-h20-b08-5039ms f09-h23-b05-5039ms f09-h23-b06-5039ms f09-h23-b08-5039ms f09-h26-b01-5039ms f09-h26-b02-5039ms f09-h26-b04-5039ms f09-h29-b06-5039ms f09-h29-b07-5039ms ]
> OFFLINE: [ f09-h17-b06-5039ms f09-h20-b02-5039ms f09-h20-b03-5039ms f09-h20-b04-5039ms f09-h20-b05-5039ms f09-h20-b06-5039ms f09-h23-b01-5039ms ]
>
> Active resources:
>
> fence-f09-h29-b04-5039ms (stonith:fence_ipmilan): Started f09-h17-b02-5039ms
> fence-f09-h29-b05-5039ms (stonith:fence_ipmilan): Started f09-h17-b04-5039ms
> fence-f09-h29-b06-5039ms (stonith:fence_ipmilan): Started f09-h17-b05-5039ms
> fence-f09-h29-b07-5039ms (stonith:fence_ipmilan): Started f09-h17-b08-5039ms
> fence-f09-h29-b08-5039ms (stonith:fence_ipmilan): Started f09-h17-b03-5039ms
> fence-f09-h17-b02-5039ms (stonith:fence_ipmilan): Started f09-h17-b07-5039ms
> fence-f09-h17-b03-5039ms (stonith:fence_ipmilan): Started f09-h20-b08-5039ms
> fence-f09-h17-b04-5039ms (stonith:fence_ipmilan): Started[ f09-h23-b03-5039ms f09-h20-b01-5039ms ]
> fence-f09-h17-b05-5039ms (stonith:fence_ipmilan): Started[ f09-h23-b04-5039ms f09-h20-b07-5039ms ]
> fence-f09-h17-b06-5039ms (stonith:fence_ipmilan): Started f09-h23-b06-5039ms
> fence-f09-h17-b07-5039ms (stonith:fence_ipmilan): Started f09-h23-b08-5039ms
> fence-f09-h17-b08-5039ms (stonith:fence_ipmilan): Started f09-h26-b02-5039ms
> fence-f09-h20-b01-5039ms (stonith:fence_ipmilan): Started f09-h26-b04-5039ms
> fence-f09-h20-b02-5039ms (stonith:fence_ipmilan): Started[ f09-h23-b05-5039ms f09-h29-b05-5039ms ]
> fence-f09-h20-b03-5039ms (stonith:fence_ipmilan): Started f09-h29-b06-5039ms
> fence-f09-h20-b04-5039ms (stonith:fence_ipmilan): Started[ f09-h26-b01-5039ms f09-h29-b04-5039ms ]
> fence-f09-h20-b05-5039ms (stonith:fence_ipmilan): Started f09-h29-b07-5039ms
> fence-f09-h20-b06-5039ms (stonith:fence_ipmilan): Started f09-h17-b03-5039ms
> fence-f09-h20-b07-5039ms (stonith:fence_ipmilan): Started f09-h17-b02-5039ms
> fence-f09-h20-b08-5039ms (stonith:fence_ipmilan): Started f09-h17-b07-5039ms
> fence-f09-h23-b01-5039ms (stonith:fence_ipmilan): Started f09-h20-b01-5039ms
> fence-f09-h23-b02-5039ms (stonith:fence_ipmilan): Started f09-h17-b04-5039ms
> fence-f09-h23-b03-5039ms (stonith:fence_ipmilan): Started f09-h17-b05-5039ms
> fence-f09-h23-b04-5039ms (stonith:fence_ipmilan): Started f09-h17-b08-5039ms
> fence-f09-h23-b05-5039ms (stonith:fence_ipmilan): Started f09-h20-b07-5039ms
> fence-f09-h23-b06-5039ms (stonith:fence_ipmilan): Started f09-h20-b08-5039ms
> fence-f09-h23-b07-5039ms (stonith:fence_ipmilan): Started[ f09-h23-b02-5039ms f09-h23-b05-5039ms ]
> fence-f09-h23-b08-5039ms (stonith:fence_ipmilan): Started f09-h23-b06-5039ms
> fence-f09-h26-b01-5039ms (stonith:fence_ipmilan): Started[ f09-h23-b08-5039ms f09-h23-b07-5039ms ]
> fence-f09-h26-b02-5039ms (stonith:fence_ipmilan): Started f09-h26-b01-5039ms
> fence-f09-h26-b03-5039ms (stonith:fence_ipmilan): Started[ f09-h26-b02-5039ms f09-h26-b03-5039ms ]
> fence-f09-h26-b04-5039ms (stonith:fence_ipmilan): Started[ f09-h29-b08-5039ms f09-h26-b04-5039ms ]
>
> Fencing History:
> * reboot of f09-h23-b02-5039ms pending: client=crmd.16632, origin=f09-h17-b04-5039ms
> * reboot of f09-h23-b01-5039ms successful: delegate=f09-h20-b01-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:54:59 2020'
> * reboot of f09-h20-b06-5039ms successful: delegate=f09-h17-b03-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:54:44 2020'
> * reboot of f09-h20-b05-5039ms successful: delegate=f09-h29-b07-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:54:29 2020'
> * reboot of f09-h20-b04-5039ms successful: delegate=f09-h26-b01-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:54:13 2020'
> * reboot of f09-h20-b03-5039ms successful: delegate=f09-h29-b06-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:53:58 2020'
> * reboot of f09-h20-b02-5039ms successful: delegate=f09-h23-b05-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:53:43 2020'
> * reboot of f09-h17-b06-5039ms successful: delegate=f09-h23-b06-5039ms, client=crmd.16632, origin=f09-h17-b04-5039ms,
>     last-successful='Fri Jan 31 19:53:27 2020'

result: reverted to old behavior -- fencing is serialized (~15s/node)

marking verified in 1.1.21-4.el7

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1032
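
As a rough cross-check of the two results reported in the verification above (all nodes fenced within a 3-second interval with concurrent-fencing=true, ~15s/node with concurrent-fencing=false), the `last-successful` timestamps from the two fencing histories can be compared with a short script. This is only a sketch and was not part of the original verification: the helper names `completion_times`, `summarize` and `as_history` are hypothetical, and only the timestamps from the crm_mon output are reproduced, not the full entries.

```python
import re
from datetime import datetime

# Matches the completion timestamp crm_mon prints for each successful fencing.
LAST_SUCCESSFUL = re.compile(r"last-successful='([^']+)'")


def completion_times(history_text):
    """Return the fencing completion times found in the text, oldest first."""
    return sorted(
        datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        for stamp in LAST_SUCCESSFUL.findall(history_text)
    )


def summarize(history_text):
    """Summarize how spread out the successful fencing operations were."""
    times = completion_times(history_text)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    return {
        "fenced": len(times),
        "span_s": (times[-1] - times[0]).total_seconds() if times else 0.0,
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else 0.0,
    }


def as_history(clock_times, date="Fri Jan 31", year="2020"):
    """Rebuild last-successful fields from HH:MM:SS values (illustration only)."""
    return "\n".join(
        f"last-successful='{date} {clock} {year}'" for clock in clock_times
    )


# Timestamps copied from the two crm_mon fencing histories in this report.
concurrent = as_history(["19:35:29"] * 5 + ["19:35:28"] * 8 + ["19:35:27"] * 2)
serialized = as_history(["19:54:59", "19:54:44", "19:54:29", "19:54:13",
                         "19:53:58", "19:53:43", "19:53:27"])

if __name__ == "__main__":
    print("concurrent-fencing=true: ", summarize(concurrent))
    print("concurrent-fencing=false:", summarize(serialized))
```

With these timestamps, the 15 completions in the concurrent run fall 2 seconds apart from first to last, while the 7 completions recorded so far in the serialized run span 92 seconds, a mean gap of about 15.3 seconds, which matches the ~15s/node noted above.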