Bug 1625671
| Field | Value |
|---|---|
| Summary: | Improvement of communication between lrmd and stonith-ng during cluster startup |
| Product: | Red Hat Enterprise Linux 7 |
| Component: | pacemaker |
| Version: | 7.5 |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | high |
| Target Milestone: | rc |
| Target Release: | 7.8 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | pacemaker-1.1.21-1.el7 |
| Doc Type: | No Doc Update |
| Doc Text: | This affects a small subset of users and was not customer-reported. |
| Reporter: | Ondrej Benes <obenes> |
| Assignee: | Jan Pokorný [poki] <jpokorny> |
| QA Contact: | cluster-qe <cluster-qe> |
| CC: | cfeist, cluster-maint, cww, faaland1, kgaillot, phagara |
| Cloned To: | 1710988 (view as bug list) |
| Last Closed: | 2020-03-31 19:41:51 UTC |
| Type: | Bug |
| Bug Depends On: | 1710988 |
| Bug Blocks: | 1729984 |
Description
Ondrej Benes 2018-09-05 13:33:36 UTC

---

qa_ack+: on large cluster deployments (i.e. 16 nodes and 100+ resources), starting the cluster MUST NOT result in timeout errors/warnings in logs or excessive CPU usage of the stonithd process.

---

Bumping to 7.8 due to time constraints.

---

Fixed upstream by commits 65170ff~1..3401f25 (master branch, to land in RHEL 8.1) and eee76118~1..428a9c8 (1.1 branch, to land in RHEL 7.8).

---

*** Bug 1789191 has been marked as a duplicate of this bug. ***

---

We are encountering this on a production system. For my tracking purposes, our local ticket is TOSS-4717.

---

(In reply to Olaf Faaland from comment #13)
> We are encountering this on a production system.
> For my tracking purposes, our local ticket is TOSS-4717

That is, a production RHEL 7.7 system.

---

environment: 16-node VM (single CPU core, 2 GB memory) cluster with 50 cloned GFS2 resources + DLM/CLVMD clones and fencing (automated test case "LargeGfs2Deployment")

before (1.1.20-5.el7)
=====================

Examining the DC node while running the test case -- high system load, and the stonithd process hogs the CPU:

> [root@virt-196 ~]# uptime
>  15:04:42 up 1:48, 1 user, load average: 2.33, 1.57, 1.46
> [root@virt-196 ~]# ps faux
> [...]
> root      2702  0.0  0.4 132976  8648 ?  Ss  13:16  0:00 /usr/sbin/pacemakerd -f
> haclust+  2717  1.7  2.1 162748 44364 ?  Ss  13:16  1:54  \_ /usr/libexec/pacemaker/cib
> root      2718 48.8  1.7 162456 35768 ?  Rs  13:16 52:38  \_ /usr/libexec/pacemaker/stonithd
> root      2719  0.0  0.2 100888  5264 ?  Ss  13:16  0:01  \_ /usr/libexec/pacemaker/lrmd
> haclust+  2720  0.0  0.5 130172 10680 ?  Ss  13:16  0:03  \_ /usr/libexec/pacemaker/attrd
> haclust+  2721  0.1  4.8 197276 100976 ? Ss  13:16  0:09  \_ /usr/libexec/pacemaker/pengine
> haclust+  2723  0.1  4.6 257444 96432 ?  Ss  13:16  0:12  \_ /usr/libexec/pacemaker/crmd
> [...]
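The pattern above (stonithd pinned at ~49% CPU while hundreds of clone instances report status) matches the root cause named in the summary: every lrmd status update written to the CIB triggers a full fencing-device rescan in stonith-ng. The sketch below is purely illustrative, not Pacemaker source -- the function names only echo the log messages, and the "filter status-only updates" behavior is an assumption about what the fix commits achieve, inferred from the log pattern:

```python
# Hypothetical model of the before/after behavior (NOT Pacemaker code).

def full_device_rescan(cib_resources):
    """Stand-in for a cib_devices_update-style pass over every resource."""
    return [r for r in cib_resources if r.startswith("stonith:")]

def handle_cib_diff(diff_path, cib_resources, *, filter_irrelevant):
    """Stand-in for reacting to one CIB diff notification.

    With filter_irrelevant=True, diffs that only touch status-section
    entries (lrm_resource / lrm_rsc_op) are skipped entirely; without
    it, every monitor result from every clone instance on every node
    triggers a full rescan, which is what melts down a 16-node,
    864-resource cluster.
    """
    if filter_irrelevant and "lrm_" in diff_path:
        return None  # status-only change: cannot affect device config
    return full_device_rescan(cib_resources)

resources = ["stonith:fence_xvm"] * 16 + ["ocf:Filesystem"] * 848
status_diffs = ["lrm_rsc_op[@id='dlm_monitor_30000']"] * 1000

rescans_before = sum(
    handle_cib_diff(d, resources, filter_irrelevant=False) is not None
    for d in status_diffs)
rescans_after = sum(
    handle_cib_diff(d, resources, filter_irrelevant=True) is not None
    for d in status_diffs)
print(rescans_before, rescans_after)  # prints: 1000 0
```

The point of the model: the per-rescan cost is small, but multiplied by one CIB diff per operation per clone instance per node it dominates the CPU, which is why only large deployments notice.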
The logs show frequent "update_cib_stonith_devices_v2" and "cib_devices_update" events from stonith-ng, which seem to stall the cluster for a few seconds every time:

> Feb 05 15:09:07 [2723] virt-196       crmd: info: match_graph_event: Action clvmd_monitor_30000 (89) confirmed on virt-191 (rc=0)
> Feb 05 15:09:08 [2718] virt-196 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: create lrm_resource[@id='dlm']
> Feb 05 15:09:08 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7540
> Feb 05 15:09:09 [2718] virt-196 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='dlm_monitor_30000']
> Feb 05 15:09:09 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7541
> Feb 05 15:09:09 [2718] virt-196 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='clvmd_last_0']
> Feb 05 15:09:09 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7542
> Feb 05 15:09:10 [2718] virt-196 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='clvmd_last_0']
> Feb 05 15:09:10 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7543
> Feb 05 15:09:11 [2718] virt-196 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: create lrm_resource[@id='clvmd']
> Feb 05 15:09:11 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7544
> Feb 05 15:09:12 [2718] virt-196 stonith-ng: info: update_cib_stonith_devices_v2: Updating device list from the cib: modify lrm_rsc_op[@id='clvmd_monitor_30000']
> Feb 05 15:09:12 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7545

It takes a long time to query cluster status (when it succeeds at all), stonith resource monitor operations keep timing out,
GFS2 resources are very slow to start -- the cluster is unusable:

> [root@virt-196 ~]# pcs status
> Cluster name: STSRHTS4714
>
> WARNINGS:
> Critical: Unable to get stonith-history
>
> Connection to the cluster-daemons terminated
> Reading stonith-history failed
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> [root@virt-196 ~]# pcs status
> Cluster name: STSRHTS4714
> Stack: corosync
> Current DC: virt-196 (version 1.1.20-5.el7-3c4c782f70) - partition with quorum
> Last updated: Wed Feb  5 15:17:02 2020
> Last change: Wed Feb  5 11:44:51 2020 by root via cibadmin on virt-191
>
> 16 nodes configured
> 864 resources configured
>
> Online: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>
> Full list of resources:
>
>  fence-virt-189 (stonith:fence_xvm): Started virt-190 (Monitoring)
>  fence-virt-190 (stonith:fence_xvm): FAILED virt-189
>  fence-virt-191 (stonith:fence_xvm): FAILED virt-191
>  fence-virt-192 (stonith:fence_xvm): FAILED virt-193
>  fence-virt-193 (stonith:fence_xvm): FAILED virt-194
>  fence-virt-194 (stonith:fence_xvm): FAILED virt-203
>  fence-virt-195 (stonith:fence_xvm): FAILED virt-196
>  fence-virt-196 (stonith:fence_xvm): FAILED virt-195
>  fence-virt-197 (stonith:fence_xvm): FAILED virt-202
>  fence-virt-202 (stonith:fence_xvm): FAILED virt-209
>  fence-virt-203 (stonith:fence_xvm): FAILED virt-212
>  fence-virt-204 (stonith:fence_xvm): FAILED virt-197
>  fence-virt-209 (stonith:fence_xvm): FAILED virt-204
>  fence-virt-210 (stonith:fence_xvm): FAILED virt-210
>  fence-virt-211 (stonith:fence_xvm): FAILED virt-211
>  fence-virt-212 (stonith:fence_xvm): FAILED virt-192
>  Clone Set: dlm-clone [dlm]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: clvmd-clone [clvmd]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: vg_shared-clone [vg_shared]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol0-clone [fs-gfs2-shared-lvol0]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol1-clone [fs-gfs2-shared-lvol1]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol2-clone [fs-gfs2-shared-lvol2]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol3-clone [fs-gfs2-shared-lvol3]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol4-clone [fs-gfs2-shared-lvol4]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol5-clone [fs-gfs2-shared-lvol5]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol6-clone [fs-gfs2-shared-lvol6]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol7-clone [fs-gfs2-shared-lvol7]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol8-clone [fs-gfs2-shared-lvol8]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol9-clone [fs-gfs2-shared-lvol9]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol10-clone [fs-gfs2-shared-lvol10]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol11-clone [fs-gfs2-shared-lvol11]
>      Started: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>      Stopped: [ virt-194 virt-195 virt-196 virt-197 virt-202 ]
>  Clone Set: fs-gfs2-shared-lvol12-clone [fs-gfs2-shared-lvol12]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol13-clone [fs-gfs2-shared-lvol13]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol14-clone [fs-gfs2-shared-lvol14]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol15-clone [fs-gfs2-shared-lvol15]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol16-clone [fs-gfs2-shared-lvol16]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol17-clone [fs-gfs2-shared-lvol17]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol18-clone [fs-gfs2-shared-lvol18]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol19-clone [fs-gfs2-shared-lvol19]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol20-clone [fs-gfs2-shared-lvol20]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol21-clone [fs-gfs2-shared-lvol21]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol22-clone [fs-gfs2-shared-lvol22]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol23-clone [fs-gfs2-shared-lvol23]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol24-clone [fs-gfs2-shared-lvol24]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol25-clone [fs-gfs2-shared-lvol25]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol26-clone [fs-gfs2-shared-lvol26]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol27-clone [fs-gfs2-shared-lvol27]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol28-clone [fs-gfs2-shared-lvol28]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol29-clone [fs-gfs2-shared-lvol29]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol30-clone [fs-gfs2-shared-lvol30]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol31-clone [fs-gfs2-shared-lvol31]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol32-clone [fs-gfs2-shared-lvol32]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol33-clone [fs-gfs2-shared-lvol33]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol34-clone [fs-gfs2-shared-lvol34]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol35-clone [fs-gfs2-shared-lvol35]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol36-clone [fs-gfs2-shared-lvol36]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol37-clone [fs-gfs2-shared-lvol37]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol38-clone [fs-gfs2-shared-lvol38]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol39-clone [fs-gfs2-shared-lvol39]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol40-clone [fs-gfs2-shared-lvol40]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol41-clone [fs-gfs2-shared-lvol41]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol42-clone [fs-gfs2-shared-lvol42]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol43-clone [fs-gfs2-shared-lvol43]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol44-clone [fs-gfs2-shared-lvol44]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol45-clone [fs-gfs2-shared-lvol45]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol46-clone [fs-gfs2-shared-lvol46]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol47-clone [fs-gfs2-shared-lvol47]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol48-clone [fs-gfs2-shared-lvol48]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>  Clone Set: fs-gfs2-shared-lvol49-clone [fs-gfs2-shared-lvol49]
>      Stopped: [ virt-189 virt-190 virt-191 virt-192 virt-193 virt-194 virt-195 virt-196 virt-197 virt-202 virt-203 virt-204 virt-209 virt-210 virt-211 virt-212 ]
>
> Failed Resource Actions:
> * fence-virt-210_monitor_60000 on virt-210 'unknown error' (1): call=69, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:22 2020', queued=1ms, exec=253232ms
> * fence-virt-211_monitor_60000 on virt-211 'unknown error' (1): call=337, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:19 2020', queued=0ms, exec=261114ms
> * fence-virt-212_start_0 on virt-211 'unknown error' (1): call=66, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:04:17 2020', queued=0ms, exec=61179ms
> * fence-virt-189_monitor_60000 on virt-190 'unknown error' (1): call=391, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:25 2020', queued=0ms, exec=256420ms
> * fence-virt-191_monitor_60000 on virt-191 'unknown error' (1): call=384, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:10:30 2020', queued=0ms, exec=310208ms
> * fence-virt-191_start_0 on virt-193 'unknown error' (1): call=275, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 14:14:19 2020', queued=0ms, exec=61121ms
> * fence-virt-192_monitor_60000 on virt-193 'unknown error' (1): call=281, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:25 2020', queued=0ms, exec=258757ms
> * fence-virt-196_monitor_60000 on virt-195 'unknown error' (1): call=351, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:21 2020', queued=0ms, exec=260942ms
> * fence-virt-194_monitor_60000 on virt-203 'unknown error' (1): call=399, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:20 2020', queued=0ms, exec=264586ms
> * fence-virt-209_monitor_60000 on virt-204 'unknown error' (1): call=341, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:21 2020', queued=0ms, exec=257685ms
> * fence-virt-202_monitor_60000 on virt-209 'unknown error' (1): call=405, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:19 2020', queued=0ms, exec=266096ms
> * fence-virt-203_monitor_60000 on virt-212 'unknown error' (1): call=351, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:21 2020', queued=0ms, exec=259148ms
> * fence-virt-189_start_0 on virt-189 'unknown error' (1): call=262, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 14:14:19 2020', queued=0ms, exec=61175ms
> * fence-virt-190_monitor_60000 on virt-189 'unknown error' (1): call=268, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:19 2020', queued=0ms, exec=259949ms
> * fence-virt-212_monitor_60000 on virt-192 'unknown error' (1): call=68, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:28 2020', queued=0ms, exec=251397ms
> * fence-virt-193_monitor_60000 on virt-194 'unknown error' (1): call=352, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:10:30 2020', queued=0ms, exec=313497ms
> * fence-virt-189_start_0 on virt-196 'unknown error' (1): call=356, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 13:45:01 2020', queued=0ms, exec=61190ms
> * fence-virt-191_start_0 on virt-196 'unknown error' (1): call=423, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 14:25:24 2020', queued=0ms, exec=61133ms
> * fence-virt-190_start_0 on virt-196 'unknown error' (1): call=371, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 13:48:43 2020', queued=0ms, exec=61213ms
> * fence-virt-195_monitor_60000 on virt-196 'unknown error' (1): call=447, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:20 2020', queued=0ms, exec=263059ms
> * fence-virt-204_monitor_60000 on virt-197 'unknown error' (1): call=496, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:21 2020', queued=0ms, exec=254069ms
> * fence-virt-211_start_0 on virt-202 'unknown error' (1): call=390, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 14:14:19 2020', queued=0ms, exec=61118ms
> * fence-virt-191_start_0 on virt-202 'unknown error' (1): call=350, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 13:48:43 2020', queued=0ms, exec=61144ms
> * fence-virt-197_monitor_60000 on virt-202 'unknown error' (1): call=418, status=Error, exitreason='',
>     last-rc-change='Wed Feb  5 15:11:20 2020', queued=0ms, exec=261480ms
>
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled

result: the large GFS2 cluster on 16 VM nodes is unbearably slow (managed to bring up only 10 clone groups in 3h30m) and unreliable (nodes getting fenced, presumably
due to stonith resource monitor operation timeouts).

after (1.1.21-4.el7)
====================

result: the whole test case finished successfully in under 40 minutes, without any fencing or other issues (test time includes verification of operation -- all resources were started in just 18 min)

marking verified in 1.1.21-4.el7

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1032

---

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
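For clusters that cannot update immediately, one way to check whether a node is hitting this symptom is to count how often stonith-ng logs a device rescan. The helper below is hypothetical (it is not part of Pacemaker or pcs); it only tallies "cib_devices_update" messages per minute from pacemaker log lines in the format quoted earlier:

```python
# Hypothetical diagnostic: rate of stonith-ng device rescans per minute.
import re
from collections import Counter

def rescan_rate(log_lines):
    """Count 'cib_devices_update' log events, keyed by 'Mon DD HH:MM'."""
    per_minute = Counter()
    for line in log_lines:
        if "cib_devices_update" not in line:
            continue
        m = re.match(r"(\w+ \d+ \d+:\d+)", line)
        if m:
            per_minute[m.group(1)] += 1
    return per_minute

sample = [
    "Feb 05 15:09:08 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7540",
    "Feb 05 15:09:09 [2718] virt-196 stonith-ng: info: cib_devices_update: Updating devices to version 0.30.7541",
    "Feb 05 15:09:09 [2718] virt-196 crmd: info: match_graph_event: Action clvmd_monitor_30000 confirmed",
]
print(rescan_rate(sample))  # prints: Counter({'Feb 05 15:09': 2})
```

A sustained rate of one rescan per second or more during startup, combined with a busy stonithd process, matches the behavior reported in this bug.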