Bug 2177996
Summary: Need a way to add a scsi fencing device to a cluster without requiring a restart of all cluster resources
Product: Red Hat Enterprise Linux 9
Reporter: wilson.hua
Component: pcs
Assignee: Miroslav Lisik <mlisik>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: urgent
Docs Contact: Steven J. Levine <slevine>
Priority: urgent
Version: 9.1
CC: cfeist, cluster-maint, idevat, jwboyer, kgaillot, mlisik, mmazoure, mpospisi, nhostako, omular, sbradley, slevine, tojeline, vincent.chen1
Target Milestone: rc
Keywords: Regression, Triaged, ZStream
Target Release: 9.3
Flags: wilson.hua: needinfo?, pm-rhel: mirror+
Hardware: Unspecified
OS: Linux
Fixed In Version: pcs-0.11.5-1.el9
Doc Type: Bug Fix
Doc Text:
.`pcs` command to update multipath SCSI devices now works correctly
Due to changes in the Pacemaker CIB file, the `pcs stonith update-scsi-devices` command stopped working as designed, causing an unwanted restart of some cluster resources. With this fix, this command works correctly and updates SCSI devices without requiring a restart of other cluster resources running on the same node.
Last Closed: 2023-11-07 08:23:11 UTC
Type: Bug
Bug Blocks: 2179010, 2180704, 2180705
Deadline: 2023-05-29
Description
wilson.hua
2023-03-14 07:55:43 UTC
Ken, could you look at this?

It does not work with latest pacemaker-2.1.5-7.el9 in RHEL-9.3. The last working version is pacemaker-2.1.4-2.el9 in RHEL-9.1. Is it possible that something has changed in pacemaker-2.1.4-3.el9 regarding digests calculation? (bz1872376)

(In reply to Miroslav Lisik from comment #1)
> Ken, could you look at this?
>
> It does not work with latest pacemaker-2.1.5-7.el9 in RHEL-9.3. The last
> working version is pacemaker-2.1.4-2.el9 in RHEL-9.1. Is it possible
> that something has changed in pacemaker-2.1.4-3.el9 regarding digests
> calculation? (bz1872376)

Not that I know of, and our regression test for it passes. Can you give me the CIB and commands you're using to test?

Here are commands:

export disk1=/dev/disk/by-id/scsi-SLIO-ORG_r91-disk-01_7ad95d75-3cf3-448e-a591-42b9ba690b22
export disk2=/dev/disk/by-id/scsi-SLIO-ORG_r91-disk-02_e9a0c17d-c631-41cf-a135-e3453ce0c501
pcs host auth -u hacluster -p password r91-1 r91-2
pcs cluster setup HACluster r91-1 r91-2 --start --wait
pcs stonith create fence-scsi fence_scsi devices=$disk1 pcmk_host_check=static-list 'pcmk_host_list=r91-1 r91-2' pcmk_reboot_action=off meta provides=unfencing
pcs resource create d1 ocf:pacemaker:Dummy
pcs resource create d2 ocf:pacemaker:Dummy
pcs cluster cib > cib_before.xml
pcs stonith update-scsi-devices fence-scsi add $disk2
pcs cluster cib > cib_after.xml

Logs after executing the 'update-scsi-devices' command follow.
With latest rhel-9.1.z versions, see cib_before.xml, cib_after.xml:

[root@r91-1 ~]# rpm -q pcs pacemaker
pcs-0.11.3-4.el9_1.2.x86_64
pacemaker-2.1.4-5.el9_1.2.x86_64

[root@r91-1 ~]# journalctl -n0 -f
Mar 14 17:36:51 r91-1 pacemaker-fenced[1663]: notice: Added 'fence-scsi' to device list (1 active device)
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Requesting local execution of stop operation for fence-scsi on r91-1
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Result of stop operation for fence-scsi on r91-1: ok
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Requesting local execution of stop operation for d2 on r91-1
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Result of stop operation for d2 on r91-1: ok
Mar 14 17:36:51 r91-1 pacemaker-fenced[1663]: notice: fence-scsi is eligible to fence (on) r91-1: static-list
Mar 14 17:36:51 r91-1 pacemaker-fenced[1663]: notice: fence-scsi is eligible to fence (on) r91-1: static-list
Mar 14 17:36:51 r91-1 pacemaker-fenced[1663]: notice: Operation 'on' [2379] targeting r91-1 using fence-scsi returned 0
Mar 14 17:36:51 r91-1 pacemaker-fenced[1663]: notice: Operation 'on' targeting r91-1 by r91-1 for pacemaker-controld.1631@r91-2: OK (complete)
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: r91-1 was unfenced by r91-1 at the request of pacemaker-controld.1631@r91-2
Mar 14 17:36:51 r91-1 pacemaker-attrd[1665]: notice: Setting #node-unfenced[r91-1]: 1678811397 -> 1678811811
Mar 14 17:36:51 r91-1 pacemaker-attrd[1665]: notice: Setting #digests-all[r91-1]: fence-scsi:fence_scsi:ec9ecba84b274fb35effbf2a47226087, -> fence-scsi:fence_scsi:41f4daa097914fe0b3f6ba8363f28cf9,
Mar 14 17:36:51 r91-1 pacemaker-attrd[1665]: notice: Setting #digests-secure[r91-1]: fence-scsi:fence_scsi:8a7469d06699bb5cdf0da9304affaf6e, -> fence-scsi:fence_scsi:2467afb1330d3c0048abdc371dea6bc3,
Mar 14 17:36:51 r91-1 pacemaker-fenced[1663]: notice: Operation 'on' targeting r91-2 by r91-2 for pacemaker-controld.1631@r91-2: OK (complete)
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: r91-2 was unfenced by r91-2 at the request of pacemaker-controld.1631@r91-2
Mar 14 17:36:51 r91-1 pacemaker-attrd[1665]: notice: Setting #node-unfenced[r91-2]: 1678811397 -> 1678811811
Mar 14 17:36:51 r91-1 pacemaker-attrd[1665]: notice: Setting #digests-all[r91-2]: fence-scsi:fence_scsi:ec9ecba84b274fb35effbf2a47226087, -> fence-scsi:fence_scsi:41f4daa097914fe0b3f6ba8363f28cf9,
Mar 14 17:36:51 r91-1 pacemaker-attrd[1665]: notice: Setting #digests-secure[r91-2]: fence-scsi:fence_scsi:8a7469d06699bb5cdf0da9304affaf6e, -> fence-scsi:fence_scsi:2467afb1330d3c0048abdc371dea6bc3,
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Requesting local execution of start operation for d2 on r91-1
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Result of start operation for d2 on r91-1: ok
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Requesting local execution of start operation for fence-scsi on r91-1
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Result of start operation for fence-scsi on r91-1: ok
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Requesting local execution of monitor operation for fence-scsi on r91-1
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Requesting local execution of monitor operation for d2 on r91-1
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Result of monitor operation for d2 on r91-1: ok
Mar 14 17:36:51 r91-1 pacemaker-controld[1667]: notice: Result of monitor operation for fence-scsi on r91-1: ok

[root@r91-2 ~]# journalctl -n0 -f
Mar 14 17:36:38 r91-2 systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Added 'fence-scsi' to device list (1 active device)
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Unfencing r91-2 (remote): because the definition of fence-scsi changed
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Unfencing r91-1 (remote): because the definition of fence-scsi changed
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: * Fence (on) r91-1
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: * Fence (on) r91-2
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Restart fence-scsi ( r91-1 ) due to required stonith
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Restart d1 ( r91-2 ) due to required stonith
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Restart d2 ( r91-1 ) due to required stonith
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-205.bz2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating stop operation fence-scsi_stop_0 on r91-1
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating stop operation d1_stop_0 locally on r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Requesting local execution of stop operation for d1 on r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating stop operation d2_stop_0 on r91-1
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Result of stop operation for d1 on r91-2: ok
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Requesting fencing (on) of node r91-2
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Client pacemaker-controld.1631 wants to fence (on) r91-2 using any device
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Requesting peer fencing (on) targeting r91-2
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: fence-scsi is eligible to fence (on) r91-2: static-list
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Requesting that r91-2 perform 'on' action targeting r91-2
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: fence-scsi is eligible to fence (on) r91-2: static-list
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Requesting fencing (on) of node r91-1
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Client pacemaker-controld.1631 wants to fence (on) r91-1 using any device
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Requesting peer fencing (on) targeting r91-1
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Requesting that r91-1 perform 'on' action targeting r91-1
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Operation 'on' targeting r91-1 by r91-1 for pacemaker-controld.1631@r91-2: OK (complete)
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Fence operation 10 for r91-1 passed
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: r91-1 was unfenced by r91-1 at the request of pacemaker-controld.1631@r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating start operation fence-scsi_start_0 on r91-1
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating start operation d2_start_0 on r91-1
Mar 14 17:36:51 r91-2 pacemaker-attrd[1629]: notice: Setting #node-unfenced[r91-1]: 1678811397 -> 1678811811
Mar 14 17:36:51 r91-2 pacemaker-attrd[1629]: notice: Setting #digests-all[r91-1]: fence-scsi:fence_scsi:ec9ecba84b274fb35effbf2a47226087, -> fence-scsi:fence_scsi:41f4daa097914fe0b3f6ba8363f28cf9,
Mar 14 17:36:51 r91-2 pacemaker-attrd[1629]: notice: Setting #digests-secure[r91-1]: fence-scsi:fence_scsi:8a7469d06699bb5cdf0da9304affaf6e, -> fence-scsi:fence_scsi:2467afb1330d3c0048abdc371dea6bc3,
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Transition 5 aborted by status-1-.node-unfenced doing modify #node-unfenced=1678811811: Transient attribute change
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Operation 'on' [2266] targeting r91-2 using fence-scsi returned 0
Mar 14 17:36:51 r91-2 pacemaker-fenced[1627]: notice: Operation 'on' targeting r91-2 by r91-2 for pacemaker-controld.1631@r91-2: OK (complete)
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Fence operation 9 for r91-2 passed
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: r91-2 was unfenced by r91-2 at the request of pacemaker-controld.1631@r91-2
Mar 14 17:36:51 r91-2 pacemaker-attrd[1629]: notice: Setting #node-unfenced[r91-2]: 1678811397 -> 1678811811
Mar 14 17:36:51 r91-2 pacemaker-attrd[1629]: notice: Setting #digests-all[r91-2]: fence-scsi:fence_scsi:ec9ecba84b274fb35effbf2a47226087, -> fence-scsi:fence_scsi:41f4daa097914fe0b3f6ba8363f28cf9,
Mar 14 17:36:51 r91-2 pacemaker-attrd[1629]: notice: Setting #digests-secure[r91-2]: fence-scsi:fence_scsi:8a7469d06699bb5cdf0da9304affaf6e, -> fence-scsi:fence_scsi:2467afb1330d3c0048abdc371dea6bc3,
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Transition 5 (Complete=7, Pending=0, Fired=0, Skipped=3, Incomplete=4, Source=/var/lib/pacemaker/pengine/pe-input-205.bz2): Stopped
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Start d1 ( r91-2 )
Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Calculated transition 6, saving inputs in /var/lib/pacemaker/pengine/pe-input-206.bz2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating monitor operation fence-scsi_monitor_60000 on r91-1
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating start operation d1_start_0 locally on r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating monitor operation d2_monitor_10000 on r91-1
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Requesting local execution of start operation for d1 on r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Result of start operation for d1 on r91-2: ok
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Initiating monitor operation d1_monitor_10000 locally on r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Requesting local execution of monitor operation for d1 on r91-2
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Result of monitor operation for d1 on r91-2: ok
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: Transition 6 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-206.bz2): Complete
Mar 14 17:36:51 r91-2 pacemaker-controld[1631]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE

With the last working version, see cib_before2.xml, cib_after2.xml:

[root@r91-1 ~]# rpm -q pcs pacemaker
pcs-0.11.3-4.el9_1.2.x86_64
pacemaker-2.1.4-2.el9.x86_64

[root@r91-1 ~]# journalctl -n0 -f
Mar 14 17:53:30 r91-1 pacemaker-fenced[4320]: notice: Added 'fence-scsi' to device list (1 active device)

[root@r91-2 ~]# journalctl -n0 -f
Mar 14 17:53:30 r91-2 pacemaker-controld[4061]: notice: State transition S_IDLE -> S_POLICY_ENGINE
Mar 14 17:53:30 r91-2 pacemaker-fenced[4057]: notice: Added 'fence-scsi' to device list (1 active device)
Mar 14 17:53:30 r91-2 pacemaker-schedulerd[4060]: notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-215.bz2
Mar 14 17:53:30 r91-2 pacemaker-controld[4061]: notice: Transition 5 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-215.bz2): Complete
Mar 14 17:53:30 r91-2 pacemaker-controld[4061]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE

Created attachment 1950691 [details]
cib file
Created attachment 1950692 [details]
cib_after.xml
Created attachment 1950693 [details]
cib_before2.xml
Created attachment 1950694 [details]
cib_after2.xml
Created attachment 1950695 [details]
cib_before.xml
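The two journal excerpts above differ mainly in whether pacemaker-schedulerd planned any Restart actions after the device update. The following is a minimal Python sketch of how such actions could be pulled out of captured journal lines. The function name `find_restarts` and the regular expression are illustrative assumptions based only on the log format shown in this report; they are not part of pcs or Pacemaker.

```python
import re

# Assumed pattern, derived from scheduler lines in this report such as:
#   "... pacemaker-schedulerd[1630]: notice: Actions: Restart d1 ( r91-2 ) due to required stonith"
RESTART_RE = re.compile(
    r"pacemaker-schedulerd\[\d+\]: notice: Actions: Restart\s+(\S+)\s+\(\s*(\S+)\s*\)"
)


def find_restarts(journal_lines):
    """Return (resource, node) pairs for every scheduled Restart action."""
    restarts = []
    for line in journal_lines:
        match = RESTART_RE.search(line)
        if match:
            restarts.append((match.group(1), match.group(2)))
    return restarts


# Lines copied from the pacemaker-2.1.4-5.el9_1.2 excerpt (restarts happen).
affected = [
    "Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Restart fence-scsi ( r91-1 ) due to required stonith",
    "Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Restart d1 ( r91-2 ) due to required stonith",
    "Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Restart d2 ( r91-1 ) due to required stonith",
    "Mar 14 17:36:51 r91-2 pacemaker-schedulerd[1630]: notice: Actions: Start d1 ( r91-2 )",
]

# Lines copied from the pacemaker-2.1.4-2.el9 excerpt (no restarts).
unaffected = [
    "Mar 14 17:53:30 r91-2 pacemaker-schedulerd[4060]: notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-215.bz2",
    "Mar 14 17:53:30 r91-2 pacemaker-controld[4061]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE",
]

print(find_restarts(affected))
print(find_restarts(unaffected))
```

An empty result for a capture taken around `pcs stonith update-scsi-devices` would suggest that no resource restart was scheduled, though this only reflects the log lines that were captured.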
I need to investigate some more, but I think this is not a new problem but an expansion of a case we neglected with the original bz.

We're changing the digests in the resource history, but there is a separate copy of the digest stored in the "#digests-all" and "#digests-secure" node attributes, used to detect whether a node needs to be re-unfenced after a change in stonith device definition. Before 2.1.4-3, this only had an effect on Pacemaker Remote nodes, which is why we missed it. Since 2.1.4-3, it applies to all nodes. That was an important fix, so I think the only way around this will be for pcs to replace those node attribute values in addition to the digests in the resource history.

The node attribute values look like:

"stonith-fence_compute-fence-nova:fence_compute:ad312d85623cdb0a792e6fbd5e91a820,"

That is a comma-separated list of DEVICE_NAME:AGENT_NAME:DIGEST for each stonith device that has unfenced the node. "#digests-all" uses the all-parameter digest, and "#digests-secure" (which is only relevant to crm_simulate) uses the non-private-parameter digest.

It's possible that's not the only issue, but that's definitely part of it.

Dear Team,

May I know the current progress of this issue? Thank you!

Hello Wilson Hua,

The engineering team is currently working on the fix for this issue. Pcs packages incorporating the fix are expected to be built and passed to QA for testing during next week.

Regards,
Tomas Jelinek

Ken:

Is there a third digest attribute which uses the nonreloadable-parameter digest, or are there only two of them, "#digests-all" and "#digests-secure"?

(In reply to Miroslav Lisik from comment #20)
> Ken:
>
> Is there a third digest attribute which uses nonreloadable-parameter digest
> or there are only two of them - "#digests-all" and "#digests-secure"?
Only two, because this solely determines whether to unfence the node, not whether to restart or reload the device.

Upstream commit: https://github.com/ClusterLabs/pcs/commit/b18ba53144b7d2d5e435eab369cc1f2c0680a85f

Updated commands:
* pcs stonith update-scsi-devices

Test:
Set up a cluster with shared storage and fence_scsi fencing.
Set up enough resources so that each node is running some resource.
Use the `pcs stonith update-scsi-devices` command to modify the SCSI devices of the fence_scsi stonith device.
Check that resources did not restart (journalctl, crm_resource --list-operations).

Just double confirm, this issue will be fixed in RHEL 9.3, right?
When it is fixed, will there be any patch for RHEL 9.1, 9.2 and 8.8? As we know, the issue can be reproduced on those versions too

(In reply to wilson.hua from comment #24)
> Just double confirm, this issue will be fixed in RHEL 9.3, right?
> When it is fixed, will there be any patch for RHEL 9.1, 9.2 and 8.8? As we know, the issue can be reproduced on those versions too

This issue will be fixed in 9.3, 9.2, 8.9 and 8.8. It won't be fixed in 9.1 and 8.7 as those releases approach their End of Life.

DevTestResults:

[root@r09-03-a ~]# rpm -q pcs
pcs-0.11.5-1.el9.x86_64

(pcs) [root@r09-03-a pcs]# pcs_test/suite --installed --traditional-verbose pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi
...
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_1_monitor_with_timeout (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_1_monitor_with_timeout ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_1_nonrecurring_start_op_with_timeout (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_1_nonrecurring_start_op_with_timeout ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_2_monitor_ops_with_one_timeout (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_2_monitor_ops_with_one_timeout ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_default_monitor (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_default_monitor ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_2_monitor_ops_with_timeouts (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_2_monitor_ops_with_timeouts ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_no_monitor_ops (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_no_monitor_ops ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_all_nodes_multi_value (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_all_nodes_multi_value ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_all_nodes (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_all_nodes ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_digests_with_empty_value (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_digests_with_empty_value ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_not_all_digest_types (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_not_all_digest_types ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_no_digest_for_our_stonith_id (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_no_digest_for_our_stonith_id ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_without_digests_attrs (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_without_digests_attrs ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_not_on_all_nodes (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_not_on_all_nodes ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_not_on_all_nodes_multi_value (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_not_on_all_nodes_multi_value ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_various_start_ops_one_lrm_start_op (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_various_start_ops_one_lrm_start_op ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_without_last_comma (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_without_last_comma ... OK
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_without_last_comma_multi_value (subunit.RemotedTestCase)
pcs_test.tier0.lib.commands.test_stonith_update_scsi_devices.TestUpdateScsiDevicesDigestsSetScsi.test_transient_digests_attrs_without_last_comma_multi_value ... OK
----------------------------------------------------------------------
Ran 17 tests in 0.090s

OK

[root@r90-node-01 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 9.3 Beta (Plow)
[root@r90-node-01 ~]# rpm -q pcs pacemaker
pcs-0.11.5-1.el9.x86_64
pacemaker-2.1.6-1.el9.x86_64

Additional testing on real cluster by mlisik:

[root@r90-node-01 ~]# export disk1=/dev/disk/by-id/scsi-36001405ab8c8a45d1794808a8872f1c2
[root@r90-node-01 ~]# export disk2=/dev/disk/by-id/scsi-36001405c1cf9f31e16e49b6942bf60c7
[root@r90-node-01 ~]# export NODELIST=(r90-node-01 r90-node-02)
[root@r90-node-01 ~]# pcs cluster destroy --all
Warning: It is recommended to run 'pcs cluster stop' before destroying the cluster.
WARNING: This would kill all cluster processes and then PERMANENTLY remove cluster state and configuration
Type 'yes' or 'y' to proceed, anything else to cancel: y
Warning: Unable to load CIB to get guest and remote nodes from it, those nodes will not be deconfigured.
r90-node-01: Stopping Cluster (pacemaker)...
r90-node-02: Stopping Cluster (pacemaker)...
r90-node-01: Successfully destroyed cluster
r90-node-02: Successfully destroyed cluster
[root@r90-node-01 ~]# pcs host auth -u hacluster -p password ${NODELIST[*]}
r90-node-01: Authorized
r90-node-02: Authorized
[root@r90-node-01 ~]# pcs cluster setup HACluster ${NODELIST[*]} --start --wait
No addresses specified for host 'r90-node-01', using 'r90-node-01'
No addresses specified for host 'r90-node-02', using 'r90-node-02'
Destroying cluster on hosts: 'r90-node-01', 'r90-node-02'...
r90-node-01: Successfully destroyed cluster
r90-node-02: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'r90-node-01', 'r90-node-02'
r90-node-01: successful removal of the file 'pcsd settings'
r90-node-02: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'r90-node-01', 'r90-node-02'
r90-node-01: successful distribution of the file 'corosync authkey'
r90-node-01: successful distribution of the file 'pacemaker authkey'
r90-node-02: successful distribution of the file 'corosync authkey'
r90-node-02: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'r90-node-01', 'r90-node-02'
r90-node-01: successful distribution of the file 'corosync.conf'
r90-node-02: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
Starting cluster on hosts: 'r90-node-01', 'r90-node-02'...
Waiting for node(s) to start: 'r90-node-01', 'r90-node-02'...
r90-node-02: Cluster started
r90-node-01: Cluster started
[root@r90-node-01 ~]# pcs stonith create fence-scsi fence_scsi devices=$disk1 pcmk_host_check=static-list pcmk_host_list="${NODELIST[*]}" pcmk_reboot_action=off meta provides=unfencing
[root@r90-node-01 ~]# for i in $(seq 1 ${#NODELIST[@]}); do pcs resource create "d$i" ocf:pacemaker:Dummy; done
[root@r90-node-01 ~]# pcs resource
  * d1 (ocf:pacemaker:Dummy): Started r90-node-02
  * d2 (ocf:pacemaker:Dummy): Started r90-node-01
[root@r90-node-01 ~]# pcs stonith
  * fence-scsi (stonith:fence_scsi): Started r90-node-01
[root@r90-node-01 ~]# pcs stonith config
Resource: fence-scsi (class=stonith type=fence_scsi)
  Attributes: fence-scsi-instance_attributes
    devices=/dev/disk/by-id/scsi-36001405ab8c8a45d1794808a8872f1c2
    pcmk_host_check=static-list
    pcmk_host_list="r90-node-01 r90-node-02"
    pcmk_reboot_action=off
  Meta Attributes: fence-scsi-meta_attributes
    provides=unfencing
  Operations:
    monitor: fence-scsi-monitor-interval-60s
      interval=60s

[root@r90-node-01 ~]# for r in fence-scsi d1 d2; do crm_resource --resource $r --list-operations; done |& tee o1.txt
fence-scsi (stonith:fence_scsi): Started: fence-scsi_start_0 (node=r90-node-01, call=6, rc=0, last-rc-change='Thu May 25 13:44:21 2023', exec=64ms): complete
fence-scsi (stonith:fence_scsi): Started: fence-scsi_monitor_60000 (node=r90-node-01, call=7, rc=0, last-rc-change='Thu May 25 13:44:21 2023', exec=66ms): complete
fence-scsi (stonith:fence_scsi): Started: fence-scsi_monitor_0 (node=r90-node-02, call=5, rc=7, last-rc-change='Thu May 25 13:44:21 2023', exec=3ms): complete
d1 (ocf:pacemaker:Dummy): Started: d1_monitor_0 (node=r90-node-01, call=11, rc=7, last-rc-change='Thu May 25 13:44:56 2023', exec=29ms): complete
d1 (ocf:pacemaker:Dummy): Started: d1_start_0 (node=r90-node-02, call=10, rc=0, last-rc-change='Thu May 25 13:44:56 2023', exec=16ms): complete
d1 (ocf:pacemaker:Dummy): Started: d1_monitor_10000 (node=r90-node-02, call=11, rc=0, last-rc-change='Thu May 25 13:44:56 2023', exec=13ms): complete
d2 (ocf:pacemaker:Dummy): Started: d2_start_0 (node=r90-node-01, call=16, rc=0, last-rc-change='Thu May 25 13:44:57 2023', exec=13ms): complete
d2 (ocf:pacemaker:Dummy): Started: d2_monitor_10000 (node=r90-node-01, call=17, rc=0, last-rc-change='Thu May 25 13:44:57 2023', exec=11ms): complete
d2 (ocf:pacemaker:Dummy): Started: d2_monitor_0 (node=r90-node-02, call=15, rc=7, last-rc-change='Thu May 25 13:44:57 2023', exec=17ms): complete

[root@r90-node-01 ~]# pcs stonith update-scsi-devices fence-scsi add $disk2

[root@r90-node-01 ~]# for r in fence-scsi d1 d2; do crm_resource --resource $r --list-operations; done |& tee o2.txt
fence-scsi (stonith:fence_scsi): Started: fence-scsi_start_0 (node=r90-node-01, call=6, rc=0, last-rc-change='Thu May 25 13:44:21 2023', exec=64ms): complete
fence-scsi (stonith:fence_scsi): Started: fence-scsi_monitor_60000 (node=r90-node-01, call=7, rc=0, last-rc-change='Thu May 25 13:44:21 2023', exec=66ms): complete
fence-scsi (stonith:fence_scsi): Started: fence-scsi_monitor_0 (node=r90-node-02, call=5, rc=7, last-rc-change='Thu May 25 13:44:21 2023', exec=3ms): complete
d1 (ocf:pacemaker:Dummy): Started: d1_monitor_0 (node=r90-node-01, call=11, rc=7, last-rc-change='Thu May 25 13:44:56 2023', exec=29ms): complete
d1 (ocf:pacemaker:Dummy): Started: d1_start_0 (node=r90-node-02, call=10, rc=0, last-rc-change='Thu May 25 13:44:56 2023', exec=16ms): complete
d1 (ocf:pacemaker:Dummy): Started: d1_monitor_10000 (node=r90-node-02, call=11, rc=0, last-rc-change='Thu May 25 13:44:56 2023', exec=13ms): complete
d2 (ocf:pacemaker:Dummy): Started: d2_start_0 (node=r90-node-01, call=16, rc=0, last-rc-change='Thu May 25 13:44:57 2023', exec=13ms): complete
d2 (ocf:pacemaker:Dummy): Started: d2_monitor_10000 (node=r90-node-01, call=17, rc=0, last-rc-change='Thu May 25 13:44:57 2023', exec=11ms): complete
d2 (ocf:pacemaker:Dummy): Started: d2_monitor_0 (node=r90-node-02, call=15, rc=7, last-rc-change='Thu May 25 13:44:57 2023', exec=17ms): complete

[root@r90-node-01 ~]# diff -u o1.txt o2.txt
[root@r90-node-01 ~]# echo $?
0

[root@r90-node-01 ~]# journalctl -n 0 -f
May 25 13:48:02 r90-node-01 pacemaker-controld[6150]: notice: State transition S_IDLE -> S_POLICY_ENGINE
May 25 13:48:02 r90-node-01 pacemaker-fenced[6146]: notice: Added 'fence-scsi' to device list (1 active device)
May 25 13:48:02 r90-node-01 pacemaker-schedulerd[6149]: notice: Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-2839.bz2
May 25 13:48:02 r90-node-01 pacemaker-controld[6150]: notice: Transition 5 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-2839.bz2): Complete
May 25 13:48:02 r90-node-01 pacemaker-controld[6150]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE

[root@r90-node-02 ~]# journalctl -n 0 -f
May 25 13:48:02 r90-node-02 pacemaker-fenced[100644]: notice: Added 'fence-scsi' to device list (1 active device)

[root@r90-node-01 ~]# pcs stonith config
Resource: fence-scsi (class=stonith type=fence_scsi)
  Attributes: fence-scsi-instance_attributes
    devices=/dev/disk/by-id/scsi-36001405ab8c8a45d1794808a8872f1c2,/dev/disk/by-id/scsi-36001405c1cf9f31e16e49b6942bf60c7
    pcmk_host_check=static-list
    pcmk_host_list="r90-node-01 r90-node-02"
    pcmk_reboot_action=off
  Meta Attributes: fence-scsi-meta_attributes
    provides=unfencing
  Operations:
    monitor: fence-scsi-monitor-interval-60s
      interval=60s

RESULT: SCSI device was added to the stonith configuration and resources were not restarted.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Low: pcs security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6316
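
For reference, the node attribute format discussed earlier in this report (a comma-separated list of DEVICE_NAME:AGENT_NAME:DIGEST entries in "#digests-all" and "#digests-secure") can be illustrated with a short Python sketch. This is not the pcs implementation. The helper name `replace_digest` is hypothetical, and always writing a trailing comma is an assumption taken from the values seen in the logs above.

```python
def replace_digest(attr_value, device_id, agent, new_digest):
    """Return attr_value with the digest of one stonith device replaced.

    attr_value is assumed to be a comma-separated list of
    DEVICE_NAME:AGENT_NAME:DIGEST entries, as described in this report.
    Entries for other devices are left untouched.
    """
    updated = []
    for entry in attr_value.split(","):
        if not entry:
            continue  # skip the empty piece produced by a trailing comma
        parts = entry.rsplit(":", 2)
        if len(parts) == 3 and parts[0] == device_id and parts[1] == agent:
            parts[2] = new_digest
        updated.append(":".join(parts))
    # Assumption: the value is written back with a trailing comma per entry.
    return "".join(item + "," for item in updated)


# Values taken from the journal excerpts and the example quoted in this report.
old_value = (
    "stonith-fence_compute-fence-nova:fence_compute:ad312d85623cdb0a792e6fbd5e91a820,"
    "fence-scsi:fence_scsi:ec9ecba84b274fb35effbf2a47226087,"
)
new_value = replace_digest(
    old_value, "fence-scsi", "fence_scsi", "41f4daa097914fe0b3f6ba8363f28cf9"
)
print(new_value)
```

Only the entry for the named device changes; in the discussion above, the goal of keeping these attributes in sync with the new device definition is to avoid the scheduler deciding that nodes must be unfenced again.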