Bug 2003066

Summary: update-scsi-devices command unfence a node without quorum [rhel-8.6.0]
Product: Red Hat Enterprise Linux 8 Reporter: RHEL Program Management Team <pgm-rhel-tools>
Component: pcs Assignee: Miroslav Lisik <mlisik>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 8.5 CC: cluster-maint, idevat, kmalyjur, lmiksik, mlisik, mpospisi, nhostako, omular, sbradley, tojeline
Target Milestone: rc Keywords: Triaged
Target Release: 8.6   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.10.11-1.el8 Doc Type: Bug Fix
Doc Text:
This bug was found during rhel-8.5 development/testing phase and was fixed there. Packages with the bug have never been released. This bz ensures that the bug is fixed in rhel-8.6. There is nothing to document for this bz.
Story Points: ---
Clone Of: 1991654 Environment:
Last Closed: 2022-05-10 14:50:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1991654    
Bug Blocks:    
Attachments:
Description Flags
proposed fix + tests none

Comment 1 Miroslav Lisik 2021-09-24 14:23:21 UTC
Created attachment 1825908 [details]
proposed fix + tests

Updated command:
* pcs stonith update-scsi-devices

Test:
* set up a cluster with a fence_scsi stonith resource
* set up resources running on each node
* block corosync traffic on one cluster node and wait until the node is fenced
* add scsi devices by using command `pcs stonith update-scsi-devices add` or `pcs stonith update-scsi-devices set`
* check the result: devices should be unfenced only on nodes which are not fenced, and resources should not be restarted.

Comment 4 Miroslav Lisik 2021-11-02 09:23:28 UTC
DevTestResults:

[root@r8-node-01 ~]# rpm -q pcs
pcs-0.10.11-1.el8.x86_64

Environment: Cluster with a fence_scsi stonith resource and resources running on each node.

[root@r8-node-01 ~]# pcs stonith
  * fence-scsi  (stonith:fence_scsi):    Started r8-node-01
[root@r8-node-01 ~]# pcs resource
  * d-01        (ocf::pacemaker:Dummy):  Started r8-node-02
  * d-02        (ocf::pacemaker:Dummy):  Started r8-node-03
  * d-03        (ocf::pacemaker:Dummy):  Started r8-node-01
  * d-04        (ocf::pacemaker:Dummy):  Started r8-node-02
  * d-05        (ocf::pacemaker:Dummy):  Started r8-node-03
  * d-06        (ocf::pacemaker:Dummy):  Started r8-node-01
[root@r8-node-01 ~]# echo $disk{1..3}
/dev/disk/by-id/scsi-360014052bc36324cf7d4a709a959340b /dev/disk/by-id/scsi-3600140547721f8ee2774aa8bac6d8ebe /dev/disk/by-id/scsi-360014052f8c6f3de01047c29b72040f4
[root@r8-node-01 ~]# for disk in $disk{1..3}; do sg_persist -n -i -k -d $disk; done
  PR generation=0x8, 3 registered reservation keys follow:
    0x14080002
    0x14080001
    0x14080000
  PR generation=0x6, there are NO registered reservation keys
  PR generation=0x0, there are NO registered reservation keys


### Block corosync traffic:

[root@r8-node-03 ~]# iptables -A INPUT ! -i lo -p udp --dport 5404 -j DROP && iptables -A INPUT ! -i lo -p udp --dport 5405 -j DROP && iptables -A OUTPUT ! -o lo -p udp --sport 5404 -j DROP && iptables -A OUTPUT ! -o lo -p udp --sport 5405 -j DROP
[root@r8-node-03 ~]# pcs status nodes
Pacemaker Nodes:
 Online: r8-node-03
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline: r8-node-01 r8-node-02
Pacemaker Remote Nodes:
 Online:
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline:


[root@r8-node-01 ~]# for disk in $disk{1..3}; do sg_persist -n -i -k -d $disk; done
  PR generation=0x9, 2 registered reservation keys follow:
    0x14080001
    0x14080000
  PR generation=0x6, there are NO registered reservation keys
  PR generation=0x0, there are NO registered reservation keys
[root@r8-node-01 ~]# pcs status nodes
Pacemaker Nodes:
 Online: r8-node-01 r8-node-02
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline: r8-node-03
Pacemaker Remote Nodes:
 Online:
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline:


### Add scsi devices

[root@r8-node-01 ~]# pcs stonith update-scsi-devices fence-scsi add $disk2 $disk3
r8-node-03: Unfencing skipped, device '/dev/disk/by-id/scsi-360014052bc36324cf7d4a709a959340b' is fenced
[root@r8-node-01 ~]# echo $?
0

### Check registration keys on the disks

[root@r8-node-01 ~]# for disk in $disk{1..3}; do sg_persist -n -i -k -d $disk; done
  PR generation=0x9, 2 registered reservation keys follow:
    0x14080001
    0x14080000
  PR generation=0x8, 2 registered reservation keys follow:
    0x14080001
    0x14080000
  PR generation=0x2, 2 registered reservation keys follow:
    0x14080001
    0x14080000

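The key of the fenced node (0x14080002, removed when r8-node-03 was fenced) is not registered on any of the disks, including the two newly added ones.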
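The manual check above could be scripted. The following is only a sketch: `has_key` is a hypothetical helper (not part of pcs or sg3_utils) that reads `sg_persist -n -i -k -d <disk>` output on stdin and succeeds if the given key is listed. It assumes 0x14080002 is the key of r8-node-03, since that is the key that disappears from the first disk after the node is fenced. The sample text is copied from the transcript so the snippet runs without SCSI hardware.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if reservation key $1 appears on a line
# of its own in sg_persist key-listing output read from stdin.
has_key() {
    grep -qiE "^[[:space:]]*$1[[:space:]]*\$"
}

# Output for one disk after the update, copied from the transcript above.
sample='  PR generation=0x8, 2 registered reservation keys follow:
    0x14080001
    0x14080000'

# Assumption: 0x14080002 is the key of the fenced node r8-node-03.
if printf '%s\n' "$sample" | has_key 0x14080002; then
    echo "fenced node key still registered"
else
    echo "fenced node key absent"
fi
```

On a live cluster the same helper could be fed directly, for example `sg_persist -n -i -k -d "$disk2" | has_key 0x14080002`, once per disk.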

Comment 10 errata-xmlrpc 2022-05-10 14:50:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pcs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:1978