Bug 1762432

Summary: [OSP16] fence_compute: IHA is not working any longer due to a change in behaviour of nova's service disabling [rhel-8.1.0.z]
Product: Red Hat Enterprise Linux 8 Reporter: Oneata Mircea Teodor <toneata>
Component: fence-agents Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA QA Contact: pkomarov
Severity: high Docs Contact:
Priority: high    
Version: 8.0 CC: aherr, cfeist, cluster-maint, cluster-qe, dasmith, eglynn, jhakimra, kchamart, lyarwood, michele, oalbrigt, pkomarov, sbauza, sgordon, vromanso
Target Milestone: rc Keywords: ZStream
Target Release: 8.0 Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: fence-agents-4.2.1-30.el8_1.1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1760213 Environment:
Last Closed: 2019-12-05 16:44:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1760213    
Bug Blocks:    

Comment 3 pkomarov 2019-11-05 12:25:32 UTC
Verified.

The IHA process is working as expected:

(undercloud) [stack@undercloud-0 ~]$ ansible computeInstanceHA -b -mshell -a'rpm -q fence-agents-compute'


overcloud-novacomputeiha-0 | CHANGED | rc=0 >>
fence-agents-compute-4.2.1-30.el8_1.1.noarch

overcloud-novacomputeiha-1 | CHANGED | rc=0 >>
fence-agents-compute-4.2.1-30.el8_1.1.noarch
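
For repeat runs, the package check above could be scripted rather than read by eye. The following is only a sketch: the function name all_hosts_have_package is hypothetical and not part of fence-agents or ansible, and it assumes the ad-hoc ansible output keeps the "host | STATUS | rc=N >>" header layout shown above.

```shell
# Hypothetical helper, not part of the original verification.
# Reads the output of
#   ansible computeInstanceHA -b -mshell -a'rpm -q fence-agents-compute'
# on stdin. Succeeds only if at least one host header was seen and every
# host reported exactly the expected package string given as $1.
all_hosts_have_package() {
    awk -v want="$1" '
        # Host header lines look like: "name | CHANGED | rc=0 >>"
        /^[^ ]+ [|] [A-Z]+/ { hosts++ }
        # A result line counts only if it equals the expected NVR string.
        $0 == want          { ok++ }
        END {
            if (hosts > 0 && ok == hosts) exit 0
            exit 1
        }
    '
}
```

Assumed usage: pipe the ansible command into all_hosts_have_package 'fence-agents-compute-4.2.1-30.el8_1.1.noarch' and check the exit status. A host that is unreachable or reports a different version makes the check fail.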

(overcloud) [stack@undercloud-0 ~]$ openstack server show osvm2|grep hypervisor_hostnam
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-novacomputeiha-0.redhat.local    


(overcloud) [stack@undercloud-0 ~]$ ansible overcloud-novacomputeiha-0 -b -mshell -a'virsh list'
[WARNING]: Found both group and host with same name: undercloud

overcloud-novacomputeiha-0 | CHANGED | rc=0 >>
 Id    Name                           State
----------------------------------------------------
 1     instance-0000000b              running

(overcloud) [stack@undercloud-0 ~]$ ansible overcloud-novacomputeiha-0 -b -mshell -a'echo b >/proc/sysrq-trigger'
[WARNING]: Found both group and host with same name: undercloud

overcloud-novacomputeiha-0 | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: Shared connection to 192.168.24.30 closed.",
    "unreachable": true
}


# From the logs:
Nov 05 12:15:58 controller-2 pacemaker-schedulerd[25359] (stage6) 	warning: Scheduling Node overcloud-novacomputeiha-0 for STONITH
Nov 05 12:15:58 controller-2 pacemaker-schedulerd[25359] (native_stop_constraints) 	info: compute-unfence-trigger:0_stop_0 is implicit after overcloud-novacomputeiha-0 is fenced
Nov 05 12:15:58 controller-2 pacemaker-schedulerd[25359] (LogNodeActions) 	notice:  * Fence (reboot) overcloud-novacomputeiha-0 'remote connection is unrecoverable'
Nov 05 12:15:58 controller-2 pacemaker-schedulerd[25359] (LogAction) 	notice:  * Recover    overcloud-novacomputeiha-0           (      

Nov 05 12:16:00 controller-2 pacemaker-fenced    [25356] (call_remote_stonith) 	info: Requesting that 'controller-0' perform op 'overcloud-novacomputeiha-0 off' with 'stonith-fence_ipmilan-5254000cae5a' for pacemaker-controld.25360 (72s)


Nov 05 12:16:05 controller-2 pacemaker-attrd     [25358] (attrd_peer_update) 	info: Setting evacuate[overcloud-novacomputeiha-0.redhat.local]: (null) -> yes from controller-0
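
The three log lines that matter for this verification are the STONITH scheduling, the fence action, and the evacuate attribute being set. A rough filter for pulling just those out of a pacemaker log could look like the sketch below. The function name iha_log_events is hypothetical, and the patterns are taken only from the messages quoted above, so other pacemaker versions may word them differently.

```shell
# Hypothetical helper, not part of pacemaker or fence-agents.
# Reads pacemaker log lines on stdin and prints only the lines for the
# node named in $1 that show it being scheduled for STONITH, fenced,
# or marked for evacuation.
iha_log_events() {
    # First keep lines mentioning the node (fixed string, no regex),
    # then keep the three message types of interest.
    grep -F -- "$1" |
        grep -E 'for STONITH|Fence \(reboot\)|Setting evacuate\['
}
```

Assumed usage: feed it the pacemaker log from a controller, for example iha_log_events overcloud-novacomputeiha-0 < pacemaker.log, where the log file path is a placeholder for wherever the log lives on that system.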

(overcloud) [stack@undercloud-0 ~]$ ansible overcloud-novacomputeiha-0 -b -mshell -a'virsh list'
[WARNING]: Found both group and host with same name: undercloud

overcloud-novacomputeiha-0 | CHANGED | rc=0 >>
 Id    Name                           State
----------------------------------------------------

(overcloud) [stack@undercloud-0 ~]$ ansible overcloud-novacomputeiha-1 -b -mshell -a'virsh list'
[WARNING]: Found both group and host with same name: undercloud

overcloud-novacomputeiha-1 | CHANGED | rc=0 >>
 Id    Name                           State
----------------------------------------------------
 1     instance-0000000b              running

(overcloud) [stack@undercloud-0 ~]$ openstack server show osvm2|grep hypervisor_hostnam
| OS-EXT-SRV-ATTR:hypervisor_hostname | overcloud-novacomputeiha-1.redhat.local
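
The before/after hypervisor comparison above is what confirms the evacuation. A minimal sketch of automating that comparison follows. Both function names are hypothetical, and the parsing assumes the table row layout shown in the openstack output above.

```shell
# Hypothetical helpers, not part of the original verification.

# Reads `openstack server show <server>` table output on stdin and prints
# the value of the OS-EXT-SRV-ATTR:hypervisor_hostname row, trimmed.
hypervisor_from_row() {
    awk -F'|' '
        /OS-EXT-SRV-ATTR:hypervisor_hostname/ {
            # Field 3 is the value column; strip surrounding whitespace.
            gsub(/^[ \t]+|[ \t]+$/, "", $3)
            print $3
        }
    '
}

# Succeeds if both hostnames are non-empty and differ, i.e. the instance
# is now reported on a different hypervisor than before the crash.
instance_was_evacuated() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}
```

Assumed usage: capture before=$(openstack server show osvm2 | hypervisor_from_row) prior to crashing the compute node, capture after the same way once fencing completes, then call instance_was_evacuated "$before" "$after". The client's -f value -c OS-EXT-SRV-ATTR:hypervisor_hostname output options should make the table parsing unnecessary, though that was not what was run here.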

Comment 5 errata-xmlrpc 2019-12-05 16:44:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4112