Description of problem:
Following a discussion with Andrew, I am filing this bug against the fence_compute stonith device used for Instance HA in OpenStack: the nova-compute service is not unfenced and remains in the forced-down state even though the compute node and the nova-compute service on that node are up and running.
The definition of the stonith device:
Resource: fence-nova (class=stonith type=fence_compute)
Attributes: auth-url=http://10.0.0.103:5000/v2.0 login=admin passwd=qNJdaqZ7F4CHZVD37EGztCksd tenant-name=admin domain=localdomain record-only=1 no-shared-storage=False action=off
Meta Attrs: provides=unfencing
Operations: monitor interval=60s (fence-nova-monitor-interval-60s)
Level 1 - my-stonith-xvm-compute-0,fence-nova
Level 1 - my-stonith-xvm-compute-1,fence-nova
Once a node is fenced, the nova-compute service is marked as down in the OpenStack service list, but it is not unmarked when the compute node is back online and operational. The compute node then cannot be used for scheduling instances until it is manually unmarked as down.
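The manual workaround mentioned above can be sketched as follows. This is a minimal illustration, not part of the fence agent: it only builds the `nova service-force-down --unset` command line for a given host, assuming a python-novaclient release that accepts the `<hostname> <binary>` form of that command (newer API microversions take a service ID instead). The helper names `unset_forced_down_cmd` and `as_shell`, and the host name in the usage note, are hypothetical.

```python
import shlex


def unset_forced_down_cmd(host, binary="nova-compute"):
    """Build the CLI call that clears the forced-down flag for one service.

    Assumes the `<hostname> <binary>` argument form of
    `nova service-force-down`; nothing is executed here.
    """
    if not host:
        raise ValueError("host must be a non-empty string")
    return ["nova", "service-force-down", "--unset", host, binary]


def as_shell(argv):
    """Render an argv list as a copy-pasteable, safely quoted shell line."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

For a hypothetical host `compute-0.localdomain`, `as_shell(unset_forced_down_cmd("compute-0.localdomain"))` yields `nova service-force-down --unset compute-0.localdomain nova-compute`, which would then be run with admin credentials sourced.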
Version-Release number of selected component (if applicable):
$ rpm -qa | grep pacemaker
Steps to Reproduce:
1. Configure fence_compute stonith device on OpenStack cluster
2. Reset compute node
3. Wait for the compute node to be online
Actual results:
nova-compute service marked as down in "nova service-list"
Expected results:
nova-compute service marked as up in "nova service-list"
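The stuck state described in this report can be checked mechanically. The sketch below assumes service records shaped like the entries returned by the Nova `os-services` API at microversion 2.11 or later (keys `binary`, `host`, `state`, `forced_down`); the function name, the `online_hosts` argument, and the sample data are made up for illustration and are not taken from the affected cluster.

```python
def stuck_forced_down(services, online_hosts):
    """Return hosts that are online but whose nova-compute is still forced down."""
    stuck = []
    for svc in services:
        # Only the compute service is affected by fence_compute.
        if svc.get("binary") != "nova-compute":
            continue
        if svc.get("forced_down") and svc.get("host") in online_hosts:
            stuck.append(svc["host"])
    return sorted(stuck)


# Hypothetical sample data for illustration only.
sample = [
    {"binary": "nova-compute", "host": "compute-0", "state": "down", "forced_down": True},
    {"binary": "nova-compute", "host": "compute-1", "state": "up", "forced_down": False},
    {"binary": "nova-scheduler", "host": "controller-0", "state": "up", "forced_down": False},
]
print(stuck_forced_down(sample, {"compute-0", "compute-1"}))  # prints ['compute-0']
```

Which hosts count as online has to come from outside Nova (for example, from the cluster's own node status), since a forced-down service is reported as down regardless of its heartbeat.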
Have you got some sosreports to go with this?
My installation is still misbehaving
This may already be fixed in the relevant agent, but leaving open until confirmed. Not needed for 7.4.
Unfencing of Pacemaker Remote nodes is fixed in the current upstream master branch.
QA: Test procedure:
1. Configure a cluster with at least one cluster node and one Pacemaker Remote node, using fence_scsi as the fencing device.
2. Cause the Pacemaker Remote node to be fenced.
3. Before the fix, the remote node will not be unfenced; after the fix, it will.
Adding a title to doc text description for Release Note format
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.