Bug 1394418 - fence_compute: nova compute service is not unfenced
Summary: fence_compute: nova compute service is not unfenced
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker   
Version: 7.3
Hardware: Unspecified Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.5
Assignee: Andrew Beekhof
QA Contact: Ofer Blaut
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Keywords: ZStream
Depends On:
Blocks: 1491544
 
Reported: 2016-11-11 22:29 UTC by Marian Krcmarik
Modified: 2018-04-10 15:29 UTC (History)

Fixed In Version: pacemaker-1.1.18-1.el7
Doc Type: Release Note
Doc Text:
Pacemaker correctly implements fencing and unfencing for Pacemaker remote nodes

Previously, Pacemaker did not implement unfencing for Pacemaker remote nodes. As a consequence, Pacemaker remote nodes remained fenced even if a fence device required unfencing. With this update, Pacemaker correctly implements both fencing and unfencing for Pacemaker remote nodes, and the described problem no longer occurs.
Story Points: ---
Clone Of:
Clones: 1491544
Environment:
Last Closed: 2018-04-10 15:28:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHEA-2018:0860  None      None    None     2018-04-10 15:29 UTC

Description Marian Krcmarik 2016-11-11 22:29:36 UTC
Description of problem:
Following a discussion with Andrew, I am filing this bug against the fence_compute stonith device used by Instance HA in OpenStack. The nova-compute service is not unfenced and remains in the forced-down state even though the compute node and the nova-compute service on it are up and running.

The definition of the stonith device:
 Resource: fence-nova (class=stonith type=fence_compute)
  Attributes: auth-url=http://10.0.0.103:5000/v2.0 login=admin passwd=qNJdaqZ7F4CHZVD37EGztCksd tenant-name=admin domain=localdomain record-only=1 no-shared-storage=False action=off
  Meta Attrs: provides=unfencing 
  Operations: monitor interval=60s (fence-nova-monitor-interval-60s)
 Node: compute-0
  Level 1 - my-stonith-xvm-compute-0,fence-nova
 Node: compute-1
  Level 1 - my-stonith-xvm-compute-1,fence-nova

Once a node is fenced, its nova-compute service is marked as down in the OpenStack service list. It is not unmarked when the compute node comes back online and is operational, so the node cannot be used for scheduling instances until the service is manually unmarked as down.
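
To spot the condition described here, one option is to compare the nodes the cluster reports as online with the services nova still has flagged as forced down. The following is a minimal sketch only, not part of fence_compute or Pacemaker. It assumes the service rows have already been fetched from the nova services API as dictionaries with "binary", "host" and "forced_down" keys, and that the set of online hosts comes from the cluster status. The function name and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List


def still_forced_down(services: Iterable[Dict], online_hosts: Iterable[str]) -> List[str]:
    """Return compute hosts that are online in the cluster but whose
    nova-compute service still carries the forced_down flag.

    `services` is assumed to be a list of dicts shaped like rows from the
    nova services listing; `online_hosts` is assumed to come from the
    cluster status. Both inputs are assumptions made for this sketch.
    """
    online = set(online_hosts)
    return sorted(
        svc["host"]
        for svc in services
        if svc.get("binary") == "nova-compute"
        and svc.get("forced_down") is True
        and svc.get("host") in online
    )


# Hypothetical sample data: compute-0 was fenced and has rebooted,
# but nobody cleared the forced_down flag.
sample_services = [
    {"binary": "nova-compute", "host": "compute-0", "forced_down": True},
    {"binary": "nova-compute", "host": "compute-1", "forced_down": False},
    {"binary": "nova-scheduler", "host": "controller-0", "forced_down": False},
]
sample_online = {"compute-0", "compute-1"}

if __name__ == "__main__":
    for host in still_forced_down(sample_services, sample_online):
        print(f"{host}: online in the cluster but nova-compute is still forced down")
```

Any host this returns is in the state reported in this bug. With the novaclient of that era the flag could then be cleared by hand with something like `nova service-force-down --unset <host> nova-compute`. Treat the exact syntax as an assumption and check `nova help service-force-down` first.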

Version-Release number of selected component (if applicable):
$ rpm -qa | grep pacemaker
pacemaker-1.1.15-11.el7_3.2.x86_64
pacemaker-remote-1.1.15-11.el7_3.2.x86_64
fence-agents-compute-4.0.11-47.el7_3.1.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Configure fence_compute stonith device on Openstack cluster
2. Reset compute node
3. Wait for the compute node to be online

Actual results:
nova-compute service marked as down in "nova service-list"

Expected results:
nova-compute service marked as up in "nova service-list"

Additional info:

Comment 1 Andrew Beekhof 2016-11-14 03:41:29 UTC
Have you got some sosreports to go with this?
My installation is still misbehaving

Comment 3 Ken Gaillot 2017-01-16 23:46:50 UTC
This may already be fixed in the relevant agent, but leaving open until confirmed. Not needed for 7.4

Comment 6 Ken Gaillot 2017-09-13 16:28:03 UTC
Unfencing of Pacemaker Remote nodes is fixed in current upstream master branch

Comment 8 Ken Gaillot 2017-10-10 17:06:48 UTC
QA: Test procedure:

1. Configure a cluster with at least one cluster node and one Pacemaker Remote node, using fence_scsi as the fencing device.

2. Cause the Pacemaker Remote node to be fenced.

3. Before the fix, the remote node will not be unfenced; after the fix, it will.

Comment 10 Steven J. Levine 2017-12-07 18:20:20 UTC
Adding a title to doc text description for Release Note format

Comment 13 errata-xmlrpc 2018-04-10 15:28:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860

