Bug 1394418 - fence_compute: nova compute service is not unfenced
Summary: fence_compute: nova compute service is not unfenced
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: urgent
Target Milestone: rc
Target Release: 7.5
Assignee: Andrew Beekhof
QA Contact: Ofer Blaut
Docs Contact: Steven J. Levine
URL:
Whiteboard:
Depends On:
Blocks: 1491544
 
Reported: 2016-11-11 22:29 UTC by Marian Krcmarik
Modified: 2018-04-10 15:29 UTC
CC: 7 users

Fixed In Version: pacemaker-1.1.18-1.el7
Doc Type: Release Note
Doc Text:
Pacemaker correctly implements fencing and unfencing for Pacemaker remote nodes
Previously, Pacemaker did not implement unfencing for Pacemaker remote nodes. As a consequence, Pacemaker remote nodes remained fenced even if a fence device required unfencing. With this update, Pacemaker correctly implements both fencing and unfencing for Pacemaker remote nodes, and the described problem no longer occurs.
Clone Of:
: 1491544 (view as bug list)
Environment:
Last Closed: 2018-04-10 15:28:37 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2018:0860 0 None None None 2018-04-10 15:29:53 UTC

Description Marian Krcmarik 2016-11-11 22:29:36 UTC
Description of problem:
Based on a discussion with Andrew, I am creating a bug report related to the fence_compute stonith device used within Instance HA for OpenStack: the nova-compute service is not unfenced and remains in forced-down status even though the compute node and the nova-compute service on that node are up and running.

The definition of the stonith device:
 Resource: fence-nova (class=stonith type=fence_compute)
  Attributes: auth-url=http://10.0.0.103:5000/v2.0 login=admin passwd=qNJdaqZ7F4CHZVD37EGztCksd tenant-name=admin domain=localdomain record-only=1 no-shared-storage=False action=off
  Meta Attrs: provides=unfencing 
  Operations: monitor interval=60s (fence-nova-monitor-interval-60s)
 Node: compute-0
  Level 1 - my-stonith-xvm-compute-0,fence-nova
 Node: compute-1
  Level 1 - my-stonith-xvm-compute-1,fence-nova
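
For reference, a configuration like the one above would typically be created with pcs. This is a hedged sketch only: the device and node names are taken from the output above, the password is redacted, and exact option names may vary with the pcs and fence-agents versions in use.

```shell
# Create the fence_compute stonith device in record-only mode,
# advertising that it provides unfencing (values mirror the definition above).
pcs stonith create fence-nova fence_compute \
    auth-url=http://10.0.0.103:5000/v2.0 login=admin passwd=<admin-password> \
    tenant-name=admin domain=localdomain record-only=1 \
    no-shared-storage=False action=off \
    meta provides=unfencing

# Add fence-nova to each compute node's fencing topology,
# after the per-node fence_xvm device.
pcs stonith level add 1 compute-0 my-stonith-xvm-compute-0,fence-nova
pcs stonith level add 1 compute-1 my-stonith-xvm-compute-1,fence-nova
```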

Once a node is fenced, the nova-compute service is marked as down in the OpenStack service list, but it is not unmarked when the compute node is back online and operational. The compute node then cannot be used for scheduling instances unless the service is manually unmarked as down.
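
As a stopgap, the forced-down flag can be cleared by hand. A sketch using python-novaclient; the host name is an example, and the --unset flag assumes a novaclient new enough to support compute API microversion 2.11:

```shell
# Check which compute services nova considers down.
nova service-list

# Manually clear the forced-down flag that fence_compute set for the host.
nova service-force-down --unset compute-0 nova-compute
```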

Version-Release number of selected component (if applicable):
$ rpm -qa | grep pacemaker
pacemaker-1.1.15-11.el7_3.2.x86_64
pacemaker-remote-1.1.15-11.el7_3.2.x86_64
fence-agents-compute-4.0.11-47.el7_3.1.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Configure fence_compute stonith device on Openstack cluster
2. Reset compute node
3. Wait for the compute node to be online

Actual results:
nova-compute service marked as down in "nova service-list"

Expected results:
nova-compute service marked as up in "nova service-list"
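
The actual-versus-expected check above can be sketched as follows (the host name is an example):

```shell
# After the fenced compute node is back online and pacemaker shows it healthy:
nova service-list
# Actual (bug): nova-compute on compute-0 still shows State "down".
# Expected:     nova-compute on compute-0 shows State "up".
```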

Additional info:

Comment 1 Andrew Beekhof 2016-11-14 03:41:29 UTC
Have you got some sosreports to go with this?
My installation is still misbehaving

Comment 3 Ken Gaillot 2017-01-16 23:46:50 UTC
This may already be fixed in the relevant agent, but leaving open until confirmed. Not needed for 7.4

Comment 6 Ken Gaillot 2017-09-13 16:28:03 UTC
Unfencing of Pacemaker Remote nodes is fixed in the current upstream master branch.

Comment 8 Ken Gaillot 2017-10-10 17:06:48 UTC
QA: Test procedure:

1. Configure a cluster with at least one cluster node and one Pacemaker Remote node, using fence_scsi as the fencing device.

2. Cause the Pacemaker Remote node to be fenced.

3. Before the fix, the remote node will not be unfenced; after the fix, it will.
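
The procedure above can be sketched with the following commands. Node and device names are placeholders, and the verification step assumes fence_scsi, where unfencing re-registers the node's persistent reservation key on the shared device:

```shell
# 2. Fence the Pacemaker Remote node from a cluster node.
pcs stonith fence remote-1

# 3. After the remote node recovers, check cluster state and whether the
#    node was unfenced: its fence_scsi key should be registered again.
pcs status
sg_persist --in --read-keys --device=/dev/sdb
```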

Comment 10 Steven J. Levine 2017-12-07 18:20:20 UTC
Adding a title to the doc text description for the Release Note format.

Comment 13 errata-xmlrpc 2018-04-10 15:28:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0860

