Bug 1285523 - Update fence-compute to include mark host down and taggable instance support
Summary: Update fence-compute to include mark host down and taggable instance support
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: fence-agents
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Oyvind Albrigtsen
QA Contact: Asaf Hirshberg
URL:
Whiteboard:
Duplicates: 1288312
Depends On:
Blocks: 1185030 1285524 1304329
Reported: 2015-11-25 20:18 UTC by Stephen Gordon
Modified: 2019-11-14 07:10 UTC (History)

Fixed In Version: fence-agents-4.0.11-37.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1285524 1304329
Environment:
Last Closed: 2016-11-04 04:48:05 UTC
Target Upstream Version:
Embargoed:


Attachments
corosync log from controller-0 from the moment of fencing (157.35 KB, text/plain)
2016-06-07 05:51 UTC, Asaf Hirshberg


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:2373 0 normal SHIPPED_LIVE fence-agents bug fix update 2016-11-03 13:51:30 UTC

Description Stephen Gordon 2015-11-25 20:18:46 UTC
Description of problem:

In RHEL OpenStack Platform 8 we wish to include support in the fence compute (Nova) agent for:

- The new OpenStack mark host down API call, where available, for telling Nova the host is down.
- Only evacuating instances that are marked evacuable in their associated image properties or flavor extra specifications.

Initial upstream work has been done:

https://github.com/ClusterLabs/fence-agents/commit/e3d7ccd652edb5d0bd60c210b029ce73c5fa27e9

https://github.com/ClusterLabs/fence-agents/commit/d9f2f483611253ece7fb097a9119de20a22d9111

...but an additional change is required to ensure *only* those instances that were marked evacuable are evacuated. Currently, if *no* instances are found to be marked evacuable, then *all* instances are considered evacuable, which is not the behaviour we want out of the box (for those who do want all instances evacuated, tagging them as such is a simple matter).
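
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the agent's actual code: the instance dictionaries, the key names "image_properties" and "flavor_extra_specs", the helper names, and the `fallback_to_all` switch are all hypothetical; only the `evacuable` marker itself comes from this report.

```python
def is_tagged(metadata, key="evacuable"):
    """Return True if a metadata dict marks its owner as evacuable."""
    return str(metadata.get(key, "")).strip().lower() == "true"


def select_evacuable(instances, fallback_to_all=False):
    """Pick the instances to evacuate from a fenced host.

    Each instance is a plain dict with the hypothetical keys "name",
    "image_properties" and "flavor_extra_specs".
    """
    tagged = [
        inst for inst in instances
        if is_tagged(inst.get("image_properties", {}))
        or is_tagged(inst.get("flavor_extra_specs", {}))
    ]
    if not tagged and fallback_to_all:
        # Behaviour described as current in this report: nothing is
        # tagged, so every instance is treated as evacuable.
        return list(instances)
    # Behaviour requested in this report: only tagged instances,
    # which may be an empty list.
    return tagged


instances = [
    {"name": "vm-tag", "image_properties": {"evacuable": "true"},
     "flavor_extra_specs": {}},
    {"name": "vm-regular", "image_properties": {},
     "flavor_extra_specs": {}},
]
print([inst["name"] for inst in select_evacuable(instances)])
```

With `fallback_to_all=False` an untagged-only host yields an empty list, which is the out-of-the-box behaviour requested above.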

Comment 8 Andrew Beekhof 2016-05-27 00:14:52 UTC
Additional patch: 

   https://github.com/beekhof/fence-agents/commit/06f592e

Comment 9 Andrew Beekhof 2016-05-27 03:03:22 UTC
Merged upstream:

   https://github.com/ClusterLabs/fence-agents/commit/e4599e4

7.2 packages available at:

   http://people.redhat.com/abeekhof/instance-ha/

Comment 10 Andrew Beekhof 2016-06-02 01:53:52 UTC
One last commit (to allow everything to be disabled):

   https://github.com/ClusterLabs/fence-agents/commit/cdbdd93


Marek: Could we get a build with these commits please?

Comment 11 Oyvind Albrigtsen 2016-06-02 11:51:28 UTC
New build with the last commits.

Comment 13 Asaf Hirshberg 2016-06-07 05:48:18 UTC
Failed using the tagged image method: I uploaded 2 cirros images, tagged one as "evacuable=true", then fenced one compute node. The corosync.log from the controller is attached.

[stack@puma33 ~]$ glance image-show ce2c9c9b-223f-4808-b0ce-cec536587b85
+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | 50bdc35edb03a38d91b1b071afb20a3c                                                 |
| container_format | bare                                                                             |
| created_at       | 2016-06-07T05:15:18Z                                                             |
| direct_url       | rbd://3e95f1ce-2c66-11e6-a8b8-009c02b08fc8/images/ce2c9c9b-223f-4808-b0ce-       |
|                  | cec536587b85/snap                                                                |
| disk_format      | qcow2                                                                            |
| evacuable        | true                                                                             |
| id               | ce2c9c9b-223f-4808-b0ce-cec536587b85                                             |
| min_disk         | 0                                                                                |
| min_ram          | 0                                                                                |
| name             | cirros-tag                                                                       |
| owner            | 155fd4c559e842c98f037d9f3257e8c5                                                 |
| protected        | False                                                                            |
| size             | 9761280                                                                          |
| status           | active                                                                           |
| tags             | []                                                                               |
| updated_at       | 2016-06-07T05:16:03Z                                                             |
| virtual_size     | None                                                                             |
| visibility       | private                                                                          |
+------------------+----------------------------------------------------------------------------------+
[stack@puma33 ~]$ glance image-show bbb0b3d8-1dcf-4ab7-bd86-1139f213b7c7
+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | 50bdc35edb03a38d91b1b071afb20a3c                                                 |
| container_format | bare                                                                             |
| created_at       | 2016-06-07T05:15:08Z                                                             |
| direct_url       | rbd://3e95f1ce-2c66-11e6-a8b8-009c02b08fc8/images/bbb0b3d8-1dcf-                 |
|                  | 4ab7-bd86-1139f213b7c7/snap                                                      |
| disk_format      | qcow2                                                                            |
| id               | bbb0b3d8-1dcf-4ab7-bd86-1139f213b7c7                                             |
| min_disk         | 0                                                                                |
| min_ram          | 0                                                                                |
| name             | cirros                                                                           |
| owner            | 155fd4c559e842c98f037d9f3257e8c5                                                 |
| protected        | False                                                                            |
| size             | 9761280                                                                          |
| status           | active                                                                           |
| tags             | []                                                                               |
| updated_at       | 2016-06-07T05:15:11Z                                                             |
| virtual_size     | None                                                                             |
| visibility       | private                                                                          |
+------------------+----------------------------------------------------------------------------------+
[stack@puma33 ~]$ 

from the controller:
[root@overcloud-controller-0 ~]# pcs stonith fence overcloud-novacompute-1
Node: overcloud-novacompute-1 fenced

[stack@puma33 ~]$ nova list --fields name,status,host
+--------------------------------------+------------+--------+-------------------------------------+
| ID                                   | Name       | Status | Host                                |
+--------------------------------------+------------+--------+-------------------------------------+
| 659e9c99-dedb-494b-a4b0-c81680ca2c22 | vm-regular | ACTIVE | overcloud-novacompute-0.localdomain |
| 595da4aa-5489-4eb7-99f2-eda275692423 | vm-tag     | ACTIVE | overcloud-novacompute-0.localdomain |
+--------------------------------------+------------+--------+-------------------------------------+

[stack@puma33 ~]$ nova list --fields name,status,host
+--------------------------------------+------------+--------+-------------------------------------+
| ID                                   | Name       | Status | Host                                |
+--------------------------------------+------------+--------+-------------------------------------+
| 659e9c99-dedb-494b-a4b0-c81680ca2c22 | vm-regular | ACTIVE | overcloud-novacompute-1.localdomain |
| 595da4aa-5489-4eb7-99f2-eda275692423 | vm-tag     | ACTIVE | overcloud-novacompute-1.localdomain |
+--------------------------------------+------------+--------+-------------------------------------+
[stack@puma33 ~]$

Comment 14 Asaf Hirshberg 2016-06-07 05:51:24 UTC
Created attachment 1165487 [details]
corosync log from controller-0 from the moment of fencing

Comment 15 Andrew Beekhof 2016-06-07 07:17:31 UTC
You say 'uploaded 2 cirros images and tagged one as "evacuable=true"' but neither of the glance image-show commands indicates a tag is set.

Both have:

tags             | [] 

This is what it looks like for me:

[stack@undercloud ~]$ glance image-show c99f990c-f05f-41ae-ac8c-934c8fa3e377
+----------------------+--------------------------------------+
| Property             | Value                                |
+----------------------+--------------------------------------+
| Property 'evacuable' | true                                 |


Also, we'd need more than just corosync.log.
Please generate an sos report for all nodes.
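
The two `glance image-show` outputs quoted in this thread show the marker in different shapes: a flat `evacuable` row next to an empty `tags` list, and a row labelled `Property 'evacuable'`. A rough sketch of a check that accepts either shape follows. The helper name `parse_property_table` and the sample strings are made up for illustration, and the parser assumes only the ASCII table layout visible in the comments above.

```python
import re


def parse_property_table(text):
    """Turn the ASCII table printed by `glance image-show` into a dict."""
    props = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0] or cells[0] == "Property":
            continue  # border, header, or wrapped continuation row
        # Normalise "Property 'evacuable'" to plain "evacuable".
        match = re.fullmatch(r"Property '(.+)'", cells[0])
        key = match.group(1) if match else cells[0]
        props[key] = cells[1]
    return props


flat_style = """
+------------------+-------+
| Property         | Value |
+------------------+-------+
| evacuable        | true  |
| tags             | []    |
+------------------+-------+
"""

prefixed_style = """
+----------------------+-------+
| Property             | Value |
+----------------------+-------+
| Property 'evacuable' | true  |
"""

for sample in (flat_style, prefixed_style):
    props = parse_property_table(sample)
    print(props.get("evacuable"), props.get("tags"))
```

Looking up the `evacuable` key rather than the `tags` row avoids reading an empty `tags` list as "not marked".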

Comment 16 Oyvind Albrigtsen 2016-06-07 09:30:10 UTC
New build with fix for indent issue:
https://github.com/ClusterLabs/fence-agents/pull/78

Comment 17 Asaf Hirshberg 2016-06-08 08:18:41 UTC
Andrew, I took the bug again because the version of fence-agents was older than the "Fixed In Version", so my comment 13 is not really relevant.
But just for the record, I did tag it and showed it in comment 13:
[stack@puma33 ~]$ glance image-show ce2c9c9b-223f-4808-b0ce-cec536587b85
+------------------+----------------------------------------------------------------------------------+
| Property         | Value                                                                            |
+------------------+----------------------------------------------------------------------------------+
| checksum         | 50bdc35edb03a38d91b1b071afb20a3c                                                 |
| container_format | bare                                                                             |
| created_at       | 2016-06-07T05:15:18Z                                                             |
| direct_url       | rbd://3e95f1ce-2c66-11e6-a8b8-009c02b08fc8/images/ce2c9c9b-223f-4808-b0ce-       |
|                  | cec536587b85/snap                                                                |
| disk_format      | qcow2                                                                            |
| evacuable        | true  

                
Which of the following packages are needed in order to test the bug? After updating just fence-agents-common and fence-agents-scsi (which was a dependency), the evacuation did not occur after the fencing action:
fence-agents-all-4.0.11-36.el7.x86_64.rpm
fence-agents-amt-ws-4.0.11-36.el7.x86_64.rpm
fence-agents-apc-4.0.11-36.el7.x86_64.rpm
fence-agents-apc-snmp-4.0.11-36.el7.x86_64.rpm
fence-agents-bladecenter-4.0.11-36.el7.x86_64.rpm
fence-agents-brocade-4.0.11-36.el7.x86_64.rpm
fence-agents-cisco-mds-4.0.11-36.el7.x86_64.rpm
fence-agents-cisco-ucs-4.0.11-36.el7.x86_64.rpm
fence-agents-common-4.0.11-36.el7.x86_64.rpm
fence-agents-compute-4.0.11-36.el7.x86_64.rpm
fence-agents-drac5-4.0.11-36.el7.x86_64.rpm
fence-agents-eaton-snmp-4.0.11-36.el7.x86_64.rpm
fence-agents-emerson-4.0.11-36.el7.x86_64.rpm
fence-agents-eps-4.0.11-36.el7.x86_64.rpm
fence-agents-hpblade-4.0.11-36.el7.x86_64.rpm
fence-agents-ibmblade-4.0.11-36.el7.x86_64.rpm
fence-agents-ifmib-4.0.11-36.el7.x86_64.rpm
fence-agents-ilo2-4.0.11-36.el7.x86_64.rpm
fence-agents-ilo-moonshot-4.0.11-36.el7.x86_64.rpm
fence-agents-ilo-mp-4.0.11-36.el7.x86_64.rpm
fence-agents-ilo-ssh-4.0.11-36.el7.x86_64.rpm
fence-agents-intelmodular-4.0.11-36.el7.x86_64.rpm
fence-agents-ipdu-4.0.11-36.el7.x86_64.rpm
fence-agents-ipmilan-4.0.11-36.el7.x86_64.rpm
fence-agents-kdump-4.0.11-36.el7.x86_64.rpm
fence-agents-mpath-4.0.11-36.el7.x86_64.rpm
fence-agents-rhevm-4.0.11-36.el7.x86_64.rpm
fence-agents-rsa-4.0.11-36.el7.x86_64.rpm
fence-agents-rsb-4.0.11-36.el7.x86_64.rpm
fence-agents-scsi-4.0.11-36.el7.x86_64.rpm
fence-agents-virsh-4.0.11-36.el7.x86_64.rpm
fence-agents-vmware-soap-4.0.11-36.el7.x86_64.rpm
fence-agents-wti-4.0.11-36.el7.x86_64.rpm

Comment 18 Andrew Beekhof 2016-06-14 03:06:49 UTC
fence-agents-common-4.0.11-36.el7.x86_64.rpm
fence-agents-compute-4.0.11-36.el7.x86_64.rpm

but it will need the "tag and" fix we talked about last week

Comment 19 Andrew Beekhof 2016-06-14 10:46:27 UTC
Here's the final patch:

    https://github.com/ClusterLabs/fence-agents/commit/90dfc11

Oyvind: Can we get a new build please?

Comment 20 Oyvind Albrigtsen 2016-06-14 11:11:30 UTC
New build with the last patch.

Comment 22 Oyvind Albrigtsen 2016-06-15 11:46:06 UTC
*** Bug 1288312 has been marked as a duplicate of this bug. ***

Comment 23 Asaf Hirshberg 2016-06-16 04:57:50 UTC
Verified on RHEL-OSP director 9.0 puddle - 2016-06-03.1 using fence-agents-4.0.11-37.el7

[stack@puma33 ~]$ nova list --fields name,status,host
+--------------------------------------+---------+--------+-------------------------------------+
| ID                                   | Name    | Status | Host                                |
+--------------------------------------+---------+--------+-------------------------------------+
| 5f5c1db1-e9db-414b-a27e-5f046ea8a5fc | vm-TAG  | ACTIVE | overcloud-novacompute-0.localdomain |
| 0a571f41-c932-4486-a32f-da2ad7e56068 | vm-reg1 | ACTIVE | overcloud-novacompute-0.localdomain |
| 3a9f4957-1fa8-4fc8-a26f-1b5737dd513d | vm-reg2 | ACTIVE | overcloud-novacompute-1.localdomain |
+--------------------------------------+---------+--------+-------------------------------------+

*** Fencing compute-0 
[root@overcloud-controller-1 ~]# pcs stonith fence overcloud-novacompute-0

[stack@puma33 ~]$ nova list --fields name,status,host
+--------------------------------------+---------+--------+-------------------------------------+
| ID                                   | Name    | Status | Host                                |
+--------------------------------------+---------+--------+-------------------------------------+
| 5f5c1db1-e9db-414b-a27e-5f046ea8a5fc | vm-TAG  | ACTIVE | overcloud-novacompute-1.localdomain |
| 0a571f41-c932-4486-a32f-da2ad7e56068 | vm-reg1 | ACTIVE | overcloud-novacompute-0.localdomain |
| 3a9f4957-1fa8-4fc8-a26f-1b5737dd513d | vm-reg2 | ACTIVE | overcloud-novacompute-1.localdomain |
+--------------------------------------+---------+--------+-------------------------------------+
[stack@puma33 ~]$ 

*** Fencing compute-0 again
[root@overcloud-controller-1 ~]# pcs stonith fence overcloud-novacompute-0

+--------------------------------------+---------+--------+-------------------------------------+
| ID                                   | Name    | Status | Host                                |
+--------------------------------------+---------+--------+-------------------------------------+
| 5f5c1db1-e9db-414b-a27e-5f046ea8a5fc | vm-TAG  | ACTIVE | overcloud-novacompute-1.localdomain |
| 0a571f41-c932-4486-a32f-da2ad7e56068 | vm-reg1 | ACTIVE | overcloud-novacompute-0.localdomain |
| 3a9f4957-1fa8-4fc8-a26f-1b5737dd513d | vm-reg2 | ACTIVE | overcloud-novacompute-1.localdomain |
+--------------------------------------+---------+--------+-------------------------------------+

*** Deleting the tagged instance and the tagged image, then fencing compute-0 again
[root@overcloud-controller-1 ~]# pcs stonith fence overcloud-novacompute-0

[stack@puma33 ~]$ nova list --fields name,status,host
+--------------------------------------+---------+--------+-------------------------------------+
| ID                                   | Name    | Status | Host                                |
+--------------------------------------+---------+--------+-------------------------------------+
| 0a571f41-c932-4486-a32f-da2ad7e56068 | vm-reg1 | ACTIVE | overcloud-novacompute-1.localdomain |
| 3a9f4957-1fa8-4fc8-a26f-1b5737dd513d | vm-reg2 | ACTIVE | overcloud-novacompute-1.localdomain |
+--------------------------------------+---------+--------+-------------------------------------+
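
Comparing the before and after tables by eye is error-prone, so a small helper can report which instances changed host. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the parser relies only on the four-column layout of the `nova list --fields name,status,host` output shown in this comment. The sample data is the first fencing of compute-0 from above.

```python
def parse_nova_list(text):
    """Map instance name -> host from a four-column `nova list` table."""
    hosts = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4 or cells[0] == "ID":
            continue  # border or header row
        _id, name, _status, host = cells
        hosts[name] = host
    return hosts


def moved_instances(before, after):
    """Return {name: (old_host, new_host)} for instances whose host changed."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


before = parse_nova_list("""
| ID                                   | Name    | Status | Host                                |
| 5f5c1db1-e9db-414b-a27e-5f046ea8a5fc | vm-TAG  | ACTIVE | overcloud-novacompute-0.localdomain |
| 0a571f41-c932-4486-a32f-da2ad7e56068 | vm-reg1 | ACTIVE | overcloud-novacompute-0.localdomain |
| 3a9f4957-1fa8-4fc8-a26f-1b5737dd513d | vm-reg2 | ACTIVE | overcloud-novacompute-1.localdomain |
""")

after = parse_nova_list("""
| ID                                   | Name    | Status | Host                                |
| 5f5c1db1-e9db-414b-a27e-5f046ea8a5fc | vm-TAG  | ACTIVE | overcloud-novacompute-1.localdomain |
| 0a571f41-c932-4486-a32f-da2ad7e56068 | vm-reg1 | ACTIVE | overcloud-novacompute-0.localdomain |
| 3a9f4957-1fa8-4fc8-a26f-1b5737dd513d | vm-reg2 | ACTIVE | overcloud-novacompute-1.localdomain |
""")

print(moved_instances(before, after))
```

For this data only vm-TAG is reported as moved, matching the verification result in this comment.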

Comment 25 errata-xmlrpc 2016-11-04 04:48:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2373.html

