Description of problem:

Example failure in a lab:

(undercloud) [stack@undercloud13 ~]$ openstack overcloud generate fencing --ipmi-lanplus --ipmi-level administrator --output fencing.yaml instack.json
Action tripleo.parameters.generate_fencing execution failed: Failed to run action [action_ex_id=None, action_cls='<class 'mistral.actions.action_factory.GenerateFencingParametersAction'>', attributes='{}', params='{u'ipmi_level': u'administrator', u'ipmi_cipher': None, u'ipmi_lanplus': True, u'delay': None, u'os_auth': None, u'nodes_json': [{u'pm_password': u'redhat', u'name': u'overcloud13-node1', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:a5:a6:e0'], u'pm_port': u'634', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node2', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:bb:f9:0f'], u'pm_port': u'635', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node3', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:75:02:e7'], u'pm_port': u'636', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node4', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:bc:d9:f7'], u'pm_port': u'637', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node5', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:0f:4c:7c'], u'pm_port': u'638', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node6', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:94:3b:fc'], u'pm_port': u'639', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node7', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:84:e3:6b'], u'pm_port': u'640', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node8', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:bc:b8:5c'], u'pm_port': u'641', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-ceph1', u'memory': u'4096', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:07:63:c8'], u'pm_port': u'642', u'pm_type': u'pxe_ipmitool', u'disk': u'20', u'arch': u'x86_64', u'cpu': u'1', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-ceph2', u'memory': u'4096', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:37:d5:00'], u'pm_port': u'643', u'pm_type': u'pxe_ipmitool', u'disk': u'20', u'arch': u'x86_64', u'cpu': u'1', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-ceph3', u'memory': u'4096', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:5d:46:76'], u'pm_port': u'644', u'pm_type': u'pxe_ipmitool', u'disk': u'20', u'arch': u'x86_64', u'cpu': u'1', u'pm_user': u'admin'}]}'] Not Found (HTTP 404)

This failure occurs in the following situation: the overcloud deployment had previously failed during a scale-up, for example:

2020-06-26 17:53:55Z [overcloud]: UPDATE_FAILED  Resource UPDATE failed: resources.Compute: Resource CREATE failed: ResourceInError: resources[2].resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found.
There are not enough hosts available., Code: 500"

Stack overcloud UPDATE_FAILED

overcloud.Compute.2.NovaCompute:
  resource_type: OS::TripleO::ComputeServer
  physical_resource_id: cc1f99cc-ee9f-4240-8053-b7e134a059c8
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

Heat Stack update failed.

(undercloud) [stack@undercloud13 ~]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks              |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| 099a1f89-0130-4174-9252-db6b7e748948 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=172.16.6.110 |
| 2255e1b9-407a-4bdc-8edc-56c072763c49 | overcloud-compute-0     | ACTIVE | -          | Running     | ctlplane=172.16.6.101 |
| 10c1d7c1-416f-4a0f-b076-f15fc2376d60 | overcloud-compute-1     | ACTIVE | -          | Running     | ctlplane=172.16.6.102 |
| cc1f99cc-ee9f-4240-8053-b7e134a059c8 | overcloud-compute-2     | ERROR  | -          | NOSTATE     |                       |
| 7824d837-b932-4e33-904c-54925b00d3d4 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=172.16.6.103 |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+

No corresponding ironic node exists:

(undercloud) [stack@undercloud13 ~]$ openstack baremetal node list | grep cc1f99cc-ee9f-4240-8053-b7e134a059c8
(nil)

From the mistral log:

2020-06-26 13:56:59.134 1702 DEBUG ironicclient.common.http [req-092a23aa-f897-4147-9fe5-2b0c203693c7 c03d8da0d553401ba7d64538c94d8bda b3d0c809dd634964b39491c769004753 - default default] curl -i -X GET -H 'X-OpenStack-Ironic-API-Version: 1.36' -H 'X-Auth-Token: {SHA1}e39ea52d4433f9a6fcbcb28de2c972ece6bca3d5' -H 'Content-Type: application/json' -H 'Accept: application/json' -H 'User-Agent:
python-ironicclient' http://172.16.6.1:6385/v1/nodes/detail?instance_uuid=cc1f99cc-ee9f-4240-8053-b7e134a059c8 log_curl_request /usr/lib/python2.7/site-packages/ironicclient/common/http.py:337

2020-06-26 13:56:59.204 1702 DEBUG ironicclient.common.http [req-092a23aa-f897-4147-9fe5-2b0c203693c7 c03d8da0d553401ba7d64538c94d8bda b3d0c809dd634964b39491c769004753 - default default]
HTTP/1.1 200 OK
Date: Fri, 26 Jun 2020 17:56:59 GMT
Server: Apache
X-OpenStack-Ironic-API-Minimum-Version: 1.1
X-OpenStack-Ironic-API-Maximum-Version: 1.38
X-OpenStack-Ironic-API-Version: 1.36
Openstack-Request-Id: req-daa48895-2e6a-4ebd-9bc0-6702844f1289
Content-Length: 13
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/json

{"nodes": []}
log_http_response /usr/lib/python2.7/site-packages/ironicclient/common/http.py:351

2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor [req-092a23aa-f897-4147-9fe5-2b0c203693c7 c03d8da0d553401ba7d64538c94d8bda b3d0c809dd634964b39491c769004753 - default default] Failed to run action [action_ex_id=None, action_cls='<class 'mistral.actions.action_factory.GenerateFencingParametersAction'>', attributes='{}', params='{u'ipmi_level': u'administrator', u'ipmi_cipher': None, u'ipmi_lanplus': True, u'delay': None, u'os_auth': None, u'nodes_json': [{u'pm_password': u'redhat', u'name': u'overcloud13-node1', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:a5:a6:e0'], u'pm_port': u'634', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node2', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:bb:f9:0f'], u'pm_port': u'635', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node3', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:75:02:e7'], u'pm_port':
u'636', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node4', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:bc:d9:f7'], u'pm_port': u'637', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node5', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:0f:4c:7c'], u'pm_port': u'638', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node6', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:94:3b:fc'], u'pm_port': u'639', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node7', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:84:e3:6b'], u'pm_port': u'640', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-node8', u'memory': u'8192', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:bc:b8:5c'], u'pm_port': u'641', u'pm_type': u'pxe_ipmitool', u'disk': u'42', u'arch': u'x86_64', u'cpu': u'2', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-ceph1', u'memory': u'4096', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:07:63:c8'], u'pm_port': u'642', u'pm_type': u'pxe_ipmitool', u'disk': u'20', u'arch': u'x86_64', u'cpu': u'1', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': u'overcloud13-ceph2', u'memory': u'4096', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:37:d5:00'], u'pm_port': u'643', u'pm_type': u'pxe_ipmitool', u'disk': u'20', u'arch': u'x86_64', u'cpu': u'1', u'pm_user': u'admin'}, {u'pm_password': u'redhat', u'name': 
u'overcloud13-ceph3', u'memory': u'4096', u'pm_addr': u'192.168.122.1', u'mac': [u'52:54:00:5d:46:76'], u'pm_port': u'644', u'pm_type': u'pxe_ipmitool', u'disk': u'20', u'arch': u'x86_64', u'cpu': u'1', u'pm_user': u'admin'}]}'] Not Found (HTTP 404): NotFound: Not Found (HTTP 404)
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor Traceback (most recent call last):
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/mistral/executors/default_executor.py", line 114, in run_action
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor     result = action.run(action_ctx)
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/actions/parameters.py", line 361, in run
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor     self.get_compute_client(context))
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/utils/nodes.py", line 694, in generate_hostmap
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor     bm_node = baremetal_client.node.get_by_instance_uuid(node.id)
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/ironicclient/v1/node.py", line 329, in get_by_instance_uuid
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor     raise exc.NotFound()
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor NotFound: Not Found (HTTP 404)
2020-06-26 13:56:59.205 1702 ERROR mistral.executors.default_executor

Version-Release number of selected component (if applicable):

(undercloud) [stack@undercloud13 ~]$ rpm -q openstack-tripleo-common
openstack-tripleo-common-8.7.1-20.el7ost.noarch (current)

How reproducible:
100%

Steps to Reproduce:
1. See the example above.
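The failure mode can be reproduced in isolation. The following is a minimal sketch with hypothetical stub clients (`FakeCompute`/`FakeBaremetal` and the local `NotFound` class are stand-ins for the real Nova/Ironic clients and `ironicclient.exc.NotFound`, not the real API): generate_hostmap looks up the Ironic node for every Nova server, the ERROR server has none, so the lookup raises NotFound and the whole action aborts.

```python
class NotFound(Exception):
    """Stand-in for ironicclient.exc.NotFound (HTTP 404)."""

class _Server(object):
    def __init__(self, id, name):
        self.id, self.name = id, name

class FakeCompute(object):
    """Stub Nova client: one ACTIVE server, one ERROR server."""
    class servers(object):
        @staticmethod
        def list():
            return [_Server("uuid-ok", "overcloud-compute-0"),
                    _Server("uuid-err", "overcloud-compute-2")]

class FakeBaremetal(object):
    """Stub Ironic client: only the ACTIVE server has an Ironic node."""
    class node(object):
        @staticmethod
        def get_by_instance_uuid(instance_uuid):
            if instance_uuid == "uuid-ok":
                class _Node(object):
                    uuid, name = "bm-0", "overcloud13-node1"
                return _Node()
            raise NotFound("Not Found (HTTP 404)")
    class port(object):
        @staticmethod
        def list(node=None):
            class _Port(object):
                address = "52:54:00:a5:a6:e0"
            return [_Port()]

def generate_hostmap(baremetal_client, compute_client):
    # The unpatched loop from tripleo_common/utils/nodes.py:
    # no handling for servers without an Ironic node.
    hostmap = {}
    for node in compute_client.servers.list():
        bm_node = baremetal_client.node.get_by_instance_uuid(node.id)
        for port in baremetal_client.port.list(node=bm_node.uuid):
            hostmap[port.address] = {"compute_name": node.name,
                                     "baremetal_name": bm_node.name}
    return hostmap or None

try:
    generate_hostmap(FakeBaremetal, FakeCompute)
except NotFound as e:
    print("action fails with:", e)   # mirrors the Mistral traceback
```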
It seems that logging a warning about the missing node and skipping it would be more correct in this situation.
I think the exception should be handled in the generate_hostmap function:

class GenerateFencingParametersAction(base.TripleOAction):
    """Generates fencing configuration for a deployment.
    ...

    def run(self, context):
        """Returns the parameters for fencing controller nodes"""
        hostmap = nodes.generate_hostmap(self.get_baremetal_client(context),
                                         self.get_compute_client(context))
        fence_params = {"EnableFencing": True, "FencingConfig": {}}
        devices = []
        ...

def generate_hostmap(baremetal_client, compute_client):
    """Create a map between Compute nodes and Baremetal nodes"""
    hostmap = {}
    for node in compute_client.servers.list():
        bm_node = baremetal_client.node.get_by_instance_uuid(node.id)
        for port in baremetal_client.port.list(node=bm_node.uuid):
            hostmap[port.address] = {"compute_name": node.name,
                                     "baremetal_name": bm_node.name}
    if hostmap == {}:
        return None
    else:
        return hostmap

something like:

def generate_hostmap(baremetal_client, compute_client):
    """Create a map between Compute nodes and Baremetal nodes"""
    hostmap = {}
    for node in compute_client.servers.list():
        try:
            bm_node = baremetal_client.node.get_by_instance_uuid(node.id)
        except exc.NotFound:
            # We didn't find a bm_node corresponding to the instance;
            # the server is probably in ERROR state with no ironic node
            # assigned, so skip it instead of aborting the whole action.
            continue
        for port in baremetal_client.port.list(node=bm_node.uuid):
            hostmap[port.address] = {"compute_name": node.name,
                                     "baremetal_name": bm_node.name}
    if hostmap == {}:
        return None
    else:
        return hostmap

Can you maybe give https://review.opendev.org/#/c/738768/ a try?
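For illustration, the skip-and-warn behaviour can be sketched end to end with hypothetical stub clients (this is a self-contained sketch of the idea, not the actual patch in https://review.opendev.org/#/c/738768/; the stub classes and the local `NotFound` stand in for the real Nova/Ironic clients and `ironicclient.exc.NotFound`):

```python
import logging

LOG = logging.getLogger("generate_hostmap_sketch")

class NotFound(Exception):
    """Stand-in for ironicclient.exc.NotFound (HTTP 404)."""

def generate_hostmap(baremetal_client, compute_client):
    """Map Compute nodes to Baremetal nodes, skipping servers that
    have no Ironic node (e.g. stuck in ERROR after a failed scale-up)."""
    hostmap = {}
    for node in compute_client.servers.list():
        try:
            bm_node = baremetal_client.node.get_by_instance_uuid(node.id)
        except NotFound:
            # Warn and move on instead of aborting the whole action.
            LOG.warning("No baremetal node found for server %s, skipping.",
                        node.name)
            continue
        for port in baremetal_client.port.list(node=bm_node.uuid):
            hostmap[port.address] = {"compute_name": node.name,
                                     "baremetal_name": bm_node.name}
    return hostmap or None

# Stub clients: one healthy server, one ERROR server with no Ironic node.
class _Server(object):
    def __init__(self, id, name):
        self.id, self.name = id, name

class FakeCompute(object):
    class servers(object):
        @staticmethod
        def list():
            return [_Server("ok", "overcloud-compute-0"),
                    _Server("err", "overcloud-compute-2")]

class _Node(object):
    uuid, name = "bm-0", "overcloud13-node1"

class _Port(object):
    address = "52:54:00:a5:a6:e0"

class FakeBaremetal(object):
    class node(object):
        @staticmethod
        def get_by_instance_uuid(instance_uuid):
            if instance_uuid == "ok":
                return _Node()
            raise NotFound("Not Found (HTTP 404)")
    class port(object):
        @staticmethod
        def list(node=None):
            return [_Port()]

print(generate_hostmap(FakeBaremetal, FakeCompute))
# -> {'52:54:00:a5:a6:e0': {'compute_name': 'overcloud-compute-0',
#                           'baremetal_name': 'overcloud13-node1'}}
```

The ERROR server is logged and skipped, and the hostmap for the remaining healthy nodes is still produced, which is exactly what the fencing generation needs.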
Hi Luca,

Thanks for looking at this. Your fix seems to work for me.

(undercloud) [stack@undercloud13 ~]$ rm fencing.yaml
(undercloud) [stack@undercloud13 ~]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks              |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
| 099a1f89-0130-4174-9252-db6b7e748948 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=172.16.6.110 |
| 2255e1b9-407a-4bdc-8edc-56c072763c49 | overcloud-compute-0     | ACTIVE | -          | Running     | ctlplane=172.16.6.101 |
| 10c1d7c1-416f-4a0f-b076-f15fc2376d60 | overcloud-compute-1     | ACTIVE | -          | Running     | ctlplane=172.16.6.102 |
| cc1f99cc-ee9f-4240-8053-b7e134a059c8 | overcloud-compute-2     | ERROR  | -          | NOSTATE     |                       |
| 7824d837-b932-4e33-904c-54925b00d3d4 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=172.16.6.103 |
+--------------------------------------+-------------------------+--------+------------+-------------+-----------------------+
(undercloud) [stack@undercloud13 ~]$ openstack overcloud generate fencing --ipmi-lanplus --ipmi-level administrator --output fencing.yaml instack.json
(nil)
(undercloud) [stack@undercloud13 ~]$ cat fencing.yaml
parameter_defaults:
  EnableFencing: true
  FencingConfig:
    devices:
    - agent: fence_ipmilan
      host_mac: 52:54:00:a5:a6:e0
      params:
        ipaddr: 192.168.122.1
        ipport: '634'
        lanplus: true
        login: admin
        passwd: redhat
        privlvl: administrator
[...]
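For context on what the generated file contains: each hostmap entry becomes one fence_ipmilan device whose parameters come from the instackenv node whose 'mac' list contains that MAC. A hypothetical sketch (the `fencing_device` helper and its dict layout are illustrative, not the real GenerateFencingParametersAction code; values below are taken from the instack.json in this report):

```python
def fencing_device(mac, node, ipmi_level="administrator", lanplus=True):
    """Build one FencingConfig device entry from a node's IPMI details.

    'node' is the instackenv.json entry whose 'mac' list contains the
    compute node's MAC address (the hostmap key).
    """
    return {
        "agent": "fence_ipmilan",
        "host_mac": mac,
        "params": {
            "ipaddr": node["pm_addr"],
            "ipport": node["pm_port"],
            "lanplus": lanplus,
            "login": node["pm_user"],
            "passwd": node["pm_password"],
            "privlvl": ipmi_level,
        },
    }

# The first node from the instack.json above:
node1 = {"pm_addr": "192.168.122.1", "pm_port": "634",
         "pm_user": "admin", "pm_password": "redhat"}
device = fencing_device("52:54:00:a5:a6:e0", node1)
print(device["params"]["ipport"])   # -> 634
```

This matches the first device in the fencing.yaml output above.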
will fix this in train/osp16.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:4284