Description of problem:
When trying to deploy 3 controllers + 2 computes + 3 ceph, with network isolation and SSL, deployment fails near the end with:

The action raised an exception [action_ex_id=9420ab57-0dba-4697-9fb9-d736b0f1f42a, action_cls='<class 'mistral.actions.action_factory.GetOvercloudConfig'>', attributes='{}', params='{u'container_config': u'overcloud-config', u'container': u'overcloud'}'] ERROR: None

Version-Release number of selected component (if applicable):
openstack-tripleo-common-9.4.1-0.20181012010886.el7ost.noarch
openstack-tripleo-heat-templates-9.0.1-0.20181013060908.el7ost.noarch

How reproducible:
unknown

Steps to Reproduce:
1. Deploy (OSP14 GA) with the above configuration

Actual results:
Here is an example stack trace that we find:

{"message": "The resource could not be found.<br /><br />\n\n\n", "code": "404 Not Found", "title": "Not Found"} log_http_response /usr/lib/python2.7/site-packages/heatclient/common/http.py:157
2019-01-16 10:35:17.418 1 WARNING mistral.executors.default_executor [req-36f0f06d-5ce3-4de2-820c-8481b43409e7 6612b704f7a34876b5799e29298021bd 0254bbcb255d445083b5b16db74011e8 - default default] The action raised an exception [action_ex_id=f25ce5c2-61cf-4bc4-b817-006574c70802, action_cls='<class 'mistral.actions.action_factory.GetOvercloudConfig'>', attributes='{}', params='{u'container_config': u'overcloud-config', u'container': u'overcloud'}'] ERROR: None: HTTPNotFound: ERROR: None
ERROR mistral.executors.default_executor Traceback (most recent call last):
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/mistral/executors/default_executor.py", line 114, in run_action
ERROR mistral.executors.default_executor     result = action.run(action_ctx)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/actions/config.py", line 87, in run
ERROR mistral.executors.default_executor     commit_message=message)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/utils/config.py", line 465, in download_config
ERROR mistral.executors.default_executor     self.write_config(stack, name, config_dir, config_type)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/utils/config.py", line 273, in write_config
ERROR mistral.executors.default_executor     config_dict = self.get_config_dict(deployment)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/tripleo_common/utils/config.py", line 86, in get_config_dict
ERROR mistral.executors.default_executor     deployment_resource_id)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/heatclient/v1/software_deployments.py", line 57, in get
ERROR mistral.executors.default_executor     resp = self.client.get('/software_deployments/%s' % deployment_id)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 289, in get
ERROR mistral.executors.default_executor     return self.client_request("GET", url, **kwargs)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 282, in client_request
ERROR mistral.executors.default_executor     resp, body = self.json_request(method, url, **kwargs)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 271, in json_request
ERROR mistral.executors.default_executor     resp = self._http_request(url, method, **kwargs)
ERROR mistral.executors.default_executor   File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 234, in _http_request
ERROR mistral.executors.default_executor     raise exc.from_response(resp)
ERROR mistral.executors.default_executor HTTPNotFound: ERROR: None
ERROR mistral.executors.default_executor
So I had a look. I can see a few SoftwareDeployments with an empty resource_id (nova_instance), and config download fails when it fetches them.

MariaDB [heat]> select id,name,nova_instance, action, status, stack_id, created_at from resource where name like '%TripleoSoftwareDeployment%' and nova_instance is NULL;
+------+---------------------------+---------------+--------+----------+--------------------------------------+---------------------+
| id   | name                      | nova_instance | action | status   | stack_id                             | created_at          |
+------+---------------------------+---------------+--------+----------+--------------------------------------+---------------------+
| 5641 | TripleOSoftwareDeployment | NULL          | CREATE | COMPLETE | 8dacfd3a-aabe-42b7-a643-d7456ac91c62 | 2019-01-17 10:28:26 |
| 5729 | TripleOSoftwareDeployment | NULL          | CREATE | COMPLETE | 3d31025f-78d5-4b8e-b783-0b7a5e5530ad | 2019-01-17 10:31:54 |
| 5763 | TripleOSoftwareDeployment | NULL          | CREATE | COMPLETE | 017ca50a-c540-424b-817e-8f4df0269a5e | 2019-01-17 10:32:46 |
+------+---------------------------+---------------+--------+----------+--------------------------------------+---------------------+

The reason is that the plan environment contains something like:

....
CephStorageNetworkDeploymentActions:
- ''
...

This should be an empty list (not a list containing an empty string) so that the default NetworkDeploymentActions overrides it [1].

I could not find where it is set in the current user environments. I guess the plan was created or updated from the UI, so there may be a bug there, or something wrong was selected.
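
To check a plan environment for this condition, here is a minimal sketch. It assumes the environment has already been loaded into a Python dict of parameter defaults; the function name and the sample data are hypothetical and only the CephStorageNetworkDeploymentActions value mirrors what was seen in this plan.

```python
def find_blank_deployment_actions(parameter_defaults):
    """Return the *NetworkDeploymentActions parameters that hold blank entries.

    Hypothetical helper: it flags a list such as [''] while accepting
    an empty list or a list of real actions.
    """
    suspicious = []
    for name, value in sorted(parameter_defaults.items()):
        if not name.endswith("NetworkDeploymentActions"):
            continue
        if isinstance(value, list) and any(
            not isinstance(item, str) or not item.strip() for item in value
        ):
            suspicious.append(name)
    return suspicious


# Illustrative data only; the first entry reproduces the bad value.
sample = {
    "CephStorageNetworkDeploymentActions": [""],
    "ComputeNetworkDeploymentActions": [],
    "ControllerNetworkDeploymentActions": ["CREATE", "UPDATE"],
}
print(find_blank_deployment_actions(sample))
```

An empty list is deliberately not flagged, since that is the value the parameter is expected to have.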
One way to try and fix this would probably be to set the parameter below in /home/stack/virt/node_data.yaml and then run an update, which will override the environment and use the default NetworkDeploymentActions:

CephStorageNetworkDeploymentActions: ['CREATE', 'UPDATE']

Note: you have to reset it to [] afterwards so that subsequent updates do not reapply the network configuration.

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/rocky/puppet/role.role.j2.yaml#L338-L339
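
As for why the rows with a NULL nova_instance break config download: the heatclient frame in the traceback formats the deployment ID directly into the request path. The sketch below reuses only that format string from the trace. Whether the missing ID reaches heatclient as None or as an empty string is an assumption, and has_usable_id is a hypothetical guard, not the change shipped in tripleo-common.

```python
def deployment_path(deployment_id):
    # Same format string as software_deployments.py line 57 in the traceback.
    return '/software_deployments/%s' % deployment_id


def has_usable_id(deployment_id):
    # Hypothetical guard: treat None and '' as "no deployment to fetch".
    return bool(deployment_id)


# Neither path names a real software deployment.
print(deployment_path(None))
print(deployment_path(''))

for candidate in (None, '', 'some-deployment-id'):
    if has_usable_id(candidate):
        print('would fetch', deployment_path(candidate))
    else:
        print('would skip', repr(candidate))
```

A check of this kind only hides the symptom; the plan environment still needs the parameter corrected as described above.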
VERIFIED

>> grep "2019-02-22.2" core_puddle_version
2019-02-22.2

>> openstack-tripleo-common-9.4.1-0.20181012010891.el7ost.noarch

>>> cat virt/network/network-environment.yaml |grep vxlan
  NeutronNetworkType: vxlan
  NeutronTunnelTypes: vxlan

nodes list >>
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 54c7f9dc-0c60-4937-9a70-f491f9319fba | ceph-0       | ACTIVE | -          | Running     | ctlplane=192.168.24.10 |
| d75bbfcc-8b19-4fce-8c4b-1339f1a35ba9 | ceph-1       | ACTIVE | -          | Running     | ctlplane=192.168.24.19 |
| dacfad5d-7f40-483b-83b6-f9b685d4dc6d | ceph-2       | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| 4a7fc954-005e-488b-a13f-ac7a2c0decc2 | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.9  |
| 463d43b4-92c7-4fc3-8498-e6fc1efd8fe7 | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| 426df787-af48-49d9-89fa-f4bb98f88fd0 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| cc62269f-ef83-44f7-aa6f-fffdf3a33199 | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| 061cefe0-8026-43cb-b53f-6381ad3cb6f2 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack endpoint list |grep public
| 02a21f5fe8a544bf86f132bb70dac4f3 | regionOne | placement | placement      | True | public | https://10.0.0.101:13778/placement             |
| 097a69c9b8434fb1bb54440ef78ebbc3 | regionOne | aodh      | alarming       | True | public | https://10.0.0.101:13042                       |
| 1f1a1498f1654987a8eed73b3ffa1231 | regionOne | cinder    | volume         | True | public | https://10.0.0.101:13776/v1/%(tenant_id)s      |
| 26c0ae99af5446bc875347de1a7d4265 | regionOne | heat-cfn  | cloudformation | True | public | https://10.0.0.101:13005/v1                    |
| 41f61d985212416c89906b5c58b28910 | regionOne | swift     | object-store   | True | public | https://10.0.0.101:13808/v1/AUTH_%(tenant_id)s |
| 45b96c48af9e4faeb4e40eb6e1933df8 | regionOne | heat      | orchestration  | True | public | https://10.0.0.101:13004/v1/%(tenant_id)s      |
| 764996e62941416fb891ff84ea1b049b | regionOne | glance    | image          | True | public | https://10.0.0.101:13292                       |
| 7d8a57e28f8b4fb7b5abb71fcc866756 | regionOne | panko     | event          | True | public | https://10.0.0.101:13977                       |
| 82ee38351a91461dab9fff1e5b2e1305 | regionOne | gnocchi   | metric         | True | public | https://10.0.0.101:13041                       |
| 9140b5797c7041b99ae1bab2f2d244b7 | regionOne | keystone  | identity       | True | public | https://10.0.0.101:13000                       |
| a4738c2c17ba487387f71712169d59f0 | regionOne | cinderv2  | volumev2       | True | public | https://10.0.0.101:13776/v2/%(tenant_id)s      |
| c8437559aa1947d897b23ce41300c377 | regionOne | neutron   | network        | True | public | https://10.0.0.101:13696                       |
| ca461da3e82d4a98a073ff60211f4a81 | regionOne | nova      | compute        | True | public | https://10.0.0.101:13774/v2.1                  |
| f8ac618a6c414de0815380c4740bb8d0 | regionOne | cinderv3  | volumev3       | True | public | https://10.0.0.101:13776/v3/%(tenant_id)s      |
(overcloud) [stack@undercloud-0 ~]$

(undercloud) [stack@undercloud-0 ~]$ heat stack-list
WARNING (shell) "heat stack-list" is deprecated, please use "openstack stack list" instead
/usr/lib/python2.7/site-packages/urllib3/connection.py:344: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/urllib3/connection.py:344: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+--------------------------------------+------------+-----------------+----------------------+--------------+----------------------------------+
| id                                   | stack_name | stack_status    | creation_time        | updated_time | project                          |
+--------------------------------------+------------+-----------------+----------------------+--------------+----------------------------------+
| 5426ad54-993a-4a6c-96b6-ff956b390f19 | overcloud  | CREATE_COMPLETE | 2019-02-28T10:21:23Z | None         | c6b494157d1f484380e0df71539fdc49 |
+--------------------------------------+------------+-----------------+----------------------+--------------+----------------------------------+
(undercloud) [stack@undercloud-0 ~]$ cat /etc/rh
rhosp-release  rhsm/
(undercloud) [stack@undercloud-0 ~]$ cat /etc/rhosp-release
Red Hat OpenStack Platform release 14.0.1 RC (Rocky)
(undercloud) [stack@undercloud-0 ~]$ cat core_puddle_version
2019-02-22.2(undercloud) [stack@undercloud-0 ~]$
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0446