Bug 1485189
Summary: | [DDR][Docs][Director] Document that overcloud nodes need to trust the undercloud CA | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> |
Component: | documentation | Assignee: | Martin Lopes <mlopes> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Dan Macpherson <dmacpher> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 12.0 (Pike) | CC: | asimonel, astupnik, augol, bschmaus, cchen, cjanisze, dbecker, dcadzow, dmacpher, dtrainor, gfidente, jcoufal, jjoyce, josorior, jpichon, jslagle, kbasil, mburns, mcornea, michele, morazi, nkinder, ohochman, pkesavar, pmannidi, rhel-osp-director-maint, rhos-docs, sathlang, srevivo, therve, yprokule |
Target Milestone: | --- | Keywords: | Regression, Triaged |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-10-15 06:14:07 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1638607 | ||
Bug Blocks: | 1615225 | ||
Description
Alexander Chuzhoy
2017-08-25 04:30:33 UTC
Could you please check whether the overcloud controller node shows the "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed" error, the same as here: https://bugs.launchpad.net/tripleo/+bug/1712836

If it does, the fix for that issue is addressed by: https://review.openstack.org/#/c/496639/

    [root@overcloud-controller-0 ~]# journalctl -u os-collect-config|grep -i CERTIFICATE_VERIFY_FAILED
    Aug 25 03:20:21 overcloud-controller-0 os-collect-config[3150]: requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)

The fix applies to tripleo-quickstart-extras. For other environments, the solution is to provide the CAMap parameter from an environment file, populated as follows:

    parameter_defaults:
      CAMap:
        overcloud-ca: overcloud_ca_pem_contents
        undercloud-ca: undercloud_ca_pem_contents

Setting needinfo on Juan, who can help us better and confirm/amend the above.

I see the certificate validation fails while trying to reach undercloud Swift on the public endpoint:

    Aug 25 03:20:21 overcloud-controller-0 os-collect-config[3150]: [2017-08-25 03:20:21,363] (heat-config) [ERROR] [2017-08-25 03:20:21,312] (heat-config-notify) [DEBUG] Signaling to https://192.168.24.2:13808/v1/AUTH_60cd8093e4ad402c8a714f32d8bd83b1/create_admin-5d77912b-aaa4-4c4b-9379-fc449aab7d44/fb3187a9-e4bc-4507-a685-c8442c0e297f?temp_url_sig=bb02a6bc9a6d2a05ed8c2a106df9b1704c331f93&temp_url_expires=1503649197 via PUT

I'm wondering why we are using the public endpoint and not the internal one, which is http and which I'd expect to be used for inter-node communication:

    (undercloud) [stack@undercloud-0 ~]$ openstack catalog show swift
    +-----------+-------------------------------------------------------------------------------+
    | Field     | Value                                                                         |
    +-----------+-------------------------------------------------------------------------------+
    | endpoints | regionOne                                                                     |
    |           |   admin: http://192.168.24.3:8080                                             |
    |           | regionOne                                                                     |
    |           |   internal: http://192.168.24.3:8080/v1/AUTH_60cd8093e4ad402c8a714f32d8bd83b1 |
    |           | regionOne                                                                     |
    |           |   public: https://192.168.24.2:13808/v1/AUTH_60cd8093e4ad402c8a714f32d8bd83b1 |
    |           |                                                                               |
    | id        | 61b8436abcfd41309ac1a205cb6a3f11                                              |
    | name      | swift                                                                         |
    | type      | object-store                                                                  |
    +-----------+-------------------------------------------------------------------------------+

Confirming that when overcloud+SSL is deployed on an undercloud without SSL, the issue doesn't reproduce.

I'll update the upstream documentation to include this extra step (trusting the undercloud's certificate) and post the link to the docs review here.

*** Bug 1486916 has been marked as a duplicate of this bug. ***

I think the issue reproduced on BM with the latest puddle:

    openstack stack failures list overcloud --long
    overcloud.AllNodesDeploySteps.WorkflowTasks_Step2_Execution:
      resource_type: OS::Mistral::ExternalResource
      physical_resource_id: 75cf38af-9d79-47f0-88a8-b4c7d6b60979
      status: CREATE_FAILED
      status_reason: |
        resources.WorkflowTasks_Step2_Execution: ERROR

Mistral/engine.log:
--------------------

    Workflow 'tripleo.deployment.v1.deploy_on_server' [RUNNING -> ERROR, msg=Failed to run task [error=Can not evaluate YAQL expression [expression=task(deploy_config).result.deploy_stderr, error=Unknown function "#property#deploy_stderr", data={}], wf=tripleo.deployment.v1.deploy_on_server, task=send_message]:

Heat-engine.log:
------------------

    2017-09-01 14:19:24.130 29148 DEBUG heat.engine.resources.openstack.mistral.external_resource [req-b78cbe81-d89b-46d0-a0d1-cc1dd04c3fd5 - admin - default default] Mistral execution 75cf38af-9d79-47f0-88a8-b4c7d6b60979 is in state ERROR _check_execution /usr/lib/python2.7/site-packages/heat/engine/resources/openstack/mistral/external_resource.py:159
    2017-09-01 14:19:24.131 29148 INFO heat.engine.resource [req-b78cbe81-d89b-46d0-a0d1-cc1dd04c3fd5 - admin - default default] CREATE: MistralExternalResource "WorkflowTasks_Step2_Execution" [75cf38af-9d79-47f0-88a8-b4c7d6b60979] Stack "overcloud-AllNodesDeploySteps-f5fcphgjkrpl" [a32a5339-de2a-44ae-b5a2-b92e55c486de]
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource Traceback (most recent call last):
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 831, in _action_recorder
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     yield
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 939, in _do_action
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 351, in wrapper
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     step = next(subtask)
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 890, in action_handler_task
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     done = check(handler_data)
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/mistral/external_resource.py", line 251, in check_create_complete
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     return self._check_action(self.CREATE, execution_id)
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/mistral/external_resource.py", line 204, in _check_action
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     success, output = self._check_execution(action, execution_id)
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/mistral/external_resource.py", line 171, in _check_execution
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource     action=action)
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource ResourceFailure: resources.WorkflowTasks_Step2_Execution: ERROR
    2017-09-01 14:19:24.131 29148 ERROR heat.engine.resource

After applying the recommended steps, I got Overcloud deploy SUCCESS.

Moving to documentation.

Updating bug summary to more clearly capture the request for documentation update.

Chris, Giulio, Chen, Omri, et al... do we have a solution that everyone agrees works now? If so, please identify a prime that I, and our documentation writer, can work with, and we can get this defined and into our schedule. Thanks!

Derek, I believe we are still discussing the right solution. I have opened another bug to track it from the product perspective: https://bugzilla.redhat.com/show_bug.cgi?id=1611807

*** Bug 1615189 has been marked as a duplicate of this bug. ***

*** Bug 1501779 has been marked as a duplicate of this bug. ***

Hi, so I understand that for osp13 this is going to be documentation only. Please make sure you're updating the Upgrade documentation as well, since we got bz#1501779. For osp14 I understood it was automatic, so upgrading from osp13 to osp14 shouldn't be an issue here, right? Thanks,

Reviewing with Dan.

Looks good. Merged!

Raised BZ requesting QE validation: https://bugzilla.redhat.com/show_bug.cgi?id=1638607

QE successfully tested procedure.
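
For anyone applying the CAMap workaround discussed in this bug by hand, here is a minimal sketch of how the environment file body could be generated from PEM text. It is an illustration under stated assumptions, not the documented procedure: the function name `build_camap_env` is hypothetical, the PEM strings in the demo are placeholders, and the nested `content: |` key reflects my understanding of the layout tripleo-heat-templates expects for CAMap entries, so please check it against the templates shipped with your release.

```python
def build_camap_env(cas):
    """Render a Heat environment file body that passes CA certs via CAMap.

    `cas` maps a CA label (for example "undercloud-ca") to its PEM text.
    Assumption: each CAMap entry carries the PEM under a `content` key.
    """
    lines = ["parameter_defaults:", "  CAMap:"]
    for name, pem in cas.items():
        lines.append(f"    {name}:")
        lines.append("      content: |")
        # Indent every PEM line so it sits inside the YAML literal block.
        for pem_line in pem.strip().splitlines():
            lines.append(f"        {pem_line}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder PEM bodies; substitute the real undercloud/overcloud CA text.
    placeholder = "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    print(build_camap_env({
        "overcloud-ca": placeholder,
        "undercloud-ca": placeholder,
    }))
```

The rendered text can be saved to a file and passed to the deploy command as an additional environment file with `-e`, alongside the other environment files already in use.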