Bug 1608558
Summary: Deployment with ceph fails: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_PATH=\"/var/lib/mistral/..../ceph-ansible/nodes_uuid_command.log\

| Field | Value |
|---|---|
| Product | Red Hat OpenStack |
| Component | openstack-tripleo-heat-templates |
| Version | 14.0 (Rocky) |
| Status | CLOSED DUPLICATE |
| Reporter | Tzach Shefi <tshefi> |
| Assignee | Giulio Fidente <gfidente> |
| QA Contact | Yogev Rabl <yrabl> |
| CC | gfidente, mburns, seb, tshefi |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Doc Type | If docs needed, set a value |
| Type | Bug |
| Last Closed | 2018-09-05 11:50:22 UTC |
Created attachment 1470550 [details]
sosreport from undercloud

I think this is a dup of BZ #1601382; can you check which version of openstack-tripleo-heat-templates you are using?

Created attachment 1470629 [details]
Controller sos

Yogev found an issue with my infrared job, which initially kept the backend as lvm. However, as it had failed (also), I later ran overcloud_deploy.sh manually, and that too failed.

All the logs, including the initial ones I uploaded, were collected after I had run overcloud_deploy.sh by hand a second time.

Sorry Giulio, I missed your comment before uploading that sosreport. This was my version: openstack-tripleo-heat-templates-9.0.0-0.20180710202746.d2994ca.el7ost.noarch. As the fixed-in version on dup BZ #1601382 is higher, I guess I'll just wait or retry with a more recent build. Thanks.

Something is fishy here; I just managed to deploy with internal ceph. It completed fine on two separate deployments; both were non-HA, as there weren't enough host resources. The odd thing I can't explain is the THT version. This time I chose phase1, hoping it would bump the THT version up. Instead it looks as if it bumped the version down, resulting in an earlier THT version than phase2 provided. How could this be?

Phase1 resulted in a working env, THT version: openstack-tripleo-heat-templates-9.0.0-0.20180703131156.de62fe3.el7ost.noarch
While phase2 failed to deploy, THT version: openstack-tripleo-heat-templates-9.0.0-0.20180710202746.d2994ca.el7ost.noarch

I believe this is expected; the issue you are seeing comes from an addition (the nodes-uuid playbook) which was not included in earlier builds. As a result, phase1 has a working build which completely misses a wanted feature, while phase2 is testing a newer build which is affected by the bug. The bug affecting phase2 should be resolved in openstack-tripleo-heat-templates-9.0.0-0.20180717094149.d8b7b19.el7ost.

*** This bug has been marked as a duplicate of bug 1601382 ***
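The version confusion above comes down to the snapshot timestamp embedded in the release field of each NVR. An illustrative sketch (the package strings are taken from this report; the parsing helper is hypothetical, not part of any TripleO tooling):

```python
import re

def snapshot_timestamp(nvr: str) -> str:
    """Extract the YYYYMMDDHHMMSS snapshot stamp from a TripleO package NVR."""
    m = re.search(r"\.(\d{14})\.", nvr)
    if not m:
        raise ValueError(f"no snapshot timestamp in {nvr!r}")
    return m.group(1)

phase1 = "openstack-tripleo-heat-templates-9.0.0-0.20180703131156.de62fe3.el7ost"
phase2 = "openstack-tripleo-heat-templates-9.0.0-0.20180710202746.d2994ca.el7ost"
fixed  = "openstack-tripleo-heat-templates-9.0.0-0.20180717094149.d8b7b19.el7ost"

# Equal-length timestamps compare correctly as strings: phase1 predates
# phase2 (which introduced the nodes-uuid playbook and the bug), and
# phase2 predates the build said to carry the fix.
assert snapshot_timestamp(phase1) < snapshot_timestamp(phase2) < snapshot_timestamp(fixed)
```

So the "older" phase1 build works only because it predates the nodes-uuid playbook entirely, exactly as explained above.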
Created attachment 1470549 [details]
Overcloud_deploy.sh and some more files, inc ext ceph config on extra yaml

Description of problem:
Failing to deploy the overcloud with ceph, either internal or (as in this case) external. I've hit this on three separate systems.

Version-Release number of selected component (if applicable):

How reproducible:
Every time; same error with external or internal ceph. I've hit this on three deployments.

Steps to Reproduce:
1. Deploy osp14 with ceph

Actual results:
Fails to deploy the overcloud:

TASK [run nodes-uuid] **********************************************************
Wednesday 25 July 2018 14:51:37 -0400 (0:00:00.034) 0:08:43.396 ********
fatal: [undercloud]: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_PATH=\"/var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ceph-ansible/nodes_uuid_command.log\" ANSIBLE_SSH_RETRIES=3 ANSIBLE_HOST_KEY_CHECKING=False DEFAULT_FORKS=25 ansible-playbook --private-key /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ssh_private_key -i /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ceph-ansible/inventory.yml /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ceph-ansible/nodes_uuid_playbook.yml", "delta": "0:00:01.179175", "end": "2018-07-25 14:51:39.079278", "msg": "non-zero return code", "rc": 4, "start": "2018-07-25 14:51:37.900103", "stderr": "", "stderr_lines": []}

Output of the nested nodes_uuid playbook run:

PLAY [all] *********************************************************************

TASK [set nodes data] **********************************************************
Wednesday 25 July 2018 14:51:38 -0400 (0:00:00.070) 0:00:00.070 ********
ok: [controller-0]
ok: [compute-0]

TASK [register machine id] *****************************************************
Wednesday 25 July 2018 14:51:38 -0400 (0:00:00.051) 0:00:00.122 ********
changed: [compute-0]
changed: [controller-0]

TASK [generate host vars from nodes data] **************************************
Wednesday 25 July 2018 14:51:38 -0400 (0:00:00.435) 0:00:00.558 ********
fatal: [controller-0]: UNREACHABLE! => {"changed": false, "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.02-253372308047838 `\" && echo ansible-tmp-1532544699.02-253372308047838=\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.02-253372308047838 `\" ), exited with result 1", "unreachable": true}
fatal: [compute-0]: UNREACHABLE! => {"changed": false, "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.03-25964877819667 `\" && echo ansible-tmp-1532544699.03-25964877819667=\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.03-25964877819667 `\" ), exited with result 1", "unreachable": true}

PLAY RECAP *********************************************************************
compute-0     : ok=2   changed=1   unreachable=1   failed=0
controller-0  : ok=2   changed=1   unreachable=1   failed=0

Wednesday 25 July 2018 14:51:39 -0400 (0:00:00.049) 0:00:00.607 ********
===============================================================================

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
compute-0     : ok=107  changed=44  unreachable=0   failed=0
controller-0  : ok=144  changed=45  unreachable=0   failed=0
undercloud    : ok=13   changed=7   unreachable=0   failed=1

Wednesday 25 July 2018 14:51:39 -0400 (0:00:01.365) 0:08:44.761 ********
===============================================================================
Ansible failed, check log at /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ansible.log.
Overcloud configuration failed.
(undercloud) [stack@undercloud-0 ~]$

Expected results:
Overcloud deployment should complete.

Additional info:
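For reference, the UNREACHABLE error carries Ansible's own generic hint: change the remote tmp path in ansible.cfg to a path rooted in /tmp. The actual fix for this bug landed in a newer openstack-tripleo-heat-templates build (see the duplicate BZ #1601382), so the snippet below is only a sketch of that generic workaround, and the ansible.cfg location is an assumption, not taken from this report:

```ini
; Hypothetical ansible.cfg for the ansible-playbook run launched by mistral;
; the file location is an assumption, not confirmed by this report.
[defaults]
; Root the remote temp directory in /tmp instead of the remote user's home,
; per the hint in the UNREACHABLE error message above.
remote_tmp = /tmp/.ansible-${USER}/tmp
```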