Bug 1608558 - Deployment with ceph fails: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_PATH=\"/var/lib/mistral/..../ceph-ansible/nodes_uuid_command.log\
Summary: Deployment with ceph fails: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_...
Keywords:
Status: CLOSED DUPLICATE of bug 1601382
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-25 19:50 UTC by Tzach Shefi
Modified: 2018-09-05 11:50 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-05 11:50:22 UTC
Target Upstream Version:
Embargoed:


Attachments
Overcloud_deploy.sh and some more files, inc ext ceph config on extra yaml (142.81 KB, application/zip)
2018-07-25 19:50 UTC, Tzach Shefi
sosreport from undercloud (14.26 MB, application/x-xz)
2018-07-25 19:54 UTC, Tzach Shefi
Controller sos (9.79 MB, application/x-xz)
2018-07-26 07:35 UTC, Tzach Shefi

Description Tzach Shefi 2018-07-25 19:50:42 UTC
Created attachment 1470549 [details]
Overcloud_deploy.sh and some more files, inc ext ceph config on extra yaml

Description of problem: Overcloud deployment with Ceph fails, whether Ceph is internal or, as in this case, external. I've hit this on three separate systems.


Version-Release number of selected component (if applicable):


How reproducible:
Every time; the same error occurs with external or internal Ceph.
I've hit this on three deployments.

Steps to Reproduce:
1. Deploy OSP 14 with Ceph.

Actual results:
Fails to deploy overcloud


TASK [run nodes-uuid] **********************************************************
Wednesday 25 July 2018  14:51:37 -0400 (0:00:00.034)       0:08:43.396 ********
fatal: [undercloud]: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_PATH=\"/var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ceph-ansible/nodes_uuid_command.log\" ANSIBLE_SSH_RETRIES=3 ANSIBLE_HOST_KEY_CHECKING=False DEFAULT_FORKS=25 ansible-playbook --private-key /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ssh_private_key -i /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ceph-ansible/inventory.yml /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ceph-ansible/nodes_uuid_playbook.yml", "delta": "0:00:01.179175", "end": "2018-07-25 14:51:39.079278", "msg": "non-zero return code", "rc": 4, "start": "2018-07-25 14:51:37.900103", "stderr": "", "stderr_lines": [], "stdout": "\nPLAY [all] *********************************************************************\n\nTASK [set nodes data] **********************************************************\nWednesday 25 July 2018  14:51:38 -0400 (0:00:00.070)       0:00:00.070 ******** \nok: [controller-0]\nok: [compute-0]\n\nTASK [register machine id] *****************************************************\nWednesday 25 July 2018  14:51:38 -0400 (0:00:00.051)       0:00:00.122 ******** \nchanged: [compute-0]\nchanged: [controller-0]\n\nTASK [generate host vars from nodes data] **************************************\nWednesday 25 July 2018  14:51:38 -0400 (0:00:00.435)       0:00:00.558 ******** \nfatal: [controller-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". 
Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.02-253372308047838 `\\\" && echo ansible-tmp-1532544699.02-253372308047838=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.02-253372308047838 `\\\" ), exited with result 1\", \"unreachable\": true}\nfatal: [compute-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.03-25964877819667 `\\\" && echo ansible-tmp-1532544699.03-25964877819667=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.03-25964877819667 `\\\" ), exited with result 1\", \"unreachable\": true}\n\nPLAY RECAP *********************************************************************\ncompute-0                  : ok=2    changed=1    unreachable=1    failed=0   \ncontroller-0               : ok=2    changed=1    unreachable=1    failed=0   \n\nWednesday 25 July 2018  14:51:39 -0400 (0:00:00.049)       0:00:00.607 ******** \n=============================================================================== ", "stdout_lines": ["", "PLAY [all] *********************************************************************", "", "TASK [set nodes data] **********************************************************", "Wednesday 25 July 2018  14:51:38 -0400 (0:00:00.070)       0:00:00.070 ******** ", "ok: [controller-0]", "ok: [compute-0]", "", "TASK [register machine id] *****************************************************", "Wednesday 25 July 2018  14:51:38 -0400 (0:00:00.051)       0:00:00.122 ******** ", "changed: [compute-0]", "changed: [controller-0]", "", "TASK [generate host vars from nodes data] **************************************", "Wednesday 25 July 2018  
14:51:38 -0400 (0:00:00.435)       0:00:00.558 ******** ", "fatal: [controller-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.02-253372308047838 `\\\" && echo ansible-tmp-1532544699.02-253372308047838=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.02-253372308047838 `\\\" ), exited with result 1\", \"unreachable\": true}", "fatal: [compute-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.03-25964877819667 `\\\" && echo ansible-tmp-1532544699.03-25964877819667=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1532544699.03-25964877819667 `\\\" ), exited with result 1\", \"unreachable\": true}", "", "PLAY RECAP *********************************************************************", "compute-0                  : ok=2    changed=1    unreachable=1    failed=0   ", "controller-0               : ok=2    changed=1    unreachable=1    failed=0   ", "", "Wednesday 25 July 2018  14:51:39 -0400 (0:00:00.049)       0:00:00.607 ******** ", "=============================================================================== "]}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
compute-0                  : ok=107  changed=44   unreachable=0    failed=0
controller-0               : ok=144  changed=45   unreachable=0    failed=0
undercloud                 : ok=13   changed=7    unreachable=0    failed=1

Wednesday 25 July 2018  14:51:39 -0400 (0:00:01.365)       0:08:44.761 ********
===============================================================================

Ansible failed, check log at /var/lib/mistral/060fe3a1-0a55-436c-a63b-85e9a398e124/ansible.log.
Overcloud configuration failed.
(undercloud) [stack@undercloud-0 ~]$ 

Expected results:
Should complete overcloud deployment. 

Additional info:
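
The failure output above contains two PLAY RECAP blocks: the outer deployment run, where only the undercloud is marked failed, and the nested nodes-uuid run, where both overcloud nodes are marked unreachable. The following is a minimal Python sketch for pulling the affected hosts out of such recap lines. It assumes the recap line format shown in this report; the helper names `parse_recap` and `problem_hosts` are hypothetical and not part of Ansible or TripleO.

```python
import re

# Matches one PLAY RECAP line in the format seen in this report, e.g.
# "compute-0   : ok=2    changed=1    unreachable=1    failed=0"
RECAP_LINE = re.compile(
    r"^(?P<host>\S+)\s*:\s*ok=(?P<ok>\d+)\s+changed=(?P<changed>\d+)"
    r"\s+unreachable=(?P<unreachable>\d+)\s+failed=(?P<failed>\d+)"
)


def parse_recap(lines):
    """Return {host: {counter: int}} for every recap line found in `lines`."""
    recap = {}
    for line in lines:
        match = RECAP_LINE.match(line.strip())
        if match:
            fields = match.groupdict()
            host = fields.pop("host")
            recap[host] = {name: int(value) for name, value in fields.items()}
    return recap


def problem_hosts(recap):
    """Hosts with at least one unreachable or failed task, sorted by name."""
    return sorted(
        host
        for host, counts in recap.items()
        if counts["unreachable"] or counts["failed"]
    )


# Recap lines copied from the nested nodes-uuid run ("stdout_lines") above.
nested_recap = [
    "compute-0                  : ok=2    changed=1    unreachable=1    failed=0   ",
    "controller-0               : ok=2    changed=1    unreachable=1    failed=0   ",
]

# Recap lines copied from the outer deployment run above.
outer_recap = [
    "compute-0                  : ok=107  changed=44   unreachable=0    failed=0",
    "controller-0               : ok=144  changed=45   unreachable=0    failed=0",
    "undercloud                 : ok=13   changed=7    unreachable=0    failed=1",
]

print("nested run:", problem_hosts(parse_recap(nested_recap)))
print("outer run: ", problem_hosts(parse_recap(outer_recap)))
```

Read together, the two recaps show that the outer run reports the failure against the undercloud only because the nested ansible-playbook command returned rc 4; the hosts that actually could not be reached are compute-0 and controller-0.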

Comment 1 Tzach Shefi 2018-07-25 19:54:52 UTC
Created attachment 1470550 [details]
sosreport from undercloud

Comment 2 Giulio Fidente 2018-07-25 21:35:09 UTC
I think this is a dup of BZ #1601382; can you check which version of openstack-tripleo-heat-templates you are using?

Comment 3 Tzach Shefi 2018-07-26 07:35:08 UTC
Created attachment 1470629 [details]
Controller sos

Yogev found an issue with my infrared job, which initially kept the LVM backend enabled. However, since that run had also failed, I later ran overcloud_deploy.sh manually, and that still failed.

All the logs, including the initial ones I uploaded, were collected after I had run overcloud_deploy.sh by hand a second time.

Comment 5 Tzach Shefi 2018-07-26 07:53:43 UTC
Sorry Giulio, I missed your comment before I uploaded that sosreport.

This was my version:
openstack-tripleo-heat-templates-9.0.0-0.20180710202746.d2994ca.el7ost.noarch

The Fixed In Version on the dup, BZ #1601382, is higher.
I guess I'll just wait or retry with a more recent build.

Thanks

Comment 6 Tzach Shefi 2018-07-26 10:54:04 UTC
Something is fishy here: I just managed to deploy with internal ceph:1.
It completed fine on two separate deployments; both were non-HA, as there were not enough host resources.

The odd thing I can't explain is the THT version.
This time I chose phase1, in the hope that it would bump up the THT version.
Instead it looks as if it bumped the version down, resulting in an earlier THT version than phase2 provided. How could this be?

Phase1 resulted in a working env, THT version:
openstack-tripleo-heat-templates-9.0.0-0.20180703131156.de62fe3.el7ost.noarch

While phase2 failed to deploy, THT version:
openstack-tripleo-heat-templates-9.0.0-0.20180710202746.d2994ca.el7ost.noarch

Comment 7 Giulio Fidente 2018-07-26 11:02:15 UTC
I believe this is expected; the issue you are seeing comes from an addition (the nodes-uuid playbook) that was not included in earlier builds.

As a result, phase1 has a working build that completely lacks a wanted feature, while phase2 is testing a newer build that is affected by the bug.

The bug affecting phase2 should be resolved in openstack-tripleo-heat-templates-9.0.0-0.20180717094149.d8b7b19.el7ost
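
The ordering of the three builds mentioned in this thread can be checked from the snapshot timestamp embedded in each package name. This is a rough Python sketch, assuming the release field follows the pattern `0.<YYYYMMDDhhmmss>.<git short hash>.el7ost`, which is inferred only from the package names quoted here; `snapshot_time` is a hypothetical helper, not an rpm or TripleO API.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the package names quoted in this bug only:
# <name>-<version>-0.<YYYYMMDDhhmmss>.<git short hash>.el7ost[.noarch]
SNAPSHOT = re.compile(r"-0\.(?P<stamp>\d{14})\.(?P<commit>[0-9a-f]+)\.el7ost")


def snapshot_time(package):
    """Return the snapshot timestamp embedded in a package name."""
    match = SNAPSHOT.search(package)
    if match is None:
        raise ValueError("no snapshot timestamp found in %r" % package)
    return datetime.strptime(match.group("stamp"), "%Y%m%d%H%M%S")


# Build named in comment 7 as the one that should resolve the bug.
FIXED_BUILD = (
    "openstack-tripleo-heat-templates-9.0.0-0.20180717094149.d8b7b19.el7ost"
)

# Builds reported in comment 6.
builds = {
    "phase1": "openstack-tripleo-heat-templates-9.0.0-0.20180703131156.de62fe3.el7ost.noarch",
    "phase2": "openstack-tripleo-heat-templates-9.0.0-0.20180710202746.d2994ca.el7ost.noarch",
}

fixed_at = snapshot_time(FIXED_BUILD)
for label, package in builds.items():
    built_at = snapshot_time(package)
    print(label, built_at.isoformat(), "predates fixed build:", built_at < fixed_at)
```

Both builds predate the fixed one, so the timestamp alone does not say whether a build is affected: phase1 works only because it is older than the nodes-uuid addition, while phase2 sits between that addition and the fix.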

Comment 10 Giulio Fidente 2018-09-05 11:50:22 UTC

*** This bug has been marked as a duplicate of bug 1601382 ***

