Bug 1601382

Summary: Unreachable overcloud nodes during "run nodes-uuid" task
Product: Red Hat OpenStack
Reporter: Filip Hubík <fhubik>
Component: openstack-tripleo-heat-templates
Assignee: Giulio Fidente <gfidente>
Status: CLOSED ERRATA
QA Contact: Filip Hubík <fhubik>
Severity: high
Priority: high
Version: 14.0 (Rocky)
CC: gfidente, johfulto, mburns, tshefi, tvignaud
Target Milestone: Upstream M3
Keywords: Automation, AutomationBlocker, Triaged
Target Release: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-9.0.0-0.20180717094149.d8b7b19.el7ost
Last Closed: 2019-01-11 11:50:39 UTC
Type: Bug
Attachments:
/var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ansible.log (flags: none)
var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ceph-ansible/nodes_uuid_command.log (flags: none)

Description Filip Hubík 2018-07-16 08:59:19 UTC
Created attachment 1459086
/var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ansible.log

Description of problem:

The Mistral-driven deployment of ceph-ansible fails with the following errors:

TASK [run nodes-uuid] **********************************************************
Friday 13 July 2018  20:56:52 -0400 (0:00:00.039)       0:10:15.964 *********** 
fatal: [undercloud]: FAILED! => {"changed": true, "cmd": "ANSIBLE_LOG_PATH=\"/var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ceph-ansible/nodes_uuid_command.log\" ANSIBLE_SSH_RETRIES=3 ANSIBLE_HOST_KEY_CHECKING=False DEFAULT_FORKS=25 ansible-playbook --private-key /var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ssh_private_key -i /var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ceph-ansible/inventory.yml /var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ceph-ansible/nodes_uuid_playbook.yml", "delta": "0:00:01.307249", "end": "2018-07-13 20:56:54.254304", "msg": "non-zero return code", "rc": 4, "start": "2018-07-13 20:56:52.947055", "stderr": "", "stderr_lines": [], "stdout": "\nPLAY [all] *********************************************************************\n\nTASK [set nodes data] **********************************************************\nFriday 13 July 2018  20:56:53 -0400 (0:00:00.068)       0:00:00.069 *********** \nok: [compute-0]\nok: [ceph-0]\nok: [controller-0]\n\nTASK [register machine id] *****************************************************\nFriday 13 July 2018  20:56:53 -0400 (0:00:00.066)       0:00:00.135 *********** \nchanged: [ceph-0]\nchanged: [controller-0]\nchanged: [compute-0]\n\nTASK [generate host vars from nodes data] **************************************\nFriday 13 July 2018  20:56:54 -0400 (0:00:00.510)       0:00:00.645 *********** \nfatal: [compute-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". 
Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.17-222746662763579 `\\\" && echo ansible-tmp-1531529814.17-222746662763579=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.17-222746662763579 `\\\" ), exited with result 1\", \"unreachable\": true}\nfatal: [ceph-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.18-201209661451782 `\\\" && echo ansible-tmp-1531529814.18-201209661451782=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.18-201209661451782 `\\\" ), exited with result 1\", \"unreachable\": true}\nfatal: [controller-0]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.19-31199768599421 `\\\" && echo ansible-tmp-1531529814.19-31199768599421=\\\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.19-31199768599421 `\\\" ), exited with result 1\", \"unreachable\": true}\n\nPLAY RECAP *********************************************************************\nceph-0                     : ok=2    changed=1    unreachable=1    failed=0   \ncompute-0                  : ok=2    changed=1    unreachable=1    failed=0   \ncontroller-0               : ok=2    changed=1    unreachable=1    failed=0   \n\nFriday 13 July 2018  20:56:54 -0400 (0:00:00.068)       0:00:00.713 ***********
...
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
ceph-0                     : ok=89   changed=42   unreachable=0    failed=0   
compute-0                  : ok=107  changed=44   unreachable=0    failed=0   
controller-0               : ok=147  changed=45   unreachable=0    failed=0   
undercloud                 : ok=19   changed=10   unreachable=0    failed=1   
Friday 13 July 2018  20:56:54 -0400 (0:00:01.509)       0:10:17.474 *********** 
===============================================================================
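
For triage, the PLAY RECAP rows above can be scanned for hosts with a non-zero "unreachable" or "failed" counter. The following is only a minimal sketch, assuming the recap rows keep the "host : key=value ..." layout shown in this log; parse_recap and problem_hosts are hypothetical helper names, not part of Ansible or TripleO.

```python
import re

# One recap row looks like: "undercloud   : ok=19   changed=10   unreachable=0    failed=1"
RECAP_RE = re.compile(r'^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$')


def parse_recap(lines):
    """Map host -> {counter: int} for each line that looks like a recap row."""
    recap = {}
    for line in lines:
        match = RECAP_RE.match(line.strip())
        if not match:
            continue  # banners, timestamps and task output are skipped
        pairs = (pair.split("=") for pair in match.group("counters").split())
        recap[match.group("host")] = {key: int(value) for key, value in pairs}
    return recap


def problem_hosts(recap):
    """Return the sorted hosts whose 'unreachable' or 'failed' counter is non-zero."""
    return sorted(
        host
        for host, counters in recap.items()
        if counters.get("unreachable", 0) or counters.get("failed", 0)
    )
```

Fed the outer recap from ansible.log, this would single out the undercloud host (failed=1); fed the recap from nodes_uuid_command.log, it would list all three overcloud nodes (unreachable=1).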

In addition, {{playbook_dir}}/ceph-ansible/nodes_uuid_command.log reports:

TASK [generate host vars from nodes data] **************************************
Friday 13 July 2018  20:56:54 -0400 (0:00:00.510)       0:00:00.645 *********** 
fatal: [compute-0]: UNREACHABLE! => {"changed": false, "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.17-222746662763579 `\" && echo ansible-tmp-1531529814.17-222746662763579=\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.17-222746662763579 `\" ), exited with result 1", "unreachable": true}
fatal: [ceph-0]: UNREACHABLE! => {"changed": false, "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.18-201209661451782 `\" && echo ansible-tmp-1531529814.18-201209661451782=\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.18-201209661451782 `\" ), exited with result 1", "unreachable": true}
fatal: [controller-0]: UNREACHABLE! => {"changed": false, "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.19-31199768599421 `\" && echo ansible-tmp-1531529814.19-31199768599421=\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1531529814.19-31199768599421 `\" ), exited with result 1", "unreachable": true}
PLAY RECAP *********************************************************************
ceph-0                     : ok=2    changed=1    unreachable=1    failed=0   
compute-0                  : ok=2    changed=1    unreachable=1    failed=0   
controller-0               : ok=2    changed=1    unreachable=1    failed=0   
Friday 13 July 2018  20:56:54 -0400 (0:00:00.068)       0:00:00.713 *********** 
===============================================================================
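
To see at a glance which nodes failed and which remote temporary directory Ansible tried to create, the UNREACHABLE lines above can be parsed. This is a rough sketch, assuming each line has the form "fatal: [host]: UNREACHABLE! => {JSON}" as in nodes_uuid_command.log (it does not unwrap the escaped copy nested inside ansible.log); unreachable_hosts is a hypothetical helper name.

```python
import json
import re

# Matches lines such as: fatal: [compute-0]: UNREACHABLE! => {...}
UNREACHABLE_RE = re.compile(
    r'^fatal: \[(?P<host>[^\]]+)\]: UNREACHABLE! => (?P<payload>\{.*\})\s*$'
)
# Pulls the directory out of: mkdir -p "` echo <path> `"
TMPDIR_RE = re.compile(r'mkdir -p "` echo (?P<path>\S+) `"')


def unreachable_hosts(log_text):
    """Return {host: remote_tmp_dir or None} for every UNREACHABLE line."""
    result = {}
    for line in log_text.splitlines():
        match = UNREACHABLE_RE.match(line)
        if not match:
            continue
        host = match.group("host")
        try:
            payload = json.loads(match.group("payload"))
        except ValueError:
            # The payload was truncated or wrapped; record the host only.
            result[host] = None
            continue
        tmp = TMPDIR_RE.search(payload.get("msg", ""))
        result[host] = tmp.group("path") if tmp else None
    return result
```

On the excerpt above, every overcloud node maps to a directory under /home/mistral/.ansible/tmp, which is consistent with the failure being about access to that path on the nodes rather than about network reachability.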


Version-Release number of selected component (if applicable):
Puddle 2018-07-13.3

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSPd14 using the InfraRed virthost topology 1:1:1:1
2. The overcloud deployment fails with the error shown above

Additional info:
This looks like an issue with access to the overcloud nodes; see https://github.com/openstack/tripleo-heat-templates/blob/master/docker/services/ceph-ansible/ceph-base.yaml#L418

Possibly related packages:
ceph-ansible.noarch 3.1.0-0.1.rc9.el7cp
openstack-tripleo-common.noarch 9.1.1-0.20180710151736.8e8dabd.el7ost

Comment 1 Filip Hubík 2018-07-16 09:00:07 UTC
Created attachment 1459088
var/lib/mistral/dab7ef10-b96d-44c4-a639-4270c8a6d019/ceph-ansible/nodes_uuid_command.log

Comment 7 Filip Hubík 2018-08-27 14:09:17 UTC
The overcloud deployment and the "run nodes-uuid" task both passed. Tested on puddle 2018-07-13.3 with the fixed package openstack-tripleo-heat-templates-9.0.0-0.20180717094149.d8b7b19.el7ost.noarch.

Comment 8 Giulio Fidente 2018-09-05 11:50:22 UTC
*** Bug 1608558 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2019-01-11 11:50:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045