Bug 1593345 - Undercloud is not reachable by mistral: "Authentication or permission failure"
Summary: Undercloud is not reachable by mistral: "Authentication or permission failure"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: Upstream M3
Target Release: 14.0 (Rocky)
Assignee: Adriano Petrich
QA Contact: Marius Cornea
URL:
Whiteboard:
Duplicates: 1594385
Depends On:
Blocks: 1596260
Reported: 2018-06-20 15:40 UTC by Filip Hubík
Modified: 2019-01-11 11:50 UTC (History)

Fixed In Version: openstack-tripleo-common-9.1.1-0.20180623003933.5191b65.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1596260
Environment:
Last Closed: 2019-01-11 11:50:06 UTC
Target Upstream Version:
Embargoed:


Attachments
ansible.log for mistral step failed on UC (1.25 KB, text/plain)
2018-06-20 15:42 UTC, Filip Hubík


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ansible/ansible/issues/41808 | 0 | 'None' | closed | regression with local temp locations in 2.5.4 | 2020-11-24 22:30:26 UTC
Launchpad | 1778269 | 0 | None | None | None | 2018-06-22 19:53:36 UTC
OpenStack gerrit | 577544 | 0 | 'None' | MERGED | Switch ansible tmp for local connections | 2020-11-24 22:30:03 UTC
Red Hat Product Errata | RHEA-2019:0045 | 0 | None | None | None | 2019-01-11 11:50:31 UTC

Description Filip Hubík 2018-06-20 15:40:28 UTC
Description of problem:

2018-06-20 14:43:47Z [overcloud.AllNodesDeploySteps.ComputePostConfig]: CREATE_COMPLETE  state changed
2018-06-20 14:43:47Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  Stack CREATE completed successfully
2018-06-20 14:43:48Z [overcloud.AllNodesDeploySteps]: CREATE_COMPLETE  state changed
2018-06-20 14:43:48Z [overcloud]: CREATE_COMPLETE  Stack CREATE completed successfully

 Stack overcloud/daa56963-8ba4-49b9-8242-d6b5c74f2dc4 CREATE_COMPLETE 

Deploying overcloud configuration
Enabling ssh admin (tripleo-admin) for hosts:
192.168.24.18 192.168.24.11 192.168.24.15
Using ssh user heat-admin for initial connection.
Using ssh key at /home/stack/.ssh/id_rsa for initial connection.
Inserting TripleO short term key for 192.168.24.18
Inserting TripleO short term key for 192.168.24.11
Inserting TripleO short term key for 192.168.24.15
Starting ssh admin enablement workflow
ssh admin enablement workflow - RUNNING.
ssh admin enablement workflow - RUNNING.
ssh admin enablement workflow - RUNNING.
ssh admin enablement workflow - COMPLETE.
Removing TripleO short term key from 192.168.24.18
Removing TripleO short term key from 192.168.24.11
Removing TripleO short term key from 192.168.24.15
Removing short term keys locally
Enabling ssh admin - COMPLETE.
Config downloaded at /var/lib/mistral/9c5ad74e-3c88-4367-8502-f9f22fb86a49
Inventory generated at /var/lib/mistral/9c5ad74e-3c88-4367-8502-f9f22fb86a49/tripleo-ansible-inventory.yaml
Running ansible playbook at /var/lib/mistral/9c5ad74e-3c88-4367-8502-f9f22fb86a49/deploy_steps_playbook.yaml. See log file at /var/lib/mistral/9c5ad74e-3c88-4367-8502-f9f22fb86a49/ansible.log for progress. ...
Overcloud configuration failed.

Using /var/lib/mistral/9c5ad74e-3c88-4367-8502-f9f22fb86a49/ansible.cfg as config file

PLAY [Gather facts from undercloud] ********************************************

TASK [Gathering Facts] *********************************************************
fatal: [undercloud]: UNREACHABLE! => {"changed": false, "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo /home/mistral/.ansible/tmp/ansible-tmp-1529505902.44-148383122247259 `\" && echo ansible-tmp-1529505902.44-148383122247259=\"` echo /home/mistral/.ansible/tmp/ansible-tmp-1529505902.44-148383122247259 `\" ), exited with result 1", "unreachable": true}

PLAY RECAP *********************************************************************
undercloud                 : ok=0    changed=0    unreachable=1    failed=0


Version-Release number of selected component (if applicable):
OSPd14

How reproducible:
always

Steps to Reproduce:
1. Deploy any OSPd14 topology using InfraRed and puddle 2018-06-19.4

Actual results:
The overcloud deploy stage fails with the error shown above.

Additional info:
The overcloud stack is created successfully; the post-deployment mistral step fails.

Comment 1 Filip Hubík 2018-06-20 15:42:52 UTC
Created attachment 1453236
ansible.log for mistral step failed on UC

/var/lib/mistral/XYZ/ansible.log

Comment 3 Alex Schultz 2018-06-20 20:34:35 UTC
I was able to recreate this. It seems to happen only when mistral runs the config download items. When I manually ran the ansible playbook script in /var/lib/mistral/<uuid>/ after the fact as root, it ran fine.

James, have you seen this one before?

Comment 4 Pavel Sedlák 2018-06-21 07:47:54 UTC
The mistral user, which I believe /var/lib/mistral/<uuid>/ansible-playbook-command.sh is executed as, does not have a home directory created.

In passwd there is mistral:x:988:985:Mistral Daemons:/home/mistral:/sbin/nologin
but /home/mistral does not exist.

(That is also why running as root works: root's home exists, so the ansible tmp path exists or can be created.)


mkdir /home/mistral; chown mistral:mistral /home/mistral
enables the playbook to get past the undercloud fact-gathering step.
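
The check described above can be sketched in Python. This is only an illustration, not part of any fix: the function names home_from_passwd and missing_home are made up for this example, and the passwd entry is the one quoted in this comment.

```python
import os
from typing import Callable, Optional


def home_from_passwd(line: str) -> str:
    """Return the home directory field (6th of 7) from a passwd(5) entry."""
    fields = line.strip().split(":")
    if len(fields) != 7:
        raise ValueError("expected 7 colon-separated fields, got %d" % len(fields))
    return fields[5]


def missing_home(line: str,
                 isdir: Callable[[str], bool] = os.path.isdir) -> Optional[str]:
    """Return the home path if it is not an existing directory, else None.

    The isdir parameter is a hypothetical hook so the check can be
    exercised without touching the real filesystem.
    """
    home = home_from_passwd(line)
    return None if isdir(home) else home


if __name__ == "__main__":
    # Entry quoted from /etc/passwd in this bug report.
    entry = "mistral:x:988:985:Mistral Daemons:/home/mistral:/sbin/nologin"
    print(missing_home(entry))
```

On an affected undercloud this would print /home/mistral; once the directory has been created as shown above, it would print None.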

Comment 5 Alex Schultz 2018-06-21 16:09:22 UTC
That would seem to be a packaging issue with mistral, though we haven't seen this issue upstream, which makes me wonder why we only hit this downstream.

Comment 7 Alex Schultz 2018-06-21 16:42:01 UTC
It seems that we're using ansible-2.5.4, while upstream we use 2.5.2. The connection it's failing on is supposed to be localhost, so it's not supposed to be using ssh. It's likely that there's an issue in our ansible cfg around this.

Comment 8 Alex Schultz 2018-06-21 17:15:03 UTC
I think I've tracked this down to likely https://github.com/ansible/ansible/commit/864fd7c53e45703554bb6de608fe13a2200b6aa0

It appears that the local connection temp pathing has changed in ansible-2.5.4. I'm trying to figure out how we can work around this without setting remote_tmp, because that would have other impacts.
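
For reference, a minimal sketch of what a temp-path override in ansible.cfg could look like. Ansible has both a remote_tmp and a local_tmp key in the [defaults] section; which of them the 2.5.4 local connection actually consults is not established in this report, and this is not claimed to be what the eventual patch does. The helper name render_ansible_cfg and the example path are hypothetical.

```python
import configparser
import io


def render_ansible_cfg(local_tmp: str) -> str:
    """Render an ansible.cfg fragment that sets [defaults] local_tmp.

    Assumption: overriding local_tmp (rather than remote_tmp) is enough;
    this bug report does not confirm that.
    """
    config = configparser.ConfigParser()
    config.add_section("defaults")
    config.set("defaults", "local_tmp", local_tmp)
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical directory that the mistral user is able to write to.
    print(render_ansible_cfg("/var/lib/mistral/ansible-local-tmp"))
```

The fragment would need to be merged into the ansible.cfg that mistral generates in the working directory, and tested against the affected ansible version before relying on it.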

Comment 9 Alex Schultz 2018-06-21 18:27:36 UTC
Raised the issue with ansible. The current workaround is to downgrade to ansible 2.5.2. I've confirmed it is an issue with 2.5.4+, but it should work in 2.4.2.
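
A small sketch for flagging the ansible versions named in this comment. It treats 2.5.4 and anything later as affected, because no fixed ansible release is identified in this report; that open upper bound is an assumption. The function names are hypothetical, and only plain dotted numeric versions are handled.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted numeric version such as '2.5.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version: str) -> bool:
    """Return True for 2.5.4 and later.

    Assumption: there is no upper bound, since this report does not name
    an ansible release that contains a fix.
    """
    return parse_version(version) >= (2, 5, 4)


if __name__ == "__main__":
    for candidate in ("2.4.2", "2.5.2", "2.5.4"):
        print(candidate, is_affected(candidate))
```

Comparing integer tuples rather than strings keeps versions such as 2.5.10 ordered after 2.5.4.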

Comment 10 Filip Hubík 2018-06-22 09:56:42 UTC
I can confirm that manually creating the home directory for the mistral user before the overcloud deployment works around this specific issue.

Comment 11 Alex Schultz 2018-06-22 21:00:58 UTC
*** Bug 1594385 has been marked as a duplicate of this bug. ***

Comment 16 Adriano Petrich 2018-11-12 10:14:49 UTC
Done

Comment 18 errata-xmlrpc 2019-01-11 11:50:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045

