Bug 1644702 - [UPGRADES][14] Converge failed during 'ceph-config : generate ceph.conf configuration file'
Summary: [UPGRADES][14] Converge failed during ' ceph-config : generate ceph.conf conf...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 14.0 (Rocky)
Assignee: Jiri Stransky
QA Contact: Yurii Prokulevych
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-31 12:41 UTC by Yurii Prokulevych
Modified: 2023-02-22 23:02 UTC (History)

Fixed In Version: python-tripleoclient-10.6.1-0.20181010222405.8c8f259.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-11 11:54:26 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Launchpad | 1801066 | 0 | None | None | None | 2018-11-01 12:20:26 UTC
OpenStack gerrit | 615829 | 0 | None | MERGED | Always run upgrades/updates as tripleo-admin | 2020-04-09 08:01:20 UTC
Red Hat Product Errata | RHEA-2019:0045 | 0 | None | None | None | 2019-01-11 11:54:40 UTC

Description Yurii Prokulevych 2018-10-31 12:41:58 UTC
Description of problem:
-----------------------
The overcloud upgrade converge step failed while running the Ceph configuration:

openstack overcloud upgrade converge \
    --templates /usr/share/openstack-tripleo-heat-templates \
    --stack qe-Cloud-0 \
    -e /home/stack/virt/internal.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation-v6.yaml \
    -e /home/stack/virt/network/network-environment-v6.yaml \
    -e /home/stack/virt/enable-tls.yaml \
    -e /home/stack/virt/inject-trust-anchor.yaml \
    -e /home/stack/virt/public_vip.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
    -e /home/stack/virt/hostnames.yml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
    -e /home/stack/virt/debug.yaml \
    -e /home/stack/virt/nodes_data.yaml \
    -e /home/stack/cli_opts_params.yaml \
    -e /home/stack/containers-prepare-parameter.yaml \
    --roles-file /usr/share/openstack-tripleo-heat-templates/roles_data.yaml 2>&1
...
        "TASK [ceph-config : generate ceph.conf configuration file] *********************", 
        "task path: /usr/share/ceph-ansible/roles/ceph-config/tasks/main.yml:84", 
        "Wednesday 31 October 2018  07:15:35 -0400 (0:00:00.306)       0:00:53.352 ***** ", 
        "fatal: [controller-2]: UNREACHABLE! => {\"changed\": false, \"msg\": \"Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \\\"/tmp\\\". Failed command was: ( umask 77 && mkdir -p \\\"` echo /tmp/ceph_ansible_tmp/ansible-tmp-1540984535.5-252804450955403 `\\\" && echo ansible-tmp-1540984535.5-252804450955403=\\\"` echo /tmp/ceph_ansible_tmp/ansible-tmp-1540984535.5-252804450955403 `\\\" ), exited with result 1\", \"unreachable\": true}", 
        "PLAY RECAP *********************************************************************", 
        "ceph-0                     : ok=2    changed=0    unreachable=0    failed=0   ", 
        "ceph-1                     : ok=2    changed=0    unreachable=0    failed=0   ", 
        "ceph-2                     : ok=2    changed=0    unreachable=0    failed=0   ", 
        "compute-0                  : ok=2    changed=0    unreachable=0    failed=0   ", 
        "compute-1                  : ok=2    changed=0    unreachable=0    failed=0   ", 
        "controller-0               : ok=2    changed=0    unreachable=0    failed=0   ", 
        "controller-1               : ok=2    changed=0    unreachable=0    failed=0   ", 
        "controller-2               : ok=44   changed=4    unreachable=1    failed=0   ", 


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-tripleo-heat-templates-9.0.1-0.20181013060864.ffbe879.el7ost.noarch
ceph-ansible-3.1.5-1.el7cp.noarch

Steps to Reproduce:
-------------------
1. Upgrade the undercloud to version 14.
2. Follow the upgrade procedure for the overcloud.
3. Run the `openstack overcloud upgrade converge` command at the end of the upgrade.

Actual results:
---------------
Overcloud upgrade converge failed

Expected results:
-----------------
Overcloud upgrade converge succeeds

Additional info:
----------------
Virtual environment: 3 controllers + 2 computes + 3 ceph nodes

Comment 2 John Fulton 2018-10-31 15:03:23 UTC
The remote_user is set to tripleo-admin (http://ix.io/1qxa), but the remote_temp_dir is owned by heat-admin (http://ix.io/1qx6).
As a result, tripleo-admin cannot write to the remote_temp_dir.

The bug is that Ansible runs should be executed as tripleo-admin, not as heat-admin.

If the upgrade run is happening as heat-admin, then hopefully that can be changed to tripleo-admin before the ceph-ansible step is run on converge.
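
The ownership mismatch described above can be checked directly on the affected node. The following is a minimal Python sketch, not part of ceph-ansible or TripleO: the helper name `describe_tmp_dir` is hypothetical, and it deliberately ignores group membership and ACLs, so it only approximates whether a given user could create entries in the directory.

```python
import os
import pwd
import stat


def describe_tmp_dir(path, remote_user):
    """Report the owner and mode of `path`, and whether `remote_user`
    could create entries in it (hypothetical helper, simplified check)."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the numeric id.
        owner = str(st.st_uid)
    mode = stat.S_IMODE(st.st_mode)
    if owner == remote_user:
        # The owner needs write and search permission on the directory.
        needed = stat.S_IWUSR | stat.S_IXUSR
    else:
        # Simplification: group membership and ACLs are ignored, so any
        # non-owner is treated as "other".
        needed = stat.S_IWOTH | stat.S_IXOTH
    return {
        "owner": owner,
        "mode": oct(mode),
        "writable_by_remote_user": (mode & needed) == needed,
    }
```

On a node showing this failure, one might call `describe_tmp_dir("/tmp/ceph_ansible_tmp", "tripleo-admin")`, using the path from the failed command in the log. If the reported owner is heat-admin and the mode grants nothing to other users, that would be consistent with the analysis in this comment.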

Comment 9 errata-xmlrpc 2019-01-11 11:54:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045

