Bug 1705248

Summary: Overcloud deployment with Ceph fails during TASK [Remove ceph-ansible fetch directory]
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: John Fulton <johfulto>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: high
Docs Contact:
Priority: high
Version: 15.0 (Stein)
CC: dbecker, elicohen, fpantano, gfidente, johfulto, jvisser, mburns, morazi, scohen, ssmolyak
Target Milestone: beta
Keywords: Triaged
Target Release: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-10.5.1-0.20190509160423.4dac4dc.el8ost.noarch
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-09-21 11:21:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1702498

Description Marius Cornea 2019-05-01 19:45:17 UTC
Description of problem:
Overcloud deployment with Ceph fails during TASK [Remove ceph-ansible fetch directory]:

2019-05-01 19:38:58,827 p=436 u=mistral |  TASK [backup temporary ceph-ansible fetch directory tarball in swift] **********
2019-05-01 19:38:58,827 p=436 u=mistral |  Wednesday 01 May 2019  19:38:58 +0000 (0:00:00.493)       0:28:24.689 ********* 
2019-05-01 19:38:59,137 p=436 u=mistral |  changed: [undercloud] => {"changed": true, "cmd": "curl  -s -o /dev/null -w '%{http_code}' -X PUT -T /tmp/temporary_dir_new.tar.gz \"https://192.168.24.2:13808/v1/AUTH_bf04d89ff9fd45c6a7d8359edc7507a0/overcloud_ceph_ansible_fetch_dir/temporary_dir.tar.gz?temp_url_sig=2a987b8436c0ec20382ffdc44b4314aad3c21d82&temp_url_expires=1556823729\"", "delta": "0:00:00.054594", "end": "2019-05-01 19:38:59.111408", "rc": 0, "start": "2019-05-01 19:38:59.056814", "stderr": "", "stderr_lines": [], "stdout": "201", "stdout_lines": ["201"]}
2019-05-01 19:38:59,197 p=436 u=mistral |  TASK [ensure we were able to backup temporary fetch directory to swift] ********
2019-05-01 19:38:59,197 p=436 u=mistral |  Wednesday 01 May 2019  19:38:59 +0000 (0:00:00.370)       0:28:25.059 ********* 
2019-05-01 19:38:59,234 p=436 u=mistral |  skipping: [undercloud] => {"changed": false, "skip_reason": "Conditional result was False"}
2019-05-01 19:38:59,291 p=436 u=mistral |  TASK [clean temporary fetch directory after swift backup] **********************
2019-05-01 19:38:59,291 p=436 u=mistral |  Wednesday 01 May 2019  19:38:59 +0000 (0:00:00.093)       0:28:25.153 ********* 
2019-05-01 19:38:59,553 p=436 u=mistral |  changed: [undercloud] => {"changed": true, "path": "/tmp/temporary_dir_new.tar.gz", "state": "absent"}
2019-05-01 19:38:59,616 p=436 u=mistral |  TASK [Remove ceph-ansible fetch directory] *************************************
2019-05-01 19:38:59,616 p=436 u=mistral |  Wednesday 01 May 2019  19:38:59 +0000 (0:00:00.324)       0:28:25.477 ********* 
2019-05-01 19:38:59,868 p=436 u=mistral |  fatal: [undercloud]: FAILED! => {"changed": false, "msg": "rmtree failed: [Errno 13] Permission denied: 'ceph.mgr.controller-0.keyring'"}
2019-05-01 19:38:59,869 p=436 u=mistral |  NO MORE HOSTS LEFT *************************************************************
2019-05-01 19:38:59,870 p=436 u=mistral |  PLAY RECAP *********************************************************************
2019-05-01 19:38:59,871 p=436 u=mistral |  ceph-0                     : ok=141  changed=55   unreachable=0    failed=0    skipped=380  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  ceph-1                     : ok=141  changed=55   unreachable=0    failed=0    skipped=380  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  ceph-2                     : ok=141  changed=55   unreachable=0    failed=0    skipped=380  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  compute-0                  : ok=169  changed=78   unreachable=0    failed=0    skipped=352  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  compute-1                  : ok=169  changed=78   unreachable=0    failed=0    skipped=352  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  controller-0               : ok=233  changed=134  unreachable=0    failed=0    skipped=288  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  controller-1               : ok=233  changed=134  unreachable=0    failed=0    skipped=288  rescued=0    ignored=1   
2019-05-01 19:38:59,871 p=436 u=mistral |  controller-2               : ok=233  changed=134  unreachable=0    failed=0    skipped=288  rescued=0    ignored=1   
2019-05-01 19:38:59,872 p=436 u=mistral |  undercloud                 : ok=42   changed=24   unreachable=0    failed=1    skipped=44   rescued=0    ignored=0   
2019-05-01 19:38:59,872 p=436 u=mistral |  Wednesday 01 May 2019  19:38:59 +0000 (0:00:00.256)       0:28:25.734 ********* 
2019-05-01 19:38:59,872 p=436 u=mistral |  =============================================================================== 


[root@undercloud-0 mistral]# ls -ld ./overcloud/ceph-ansible/fetch_dir/30fb5e4a-6c42-11e9-ac85-5254006282e6/etc/ceph/ceph.mgr.controller-0.keyring
-rw-r--r--. 1 root root 143 May  1 19:34 ./overcloud/ceph-ansible/fetch_dir/30fb5e4a-6c42-11e9-ac85-5254006282e6/etc/ceph/ceph.mgr.controller-0.keyring
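
The listing above shows the keyring owned by root, while the playbook log lines are tagged u=mistral. The report does not state the root cause, so the following is only a diagnostic sketch under the assumption that root-owned entries in the fetch directory are what the mistral user cannot remove. The helper name find_foreign_owned is hypothetical and is not part of tripleo or ceph-ansible; it walks a directory tree and lists every entry whose owner differs from the user running the check.

```python
import os


def find_foreign_owned(root, uid=None):
    """Return sorted paths under root whose owner differs from uid.

    uid defaults to the effective UID of the current process.
    Hypothetical helper for illustration only.
    """
    if uid is None:
        uid = os.geteuid()
    foreign = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat reports a symlink's own owner instead of its target's
            if os.lstat(path).st_uid != uid:
                foreign.append(path)
    return sorted(foreign)


if __name__ == "__main__":
    # Path taken from the listing above; run as the user that executes the playbook.
    for path in find_foreign_owned("./overcloud/ceph-ansible/fetch_dir"):
        print(path)
```

Run as the mistral user from the directory used in the listing above, any path it prints is a candidate for the Permission denied error, since removing an entry requires write access to its parent directory.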


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy an OSP15 overcloud with Ceph enabled.

Actual results:
The deployment fails during TASK [Remove ceph-ansible fetch directory] with "rmtree failed: [Errno 13] Permission denied: 'ceph.mgr.controller-0.keyring'".

Expected results:
The overcloud deployment completes without failures.

Additional info:

Comment 4 Eliad Cohen 2019-07-16 14:20:22 UTC
Verified fixed

Comment 8 errata-xmlrpc 2019-09-21 11:21:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811

Comment 9 John Fulton 2019-09-23 13:28:19 UTC
Clearing needinfo. No doctext necessary.