Description of problem:
Hello,
after the customer upgraded their external Ceph cluster to 5.2 (16.2.8-85.el8cp) and ran a test deployment (without modifying anything), the deployment fails with:

---
"fatal: [compute01 -> {{ groups[mon_group_name][0] }}]: FAILED! => {\"msg\": \"'dict object' has no attribute 'mons'\"}",
---

which appears in the following output:

---
2022-12-01 14:03:17.194443 | 52540064-3d0a-466a-5e53-000000007ba9 | TIMING | tripleo-ceph-run-ansible : search output of ceph-ansible run(s) non-zero return codes | undercloud | 0:12:41.597316 | 0.11s
2022-12-01 14:03:17.195019 | 52540064-3d0a-466a-5e53-000000007ba9 | OK | search output of ceph-ansible run(s) non-zero return codes | undercloud
2022-12-01 14:03:17.195200 | 52540064-3d0a-466a-5e53-000000007ba9 | TIMING | tripleo-ceph-run-ansible : search output of ceph-ansible run(s) non-zero return codes | undercloud | 0:12:41.598081 | 0.11s
2022-12-01 14:03:17.248733 | 52540064-3d0a-466a-5e53-000000007baa | TASK | print ceph-ansible output in case of failure
2022-12-01 14:03:17.307166 | 52540064-3d0a-466a-5e53-000000007baa | FATAL | print ceph-ansible output in case of failure | undercloud | error={
    "ceph_ansible_std_out_err": [
        "Using /usr/share/ceph-ansible/ansible.cfg as config file",
        "[WARNING]: Skipping key (deprecated) in group (overcloud) as it is not a",
        "mapping, it is a <class 'ansible.parsing.yaml.objects.AnsibleUnicode'>",
        "[WARNING]: Could not match supplied host pattern, ignoring: mons",
        "[WARNING]: Could not match supplied host pattern, ignoring: osds",
        "[WARNING]: Could not match supplied host pattern, ignoring: mdss",
        "[WARNING]: Could not match supplied host pattern, ignoring: rgws",
        "[WARNING]: Could not match supplied host pattern, ignoring: nfss",
        "[WARNING]: Could not match supplied host pattern, ignoring: rbdmirrors",
        "[WARNING]: Could not match supplied host pattern, ignoring: iscsigws",
        "[WARNING]: Could not match supplied host pattern, ignoring: mgrs",
        "[WARNING]: Could not match supplied host pattern, ignoring: monitoring",
        "",
        "PLAY [mons,osds,mdss,rgws,nfss,rbdmirrors,clients,iscsigws,mgrs,monitoring] ****",
        "TASK [check for python] ********************************************************",
        "Thursday 01 December 2022 14:03:08 +0000 (0:00:00.034) 0:00:00.034 ***** ",
        "ok: [compute01] => (item=/usr/bin/python) => {\"ansible_loop_var\": \"item\"
--------------------------

The templates contain only the Ceph client part, e.g.:

---
parameter_defaults:
  CephClientKey: AQDq8I1jTM/dOhAA5zGIsqhJAc18Adt6OeA7jQ==
  CephClusterFSID: ac628a51-ced3-4991-95dc-1e0f26a2a34f
  CephExternalMonHost: fd00:fd00:fd00:3000::61,fd00:fd00:fd00:3000::62,fd00:fd00:fd00:3000::63
---

I will attach relevant files in a private comment.

Version-Release number of selected component (if applicable):
ceph-ansible-6.0.27.9-1.el8cp.noarch
ansible-2.9.27-1.el8ae.noarch
tripleo-ansible-0.8.1-2.20220406160113.2d0ab9a.el8ost.noarch
external ceph versions: 16.2.8-85.el8cp

How reproducible:
On customer environment

Steps to Reproduce:
1.
2.
3.

Actual results:
Deployment fails on ceph ansible

Expected results:
Deployment succeeds

Additional info:
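For readers unfamiliar with the error message: the warnings above show that the inventory has no "mons" group, so an expression like groups[mon_group_name][0] has nothing to index. The following is a plain-Python analogy of that lookup, not ceph-ansible code; the helper name first_host is hypothetical, and the group and host names are taken from the log.

```python
# Inventory as described in the report: only the client group is populated.
groups = {"clients": ["compute01"]}


def first_host(groups, group_name):
    """Return the first host of a group, or None if the group is missing or empty.

    Hypothetical helper for illustration only.
    """
    hosts = groups.get(group_name, [])
    return hosts[0] if hosts else None


# Unguarded lookup, analogous to {{ groups[mon_group_name][0] }}:
# it fails because the "mons" key does not exist in the inventory.
try:
    delegate = groups["mons"][0]
    unguarded_error = None
except KeyError as exc:
    unguarded_error = f"missing group: {exc}"

# Guarded lookup: a missing group yields None instead of an exception.
guarded = first_host(groups, "mons")
```

With a client-only inventory the unguarded lookup raises, while the guarded one returns None and lets the caller skip the monitor-dependent step. This only illustrates the failure mode; it is not a statement about how ceph-ansible handles it.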
(In reply to Luca Davidde from comment #0)
> Description of problem:
> Hello,
> after upgrading his external ceph cluster to 5.2 (16.2.8-85.el8cp), and
> running a test deployment (so without modifying anything),it fails with:
>
> ---
> "fatal: [compute01 -> {{ groups[mon_group_name][0] }}]: FAILED! => {\"msg\":
> \"'dict object' has no attribute 'mons'\"}",

<snip>

> 2022-12-01 14:03:17.194443 | 52540064-3d0a-466a-5e53-000000007ba9 |
> TIMING | tripleo-ceph-run-ansible : search output of ceph-ansible run(s)

<snip>

> Version-Release number of selected component (if applicable):
> ceph-ansible-6.0.27.9-1.el8cp.noarch
> ansible-2.9.27-1.el8ae.noarch
> tripleo-ansible-0.8.1-2.20220406160113.2d0ab9a.el8ost.noarch

Did the customer upgrade ceph-ansible on their OSP16.2 undercloud to ceph-ansible-6.0.27.9-1.el8cp.noarch?

I think that's the problem. Please downgrade ceph-ansible on the undercloud to the latest version from the repository rhceph-4-tools-for-rhel-8-x86_64-rpms. As of right now that's:

https://access.redhat.com/downloads/content/ceph-ansible/4.0.70.18-1.el8cp/noarch/fd431d51/package

Then re-run the stack update and update this BZ with the results. I'm setting needinfo since that's the information we need next to keep this bug moving.

Explanation:

tripleo-ansible has been tested with the version of ceph-ansible from RHCSv4 but not with the one from RHCSv5. ceph-ansible-6.0.27.9-1.el8cp.noarch comes from rhceph-5-tools-for-rhel-8-x86_64-rpms.

A newer ceph-ansible may be used to manage the external Ceph cluster running RHCSv5 (before they migrate to cephadm), but the undercloud's ceph-ansible only configures the Ceph clients and doesn't need to be upgraded.
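To make the version check concrete, here is a minimal sketch that classifies a ceph-ansible package name by its major version. It assumes, based only on the two package names in this thread, that the RHCSv4 builds are the 4.x series and the RHCSv5 builds are the 6.x series; the function names and the regular expression are hypothetical and not part of any Red Hat tooling.

```python
import re

# Matches names such as "ceph-ansible-4.0.70.18-1.el8cp.noarch" and captures
# the major version. The pattern is an assumption derived from the two
# package names quoted in this bug.
NVR_PATTERN = re.compile(r"^ceph-ansible-(\d+)\.[\w.]+-[\w.]+\.noarch$")


def ceph_ansible_major(nvr: str) -> int:
    """Return the major version from a ceph-ansible package name."""
    match = NVR_PATTERN.match(nvr)
    if match is None:
        raise ValueError(f"not a ceph-ansible package name: {nvr!r}")
    return int(match.group(1))


def matches_rhcs4_series(nvr: str) -> bool:
    """True if the package looks like the 4.x series shipped with RHCSv4."""
    return ceph_ansible_major(nvr) == 4


installed = "ceph-ansible-6.0.27.9-1.el8cp.noarch"   # from comment #0
suggested = "ceph-ansible-4.0.70.18-1.el8cp.noarch"  # from the download link

print(installed, "->", matches_rhcs4_series(installed))
print(suggested, "->", matches_rhcs4_series(suggested))
```

On the undercloud, the string to check would typically come from the output of `rpm -q ceph-ansible`.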
Should be fixed by https://github.com/ceph/ceph-ansible/commit/534fdd9958f51af9570b342ad706cf0d358afb4c