Bug 2151285 - [RHOSP 16.2.3] ceph ansible fails with - 'dict object' has no attribute 'mons' - with external ceph cluster
Summary: [RHOSP 16.2.3] ceph ansible fails with - 'dict object' has no attribute 'mons' - with external ceph cluster
Keywords:
Status: POST
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.3z2
Assignee: Guillaume Abrioux
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks: 1760354
 
Reported: 2022-12-06 15:26 UTC by Luca Davidde
Modified: 2023-01-10 04:35 UTC
CC List: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:




Links
GitHub: ceph/ceph-ansible commit 534fdd9958f51af9570b342ad706cf0d358afb4c (last updated 2023-01-03 16:02:33 UTC)
Red Hat Issue Tracker: RHCEPH-5744 (last updated 2022-12-06 15:38:05 UTC)

Description Luca Davidde 2022-12-06 15:26:34 UTC
Description of problem:
Hello,
after upgrading the customer's external Ceph cluster to 5.2 (16.2.8-85.el8cp) and running a test deployment (so without modifying anything), the deployment fails with:

---
"fatal: [compute01 -> {{ groups[mon_group_name][0] }}]: FAILED! => {\"msg\": \"'dict object' has no attribute 'mons'\"}",
---

which appears to come from the following ceph-ansible output:

---

2022-12-01 14:03:17.194443 | 52540064-3d0a-466a-5e53-000000007ba9 |     TIMING | tripleo-ceph-run-ansible : search output of ceph-ansible run(s) non-zero return codes | undercloud | 0:12:41.597316 | 0.11s
2022-12-01 14:03:17.195019 | 52540064-3d0a-466a-5e53-000000007ba9 |         OK | search output of ceph-ansible run(s) non-zero return codes | undercloud
2022-12-01 14:03:17.195200 | 52540064-3d0a-466a-5e53-000000007ba9 |     TIMING | tripleo-ceph-run-ansible : search output of ceph-ansible run(s) non-zero return codes | undercloud | 0:12:41.598081 | 0.11s
2022-12-01 14:03:17.248733 | 52540064-3d0a-466a-5e53-000000007baa |       TASK | print ceph-ansible output in case of failure
2022-12-01 14:03:17.307166 | 52540064-3d0a-466a-5e53-000000007baa |      FATAL | print ceph-ansible output in case of failure | undercloud | error={
    "ceph_ansible_std_out_err": [
        "Using /usr/share/ceph-ansible/ansible.cfg as config file",
        "[WARNING]: Skipping key (deprecated) in group (overcloud) as it is not a",
        "mapping, it is a <class 'ansible.parsing.yaml.objects.AnsibleUnicode'>",
        "[WARNING]: Could not match supplied host pattern, ignoring: mons",
        "[WARNING]: Could not match supplied host pattern, ignoring: osds",
        "[WARNING]: Could not match supplied host pattern, ignoring: mdss",
        "[WARNING]: Could not match supplied host pattern, ignoring: rgws",
        "[WARNING]: Could not match supplied host pattern, ignoring: nfss",
        "[WARNING]: Could not match supplied host pattern, ignoring: rbdmirrors",
        "[WARNING]: Could not match supplied host pattern, ignoring: iscsigws",
        "[WARNING]: Could not match supplied host pattern, ignoring: mgrs",
        "[WARNING]: Could not match supplied host pattern, ignoring: monitoring",
        "",
        "PLAY [mons,osds,mdss,rgws,nfss,rbdmirrors,clients,iscsigws,mgrs,monitoring] ****",
        "TASK [check for python] ********************************************************",
        "Thursday 01 December 2022  14:03:08 +0000 (0:00:00.034)       0:00:00.034 ***** ",
        "ok: [compute01] => (item=/usr/bin/python) => {\"ansible_loop_var\": \"item\"
--------------------------
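The error itself is a Jinja2 undefined-variable failure: the failing task is delegated to "{{ groups[mon_group_name][0] }}", and the generated inventory has no mons group (see the host-pattern warnings above), so groups['mons'] does not exist. A minimal sketch of how this class of error is produced (illustrative only, not the actual ceph-ansible task):

---
# Illustrative reproducer, not the ceph-ansible source: any task delegated to
# groups[mon_group_name][0] fails with "'dict object' has no attribute 'mons'"
# when the inventory defines no mons group.
- hosts: clients
  gather_facts: false
  vars:
    mon_group_name: mons
  tasks:
    - name: delegate to the first monitor
      ping:
      delegate_to: "{{ groups[mon_group_name][0] }}"
---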

The deployment templates contain only the external Ceph client parameters, e.g.:
---
parameter_defaults:
 CephClientKey: AQDq8I1jTM/dOhAA5zGIsqhJAc18Adt6OeA7jQ==
 CephClusterFSID: ac628a51-ced3-4991-95dc-1e0f26a2a34f
 CephExternalMonHost: fd00:fd00:fd00:3000::61,fd00:fd00:fd00:3000::62,fd00:fd00:fd00:3000::63
---
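With only these client parameters the overcloud deploys no Ceph daemons, so the inventory generated for ceph-ansible typically contains just the clients group; that is why the mons/osds/mdss/... host patterns in the warnings above do not match anything. A simplified illustration of such an inventory (hypothetical, not the actual generated file):

---
# Illustrative client-only inventory: no mons, osds, mdss, ... groups are defined
clients:
  hosts:
    compute01: {}
---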

I will attach relevant files in a private comment.



Version-Release number of selected component (if applicable):
ceph-ansible-6.0.27.9-1.el8cp.noarch
ansible-2.9.27-1.el8ae.noarch
tripleo-ansible-0.8.1-2.20220406160113.2d0ab9a.el8ost.noarch

External Ceph version:

16.2.8-85.el8cp

How reproducible:
On the customer environment

Steps to Reproduce:
1.
2.
3.

Actual results:
The overcloud deployment fails during the ceph-ansible run.

Expected results:
Deployment succeeds

Additional info:

Comment 10 John Fulton 2022-12-08 12:07:53 UTC
(In reply to Luca Davidde from comment #0)
> Description of problem:
> Hello,
> after upgrading the customer's external Ceph cluster to 5.2 (16.2.8-85.el8cp)
> and running a test deployment (so without modifying anything), the deployment fails with:
> 
> ---
> "fatal: [compute01 -> {{ groups[mon_group_name][0] }}]: FAILED! => {\"msg\":
> \"'dict object' has no attribute 'mons'\"}",

 <snip>
> 2022-12-01 14:03:17.194443 | 52540064-3d0a-466a-5e53-000000007ba9 |    
> TIMING | tripleo-ceph-run-ansible : search output of ceph-ansible run(s)

 <snip> 

> Version-Release number of selected component (if applicable):
> ceph-ansible-6.0.27.9-1.el8cp.noarch
> ansible-2.9.27-1.el8ae.noarch
> tripleo-ansible-0.8.1-2.20220406160113.2d0ab9a.el8ost.noarch

Did the customer upgrade ceph-ansible on their OSP 16.2 undercloud to ceph-ansible-6.0.27.9-1.el8cp.noarch?

I think that's the problem. Please downgrade ceph-ansible on the undercloud to the latest version from the rhceph-4-tools-for-rhel-8-x86_64-rpms repository. As of right now that's:

  https://access.redhat.com/downloads/content/ceph-ansible/4.0.70.18-1.el8cp/noarch/fd431d51/package

Then re-run the stack update and update this BZ with the results. I'm setting needinfo since that's the information we need next to keep this bug moving.
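
For reference, here is a sketch of that downgrade expressed as a small Ansible play against the undercloud (illustrative only; it assumes the rhceph-4-tools-for-rhel-8-x86_64-rpms repository is already enabled and uses the package version from the link above):

---
# Illustrative sketch, not a supported procedure
- hosts: undercloud
  become: true
  tasks:
    - name: pin ceph-ansible to the RHCS 4 build
      dnf:
        name: ceph-ansible-4.0.70.18-1.el8cp
        state: present
        allow_downgrade: true
---

Manually, the same result can typically be achieved with a dnf downgrade of the ceph-ansible package on the undercloud host.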


Explanation:

tripleo-ansible has been tested with the version of ceph-ansible from RHCSv4 but not with RHCSv5. 

  ceph-ansible-6.0.27.9-1.el8cp.noarch comes from rhceph-5-tools-for-rhel-8-x86_64-rpms

A newer ceph-ansible may be used to manage the external Ceph cluster running RHCSv5 (before it is migrated to cephadm), but the undercloud's ceph-ansible only configures the Ceph clients and does not need to be upgraded.

