Bug 1555305 - [CEE/SD][ceph-ansible][RHCS2] take-over-existing-cluster.yml fails with 'ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous'
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 2.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 2.5
Assignee: Sébastien Han
QA Contact: Yogev Rabl
 
Reported: 2018-03-14 12:52 UTC by Tomas Petr
Modified: 2021-12-10 15:48 UTC
CC List: 13 users

Fixed In Version: RHEL: ceph-ansible-3.0.45-1.el7cp Ubuntu: ceph-ansible_3.0.45-2redhat1
Doc Type: Bug Fix
Doc Text:
"ceph-defaults" now contains all Ceph default variables, making "var_files" obsolete. Additionally, having "roles/ceph-defaults/defaults/main.yml" and "group_vars/all.yml" will create a collision and override necessary variables.
Last Closed: 2018-09-05 19:39:32 UTC


Attachments
ansible-playbook -vvv take-over-existing-cluster.yml (74.92 KB, text/plain)
2018-07-10 12:25 UTC, Tomas Petr


Links
Github: ceph/ceph-ansible pull 3037 (closed), "[skip ci] take-over-existing-cluster: do not call var_files", last updated 2020-02-29 12:57:56 UTC
Red Hat Issue Tracker: RHCEPH-1557, last updated 2021-09-09 13:28:17 UTC
Red Hat Knowledge Base (Solution): 3378431, last updated 2018-03-14 12:56:05 UTC
Red Hat Product Errata: RHBA-2018:2651, last updated 2018-09-05 19:40:25 UTC

Description Tomas Petr 2018-03-14 12:52:30 UTC
Description of problem:
ansible-playbook take-over-existing-cluster.yml fails with 'ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous' when run against an RHCS 2 environment configured according to the Red Hat documentation:

# egrep -v "^#|^$" group_vars/all.yml
upgrade_ceph_packages: True
ceph_rhcs_version: 2
journal_size: 5120
ceph_repository_type: cdn
ceph_rhcs: true
ceph_rhcs_cdn_install: true
ceph_origin: distro
monitor_interface: eth0
public_network: "10.74.156.0/22"
cluster_network: "192.168.1.0/28"

[root@mgmt-0 ceph-ansible]# ansible-playbook -vvvvvv take-over-existing-cluster.yml
...
TASK [ceph-fetch-keys : set_fact bootstrap_rbd_keyring] ***********************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-fetch-keys/tasks/main.yml:17
Read vars_file 'roles/ceph-defaults/defaults/main.yml'
Read vars_file 'group_vars/all.yml'
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous

Read vars_file 'roles/ceph-defaults/defaults/main.yml'
Read vars_file 'group_vars/all.yml'
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous

fatal: [10.74.157.20]: FAILED! => {
    "failed": true, 
    "msg": "The conditional check 'ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous' failed. The error was: error while evaluating conditional (ceph_release_num.{{ ceph_release }} >= ceph_release_num.luminous): 'dict object' has no attribute 'dummy'\n\nThe error appears to have been in '/usr/share/ceph-ansible/roles/ceph-fetch-keys/tasks/main.yml': line 17, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: set_fact bootstrap_rbd_keyring\n  ^ here\n"
}
....
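Two separate things are visible in this output. The [WARNING] is a style issue: a 'when:' expression is already evaluated as Jinja2, so templating delimiters inside it should be avoided, the usual idiom being to index the dict directly. A generic sketch of that idiom (not necessarily the change that shipped):

- name: example conditional without templating delimiters
  debug:
    msg: "release is luminous or newer"
  when: ceph_release_num[ceph_release] >= ceph_release_num['luminous']

The fatal error itself, however, comes from ceph_release resolving to the placeholder "dummy", which is not a key in ceph_release_num, so the idiom alone would not avoid it (see comment 5 below).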


-------------------------
This can be worked around by adding the following line to all.yml, just to satisfy the condition:
ceph_stable_release: jewel

A more appropriate fix would be to use the already existing parameter "ceph_rhcs_version: 2" and translate it to "ceph_stable_release: jewel".
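A hypothetical sketch of that suggestion (this mapping task is illustrative only, not something that ships in ceph-ansible):

- name: derive ceph_stable_release from ceph_rhcs_version (illustrative)
  set_fact:
    ceph_stable_release: "{{ {'2': 'jewel', '3': 'luminous'}[ceph_rhcs_version | string] }}"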


Version-Release number of selected component (if applicable):
ceph-ansible-3.0.25-1.el7cp.noarch

How reproducible:
always

Steps to Reproduce:
1. deploy an RHCS 2 environment following the guide
2. run take-over-existing-cluster.yml
3. the run ends with the failure above

4. add "ceph_stable_release: jewel" to all.yml, re-run take-over-existing-cluster.yml
5. success

Actual results:


Expected results:


Additional info:

Comment 4 Sébastien Han 2018-07-02 11:53:33 UTC
It's strange; was ceph-defaults played? It's supposed to populate ceph_stable_release for us, so there is nothing to do.
Thanks.
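For reference, that population happens in roles/ceph-defaults/tasks/facts.yml; the task that shows up in comment 5 below is, roughly (abridged paraphrase):

- name: set_fact ceph_release ceph_stable_release
  set_fact:
    ceph_release: "{{ ceph_stable_release }}"

So ceph_release can only be as good as whatever ceph_stable_release holds at that point in the play.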

Comment 5 Tomas Petr 2018-07-10 12:23:07 UTC
(In reply to leseb from comment #4)
> It's strange; was ceph-defaults played? It's supposed to populate
> ceph_stable_release for us, so there is nothing to do.
> Thanks.

Hi Seb,
yes, ceph-defaults was played, but it did not populate ceph_stable_release:

.......
TASK [ceph-defaults : set_fact ceph_release ceph_stable_release] **************************************************************************************************************************************************
task path: /usr/share/ceph-ansible/roles/ceph-defaults/tasks/facts.yml:71
ok: [10.74.159.233] => {
    "ansible_facts": {
        "ceph_release": "dummy"
    }, 
    "changed": false, 
    "failed": false
}
Read vars_file 'roles/ceph-defaults/defaults/main.yml'
Read vars_file 'group_vars/all.yml'
Read vars_file 'roles/ceph-defaults/defaults/main.yml'
Read vars_file 'group_vars/all.yml'


[root@mgmt-0 ceph-ansible]# cat group_vars/all.yml
fsid: 145aaef6-be3e-4539-a7b8-33be7e4f9a3b
journal_size: 5120
ceph_rhcs: true
ceph_rhcs_cdn_install: true
generate_fsid: false
upgrade_ceph_packages: True
ceph_rhcs_version: 2  #<----- 
ceph_origin: distro
monitor_interface: eth0
public_network: "10.74.156.0/22"
cluster_network: "192.168.1.0/28"
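For context, the defaults file the play keeps re-reading (the repeated "Read vars_file 'roles/ceph-defaults/defaults/main.yml'" lines) ships a placeholder release, roughly (abridged):

# roles/ceph-defaults/defaults/main.yml (abridged)
ceph_stable_release: dummy

Loaded through vars_files, this placeholder takes play-var precedence, and since group_vars/all.yml here never sets ceph_stable_release, the copy in facts.yml lands on "dummy", which has no entry in ceph_release_num. That is the collision described in the Doc Text and removed by the fix.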

Comment 6 Tomas Petr 2018-07-10 12:25:14 UTC
Created attachment 1457808 [details]
ansible-playbook -vvv take-over-existing-cluster.yml

Comment 10 Sébastien Han 2018-08-20 12:43:49 UTC
Sorry for being so late on this one; I just sent a patch to fix this.
Thanks for your patience.

Comment 12 Sébastien Han 2018-08-20 14:56:07 UTC
We are; I'm going to tag soon, today. :)

Comment 14 Ken Dreyer (Red Hat) 2018-08-21 18:47:30 UTC
This BZ is targeted to RH Ceph Storage 2, and we have no plans to ship ceph-ansible 3.1 there, so we'll need this in a stable-3.0 release upstream.

Comment 16 tserlin 2018-08-21 19:19:58 UTC
(In reply to Ken Dreyer (Red Hat) from comment #14)
> This BZ is targeted to RH Ceph Storage 2, and we have no plans to ship
> ceph-ansible 3.1 there, so we'll need this in a stable-3.0 release upstream.

The "take-over-existing-cluster: do not call var_files" commit is actually already in 3.0.44:

https://github.com/ceph/ceph-ansible/commits/v3.0.44

Thomas

Comment 22 errata-xmlrpc 2018-09-05 19:39:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2651

