Description of problem:
take-over-existing-cluster.yml fails due to a bug introduced in https://github.com/ceph/ceph-ansible/commit/4639d89231dc35c743ded29ed0f962f36a4b0574. This task now basically stats the string of the cluster name and so the next task fails due to missing dict values: https://github.com/ceph/ceph-ansible/blob/4639d89231dc35c743ded29ed0f962f36a4b0574/infrastructure-playbooks/take-over-existing-cluster.yml#L35-L38

Version-Release number of selected component (if applicable):
ceph-ansible-2.2.11-1

How reproducible:
Every time

Steps to Reproduce:
1. Install ceph-ansible-2.2.11-1.el7scon.noarch
2. Follow downstream docs to take over an existing cluster

Actual results:
Playbook fails

Expected results:
Successful new ceph.conf creation

Additional info:
We also *really* should be making backups of the original ceph.conf IMO. I'm working on this in a branch on upstream ceph-ansible.
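
To illustrate the failure mode described above, here is a minimal Python sketch. It is not the playbook's code: the helper names (stat_like, backup_conf, demo) and the ".bak" suffix are hypothetical. It assumes a stat result that carries only an "exists" flag when the path is missing, so stat'ing the bare cluster name instead of the conf file leaves the ownership keys absent for the next step. It also shows one possible way to back up the original ceph.conf before regenerating it.

```python
import os
import shutil
import tempfile


def stat_like(path):
    """Return a dict shaped like a stat result (hypothetical helper).

    When the path does not exist, only 'exists' is present, so any later
    lookup of ownership keys fails.
    """
    if not os.path.exists(path):
        return {"exists": False}
    st = os.stat(path)
    return {"exists": True, "uid": st.st_uid, "gid": st.st_gid}


def backup_conf(conf_path):
    """Copy the conf file next to itself with a '.bak' suffix (assumed naming)."""
    backup_path = conf_path + ".bak"
    shutil.copy2(conf_path, backup_path)
    return backup_path


def demo():
    cluster = "ceph"
    old_cwd = os.getcwd()
    conf_dir = tempfile.mkdtemp()
    try:
        # Work inside an empty scratch directory so a relative path named
        # after the cluster is guaranteed not to exist.
        os.chdir(conf_dir)
        conf_path = os.path.join(conf_dir, cluster + ".conf")
        with open(conf_path, "w") as fh:
            fh.write("[global]\n")

        wrong = stat_like(cluster)    # stats the cluster name string itself
        right = stat_like(conf_path)  # stats the actual conf file
        backup = backup_conf(conf_path)
        return wrong, right, os.path.basename(backup)
    finally:
        os.chdir(old_cwd)
        shutil.rmtree(conf_dir)


if __name__ == "__main__":
    wrong, right, backup_name = demo()
    print("stat of cluster name:", wrong)
    print("stat of conf file has uid:", "uid" in right)
    print("backup written as:", backup_name)
```

In the sketch, reading wrong["uid"] would raise KeyError, which corresponds to the "missing dict values" failure in the follow-up task.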
https://github.com/ceph/ceph-ansible/pull/1613 should fix this.
Thomas, would you please tell us how you are tracking work to be released in the next 2.x async release? This is a candidate for that.
Running take-over-existing-cluster is valid in 2.x because we changed from ceph-deploy (in 1.3) to ceph-ansible in 2.0 and had to provide a way for customers to bring an existing cluster under ceph-ansible control for management tasks such as adding or removing OSDs. In 3.0, the assumption is that a cluster is either newly installed or upgraded from 2.x. In both cases, the take-over use case does not apply in 3.0.