Description of problem:

An ansible-playbook run with the "--limit" flag for any type of node except mons will (re)create /etc/ceph/ceph.conf with incomplete content.

/etc/ceph/ceph.conf after deployment with command "ansible-playbook site.yml":
-----------------
[root@rhscaosd5 ~]# cat /etc/ceph/ceph.conf
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[global]
fsid = a93b8b8c-a7fe-4103-8434-6a490f641a66
max open files = 131072
mon initial members = rhscamon1,rhscamon2,rhscamon3
mon host = 192.168.66.157,192.168.66.251,192.168.66.149
public network = 192.168.66.0/24
cluster network = 192.168.66.0/24

[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and allowed by SELinux or AppArmor

[osd]
osd mkfs type = xfs
osd mkfs options xfs = -f -i size=2048
osd mount options xfs = noatime,largeio,inode64,swalloc
osd journal size = 1024
-----------------

/etc/ceph/ceph.conf after re-run with command "ansible-playbook site.yml --limit osds", where node rhscaosd5 is an OSD node listed in /etc/ansible/hosts under tag [osds]:
-----------------
[root@rhscaosd5 ~]# cat /etc/ceph/ceph.conf
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[global]
fsid = a93b8b8c-a7fe-4103-8434-6a490f641a66
max open files = 131072
mon initial members = ,,
mon host = ,,
public network = 192.168.66.0/24
cluster network = 192.168.66.0/24

[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and allowed by SELinux or AppArmor

[osd]
osd mkfs type = xfs
osd mkfs options xfs = -f -i size=2048
osd mount options xfs = noatime,largeio,inode64,swalloc
osd journal size = 1024
-----------------

Version-Release number of selected component (if applicable):
ceph-ansible-2.2.11-1.el7scon.noarch
ansible-2.2.3.0-1.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy a ceph cluster with ceph-ansible: "ansible-playbook site.yml"
2. Re-run the deployment with the "--limit" flag and an option other than "mons", e.g. "ansible-playbook site.yml --limit osds"
3. Observe /etc/ceph/ceph.conf on the nodes from the group used in step 2.

Actual results:
/etc/ceph/ceph.conf is (re)created with missing values:
mon initial members = ,,
mon host = ,,

Expected results:
/etc/ceph/ceph.conf has correct values

Additional info:
Unfortunately, this is expected. We cannot use '--limit' since ceph.conf needs to know the monitors in order to be built correctly. I'm closing this as WONTFIX. This behaviour cannot be changed; it is an Ansible limitation. Thanks for your understanding.
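For context, the empty "mon initial members = ,," and "mon host = ,," lines come from the way the ceph.conf template is rendered. The following is an illustrative sketch of the pattern only, not the verbatim ceph.conf.j2 from ceph-ansible: the template loops over the [mons] inventory group and joins values taken from each mon's gathered facts, so when "--limit osds" skips fact gathering on the mons, every lookup renders empty and only the separating commas survive.

```
{# Illustrative sketch only -- not the verbatim ceph-ansible template. #}
[global]
mon initial members = {% for host in groups['mons'] %}{{ hostvars[host]['ansible_hostname'] | default('') }}{% if not loop.last %},{% endif %}{% endfor %}
mon host = {% for host in groups['mons'] %}{{ hostvars[host]['ansible_default_ipv4']['address'] | default('') }}{% if not loop.last %},{% endif %}{% endfor %}
```

With a full run, hostvars for each mon contain the gathered ansible_* facts and the joined lists come out complete; under "--limit osds" those facts are absent and the result is ",,".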
My bad, I wasn't aware of this feature, let me have a look into this. Thanks!
See: https://github.com/ceph/ceph-ansible/pull/1801
If they can try it out, it'd be even better :)
(In reply to seb from comment #7)
> If they can try it out, it'd be even better :)

I will give it a try, but there are a lot of changes between the downstream version and upstream, so a simple copy&paste of the site.yml.sample from https://github.com/ceph/ceph-ansible/pull/1801 fails right away.
Even if there are a lot of changes, that's not a problem. What's the failure?
(In reply to seb from comment #9)
> Even if there are a lot of changes, that's not a problem. What's the failure?

This is a run with site.yml.sample + ceph-ansible-2.2.11-1.el7scon.noarch, ansible-2.2.3.0-1.el7.noarch:

# ansible-playbook site.BZ1482067.yml --limit osds
ERROR! the role 'ceph-defaults' was not found in /usr/share/ceph-ansible/roles:/usr/share/ceph-ansible/roles:/usr/share/ceph-ansible

The error appears to have been in '/usr/share/ceph-ansible/site.BZ1482067.yml': line 58, column 7, but may be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  roles:
    - ceph-defaults
      ^ here

There is no such role in /usr/share/ceph-ansible/roles/.

-----------------------------------------------

If I create a Frankenstein's monster and just add/remove the lines in the commit to the site.yml.sample delivered with the downstream version, it fails with:

$ ansible-playbook site.yml.sample --limit osds
ERROR! Syntax Error while loading YAML.

The error appears to have been in '/usr/share/ceph-ansible/site.yml.sample': line 32, column 6, but may be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

 - name: install python2 for Fedora
     ^ here
-------

I haven't tried that further.
Right, ceph-defaults doesn't exist (yet) for you, so yes, remove all of its occurrences.

About the error at "- name: install python2 for Fedora": could this be an indentation issue? Can you try to debug further?

Thanks!
(In reply to seb from comment #11)
> Right, ceph-defaults doesn't exist (yet) for you, so yes, remove all of its
> occurrences.
> About the error at "- name: install python2 for Fedora": could this be an
> indentation issue? Can you try to debug further?
>
> Thanks!

Ok, that was a mistake on my side, I had messed up the site.yml content.

------

Anyway, a new try with just these changes:

[root@rhsca ceph-ansible]# diff site.yml site.yml.origin
35c35
<     - name: gather and delegate facts
---
>     - name: gathering facts
37,39d36
<       delegate_to: "{{ item }}"
<       delegate_facts: True
<       with_items: "{{ groups['all'] }}"
42,44c39
<       when:
<         - ansible_distribution == 'Fedora'
<         - ansible_distribution_major_version|int >= 23
---
>       when: ansible_distribution == 'Fedora' and ansible_distribution_major_version|int >= 23

....

TASK [install required packages for Fedora > 23] *******************************
task path: /usr/share/ceph-ansible/site.yml:40
fatal: [rhscaosd5]: FAILED! => {
    "failed": true,
    "msg": "The conditional check 'ansible_distribution == 'Fedora'' failed. The error was: error while evaluating conditional (ansible_distribution == 'Fedora'): 'ansible_distribution' is undefined\n\nThe error appears to have been in '/usr/share/ceph-ansible/site.yml': line 40, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n    with_items: \"{{ groups['all'] }}\"\n  - name: install required packages for Fedora > 23\n    ^ here\n"
}
Created attachment 1318073 [details] ansible-playbook site.yml -vvvvv | tee /tmp/site.yml.1482067
> ....
>
> TASK [install required packages for Fedora > 23]
> *******************************
> task path: /usr/share/ceph-ansible/site.yml:40
> fatal: [rhscaosd5]: FAILED! => {
>     "failed": true,
>     "msg": "The conditional check 'ansible_distribution == 'Fedora'' failed.
> The error was: error while evaluating conditional (ansible_distribution ==
> 'Fedora'): 'ansible_distribution' is undefined\n\nThe error appears to have
> been in '/usr/share/ceph-ansible/site.yml': line 40, column 7, but may\nbe
> elsewhere in the file depending on the exact syntax problem.\n\nThe
> offending line appears to be:\n\n    with_items: \"{{ groups['all'] }}\"\n
> - name: install required packages for Fedora > 23\n    ^ here\n"
> }

The changes added to site.yml.sample for the facts delegation now require that we use ansible >= 2.3. I believe what you're seeing above is a bug in ansible 2.2 and it should go away when using ansible 2.3.

Thanks,
Andrew
(In reply to Andrew Schoen from comment #14)
> > ....
> The changes added to site.yml.sample for the facts delegation now require
> that we use ansible >= 2.3. I believe what you're seeing above is a bug in
> ansible 2.2 and it should go away when using ansible 2.3.
>
> Thanks,
> Andrew

Hi Andrew,

You are right, thank you for pointing that out. So I have updated ansible and given it another try:

# rpm -qa | grep ansible
ceph-ansible-2.2.11-1.el7scon.noarch
ansible-2.3.1.0-3.el7.noarch

# diff site.yml.BZ1482067 site.yml.origin
35c35
<     - name: gather and delegate facts
---
>     - name: gathering facts
37,39d36
<       delegate_to: "{{ item }}"
<       delegate_facts: True
<       with_items: "{{ groups['all'] }}"

# ansible-playbook site.yml.BZ1482067 --limit osds

And it worked well, the ceph.conf now has correct values:

[root@rhscaosd5 ~]# cat /etc/ceph/ceph.conf
# Please do not change this file directly since it is managed by Ansible and will be overwritten
[global]
fsid = a93b8b8c-a7fe-4103-8434-6a490f641a66
max open files = 131072
mon initial members = rhscamon1,rhscamon2,rhscamon3
mon host = 192.168.66.157,192.168.66.251,192.168.66.149
public network = 192.168.66.0/24
cluster network = 192.168.66.0/24

[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and allowed by SELinux or AppArmor

[osd]
osd mkfs type = xfs
osd mkfs options xfs = -f -i size=2048
osd mount options xfs = noatime,largeio,inode64,swalloc
osd journal size = 1024
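For reference, the three delegated-facts lines from the diff fit into a play like the following sketch, reconstructed from the diff rather than taken verbatim from the upstream site.yml.sample. It runs only on the hosts that survive "--limit", but each of them runs the setup module on behalf of every inventory host, so facts for the mons are populated even when the mons themselves are excluded by "--limit". Note that delegate_facts requires Ansible >= 2.3, which is why the run with ansible 2.2 above failed.

```
# Sketch of the fact-delegation play, reconstructed from the diff
# (not the verbatim upstream site.yml.sample).
- hosts: all
  gather_facts: false
  tasks:
    - name: gather and delegate facts
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      with_items: "{{ groups['all'] }}"
```

With delegate_facts: True, the facts gathered by each delegated setup call are stored under the delegated host's entry in hostvars instead of the delegating host's, which is exactly what the ceph.conf template needs.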
(Weird issue: GitHub says https://github.com/ceph/ceph-ansible/pull/1801 was merged in 02d849d2371afee242a6913473805f5e7522c9ae, but I can't find that commit when I fetch today.)

git tag --contains 5bda515d7ca185e8feedc031624ff4b073caa728 says this has been fixed since 3.0.0rc4.
lgtm
The configuration file displays proper values with:
ansible-playbook site.yml --limit osds|clients|rgws

Moving this BZ to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387