Created attachment 1384266 [details]
File contains the ansible-playbook log and the contents of all.yml and the inventory file

Description of problem:
The playbook gets stuck for an indefinite time on the task that checks whether Ceph is already running. It appeared that the existing mon was trying to contact the other mons, which had not yet been configured.

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.18-1.el7cp.noarch

How reproducible:
Always (2/2)

Steps to Reproduce:
1. Configure ceph-ansible to initialize a Ceph cluster with more than one mon.
2. Run the playbook.

Actual results:
The playbook gets stuck at:
TASK [ceph-defaults : is ceph running already?]

Expected results:
The cluster is initialized.

Additional info:
Ansible appeared to be waiting for "sudo docker exec ceph-mon-magna029 ceph --cluster c1 fsid --connect-timeout 3" to finish, but running the same command manually hung in the same way. This may be an issue with "--connect-timeout 3": without that argument the command times out after 300 seconds, since the other mons were not yet configured.
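One way to keep such a probe from blocking a playbook forever is to bound it with an external watchdog. The sketch below is an assumption, not the actual ceph-ansible fix: it wraps a long-running command in GNU coreutils `timeout`, with `sleep 30` standing in for the real probe ("docker exec ceph-mon-magna029 ceph --cluster c1 fsid --connect-timeout 3").

```shell
# Sketch (assumption): guard a probe that may hang with an external timeout,
# so the "is ceph running already?" style check cannot block indefinitely.
# 'sleep 30' is a placeholder for the real 'docker exec ... ceph fsid' probe.
if timeout 3 sleep 30; then
    echo "cluster reachable"
else
    # GNU timeout exits 124 when it kills the command.
    echo "probe timed out; assuming cluster is not running"
fi
```

When the inner command hangs, `timeout` kills it after 3 seconds and the check falls through to the "not running" branch instead of waiting forever.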
@Guillaume/Sebastien, can you please let us know when can we expect the fix for this?
@Harish, the environment you provided to debug this issue has been reset. Since I can't reproduce the issue in my own environment, is there any chance you can restore your environment to the state it was in yesterday so I can reproduce this bug?
@Guillaume, we had provided the test system yesterday evening, but this morning we needed it to move ahead with other 2.5 testing. Please get in touch with Vasi for a system with the issue reproduced on it.
Hi Guillaume,

We have hit issue 1537003 on one more setup (we have emailed you the details) while trying to install an IPv6-based RHEL Ceph cluster. Please check the system and let us know if you need more info.

Regards,
Harish
Hi Harish, Thanks for the details, I'm currently taking a look at this.
One thing I noticed in your env is that you have set the monitor_address variable in all.yml like this:

monitor_address: 2620:52:0:880:225:90ff:fefc:2770

The result is that your ceph.conf contains this on all monitors:

mon host = [2620:52:0:880:225:90ff:fefc:2770],[2620:52:0:880:225:90ff:fefc:2770],[2620:52:0:880:225:90ff:fefc:2770],[2620:52:0:880:225:90ff:fefc:2770]

As mentioned here: https://github.com/ceph/ceph-ansible/blob/stable-3.0/group_vars/all.yml.sample#L313, monitor_address should be set per host in the inventory file, so that each monitor gets the address it will bind to. Could you try using monitor_address that way, or use monitor_interface in all.yml instead?
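For illustration, a minimal sketch of the per-host form in the inventory file; the hostnames and addresses below are hypothetical, not taken from this setup:

```ini
; Sketch (assumption): each mon gets its own monitor_address as a host variable,
; so ceph.conf ends up with a distinct address per monitor in 'mon host'.
[mons]
mon-host-1 monitor_address=2620:52:0:880::1
mon-host-2 monitor_address=2620:52:0:880::2
mon-host-3 monitor_address=2620:52:0:880::3
```

Setting monitor_address globally in all.yml applies the same value to every monitor, which is what produced the repeated address in mon host above.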
Hi Guillaume,

Thanks for pointing that out and correcting it. I've hit this issue in a container scenario - please check magna071 (admin and mon).

Please update us once we can use the setup.

Regards,
Vasishta
Hi Vasishta,

You can use the setup; I'm not using it anymore. The fix for this issue will be in v3.0.20.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0340