Description of problem:
Ceph cluster creation is failing with the latest ceph-ansible builds with one of the following errors:

OSD creation/configuration failure:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RUNNING HANDLER [ceph.ceph-common : restart ceph osds daemon(s)] ***************
fatal: [osd2.localdomain]: FAILED! => {
    "changed": true,
    "cmd": [
        "/tmp/restart_osd_daemon.sh"
    ],
    "delta": "0:20:10.020056",
    "end": "2017-04-20 22:26:30.616511",
    "failed": true,
    "rc": 1,
    "start": "2017-04-20 22:06:20.596455",
    "warnings": []
}

STDOUT:
Error with PGs, check config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

or monitor configuration failure:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2017-04-21 06:19:51.708796 7f7d8c243700  0 -- :/3634752360 >> IP:6789/0 pipe(0x7f7d8805ab00 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7d8805c730).fault
fatal: [mon1.localdomain]: FAILED! => {
    "changed": true,
    "cmd": [
        "/tmp/restart_mon_daemon.sh"
    ],
    "delta": "0:00:10.042757",
    "end": "2017-04-21 06:20:00.846509",
    "failed": true,
    "rc": 1,
    "start": "2017-04-21 06:19:50.803752",
    "warnings": []
}

STDOUT:
Error while restarting mon daemon

STDERR:
Job for ceph-mon failed. See "systemctl status ceph-mon" and "journalctl -xe" for details.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Version-Release number of selected component (if applicable):
ansible-2.2.2.0-1.el7.noarch
ceph-ansible-2.2.1-1.el7scon.noarch
ceph-installer-1.3.0-1.el7scon.noarch
rhscon-ceph-0.0.43-1.el7scon.x86_64
rhscon-core-0.0.45-1.el7scon.x86_64
rhscon-core-selinux-0.0.45-1.el7scon.noarch
rhscon-ui-0.0.60-1.el7scon.noarch

How reproducible:
It happened on every tested cluster, but not on each OSD/MON.

Steps to Reproduce:
1. Prepare and install machines for a Ceph cluster managed by RHSCON 2 (Skyring).
2. Create a Ceph cluster via the RHSCON Web UI.
3. Check the "Create Cluster" task.
4. Check the ceph-installer tasks:
   $ curl http://rhscon-server.example.com:8181/api/tasks/ | jq .
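For step 4, rather than eyeballing the full `jq .` dump, the failed ceph-installer tasks can be picked out with a short filter like the sketch below. This is a minimal illustration only: the field names ("identifier", "succeeded", "stdout") are assumptions about the shape of the /api/tasks/ response, so check them against the actual JSON your ceph-installer returns.

```python
import json

def failed_tasks(tasks):
    """Return only the tasks whose 'succeeded' flag is False.

    NOTE: the 'succeeded' field name is an assumed part of the
    ceph-installer task schema, not verified against the API docs.
    """
    return [t for t in tasks if t.get("succeeded") is False]

# Sample payload shaped after the errors quoted in this report
# (hypothetical identifiers, for illustration only).
sample = json.loads("""
[
  {"identifier": "task-1", "succeeded": true,  "stdout": ""},
  {"identifier": "task-2", "succeeded": false,
   "stdout": "Error with PGs, check config"}
]
""")

for task in failed_tasks(sample):
    print(task["identifier"], "->", task["stdout"])
```

In a live check you would feed `failed_tasks()` the decoded output of the curl command above instead of the sample list.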
Actual results:
The cluster creation task in RHSCON contains one of the following errors:

  Failed to add mon(s) [mon3.localdomain]

or

  OSD addition failed for [osd1.localdomain:map[/dev/vdc:/dev/vdb] osd2.localdomain:map[/dev/vdd:/dev/vdc]...

The related tasks in ceph-ansible contain the errors quoted above in the description.

Expected results:
The Ceph cluster is properly created with all expected monitors and OSDs.

Additional info:
Currently addressed upstream here: https://github.com/ceph/ceph-ansible/pull/1455

This will be merged today and you will get it in the next package build. Stay tuned.
Merged to master; now looking to get this backported to the stable-2.2 branch.
Waiting for CI to pass on stable-2.2 with the backport: https://github.com/ceph/ceph-ansible/pull/1467
We need v2.2.2 tagged upstream with this change.
Expect a new tag in the next hour.
Tested and VERIFIED by automatic test suite on:
  calamari-server-1.5.6-1.el7cp.x86_64

Cluster creation works as expected.

>> VERIFIED
(In reply to Daniel Horák from comment #8)
> Tested and VERIFIED by automatic test suite on:
>   calamari-server-1.5.6-1.el7cp.x86_64

The related package is of course:
  ceph-ansible-2.2.2-1.el7scon.noarch

> Cluster creation works as expected.
>
> >> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1496