Bug 1496985
| Summary: | When attempting to deploy Ceph with director, ceph-disk prepare fails one out of five times | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | John Fulton <johfulto> |
| Component: | rhosp-director | Assignee: | John Fulton <johfulto> |
| Status: | CLOSED DUPLICATE | QA Contact: | Yogev Rabl <yrabl> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 12.0 (Pike) | CC: | afazekas, dbecker, gfidente, jefbrown, jomurphy, kdreyer, martinsson.patrik, mburns, morazi, rhel-osp-director-maint, scohen |
| Target Milestone: | beta | Keywords: | Triaged |
| Target Release: | 12.0 (Pike) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-10-31 09:38:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1491780, 1496509, 1498439 | | |
| Bug Blocks: | | | |
Description
John Fulton
2017-09-28 21:26:31 UTC
This is caused by a race condition in ceph-disk, as tracked by the following:

https://bugzilla.redhat.com/show_bug.cgi?id=1491780
https://bugzilla.redhat.com/show_bug.cgi?id=1494543

The RPM providing ceph-disk will ship as a docker container, so new docker containers from Ceph will be necessary to address this issue. We are already tracking the inclusion of BZ 1496509 into OSP via BZ 1484447.

*** This bug has been marked as a duplicate of bug 1484447 ***

This 'fix' (https://github.com/ceph/ceph/pull/14329/files) is not included in the package provided for Red Hat Storage 3 (ceph-base-12.2.1-40.el7cp.x86_64), which makes it impossible to deploy in our 18-node cluster.

# On the "ansible director"
$ > rpm -q ceph-ansible
ceph-ansible-3.0.14-1.el7cp.noarch

# In one of the successfully deployed containers
$ > rpm -qf $(which ceph-disk)
ceph-base-12.2.1-40.el7cp.x86_64

When we use ceph-ansible to deploy in our environment, it always fails on 2-3 nodes. I'm not sure I understand from the above comments whether this is still a problem in Red Hat Storage 3, because from what I can see, the patch mentioned here isn't applied in the '/usr/lib/python2.7/site-packages/ceph_disk/main.py' from 'ceph-base-12.2.1-40.el7cp.x86_64'. The error message I get is exactly the same as the reported one, so I'm pretty sure this is the bug we are hitting.

Any input on this would be good.

Best regards,
Patrik Martinsson
Sweden
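For anyone trying to answer the same question, a minimal sketch of how to check whether a given upstream fix was backported into the installed package follows. It assumes the backport, if present, would be referenced in the RPM changelog by its BZ or PR number (the numbers below are taken from the comments above); if the changelog is silent, inspecting the installed source file directly is the fallback.

# Search the ceph-base changelog for a reference to the fix
# (assumption: a backport would cite BZ 1491780 or PR 14329):
$ rpm -q --changelog ceph-base | grep -iE '1491780|14329'

# Locate and inspect the installed ceph-disk source the patch modifies:
$ rpm -ql ceph-base | grep 'ceph_disk/main.py'
$ less /usr/lib/python2.7/site-packages/ceph_disk/main.py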