Bug 1438590
Summary: | [RFE] ceph-ansible should allow discovery of Ceph OSD devices by rule | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Ben England <bengland> |
Component: | Ceph-Ansible | Assignee: | Sébastien Han <shan> |
Status: | CLOSED WONTFIX | QA Contact: | ceph-qe-bugs <ceph-qe-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.0 | CC: | abond, adeza, anharris, aschoen, ceph-eng-bugs, flucifre, gabrioux, gfidente, gmeno, jefbrown, jjoyce, johfulto, jschluet, jtaleric, nthomas, rsussman, sankarshan, slinaber, tnielsen, tvignaud, twilkins, vashastr |
Target Milestone: | rc | Keywords: | FutureFeature |
Target Release: | 4.* | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2019-07-18 16:03:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1644611 | ||
Bug Blocks: | 1438572, 1624388 |
Description
Ben England
2017-04-03 20:57:05 UTC
I filed a related problem report requesting that introspection capture the /dev/disk/by-path name for each block device: https://bugs.launchpad.net/ironic/+bug/1679726

Device names can also be expressed as /dev/disk/by-id/wwn-<wwid> softlinks. The WWN is in introspection data today, but it is different for every block device, so OSD device names cannot be specified in a node-independent way. Use of /dev/disk/by-path or other softlinks depends on Joe Talerico's fix (now upstream) to https://bugs.launchpad.net/puppet-ceph/+bug/1677605

One alternative to this solution is that programs could be written to transform introspection data into deployment YAML files. The program then becomes the rule-based solution, instead of YAML syntax. The problem with this approach is that it is brittle: the YAML output by such a program will rapidly become out of date and will require re-introspection and YAML regeneration, whereas a rule-based approach can cover a variety of situations without requiring changes to the YAML to avoid deploy failures (for example, if disks are added or removed).

+1 on the idea of selecting OSD data disks using regexp/globbing, thanks!

Here is an ugly bash prototype script that takes introspection data and turns it into a .csv table that specifies node-uuid,device-name,device-wwn. It does capture *all* eligible OSD drives, and in this respect it is better than what we have now, which is to assume that every OSD host has the same number of drives. This output data should be sufficient to generate YAML that is used as input to a deployment. It finds all the disks reported by introspection and then filters out the system disk and disks that do not have the right size; all of this is available from introspection data. It does not require /dev/disk/by-path names. It outputs the wwid identifier for eligible drives, so that we can use /dev/disk/by-id names for the disks, which persist across reboots, avoiding device name instability problems.

http://perf1.perf.lab.eng.bos.redhat.com/bengland/public/openstack/introspect/generate-yaml.sh

I had to comment out the parts of it that used openstack commands to make it work with saved introspection data (I had no live system at that point) obtained from prior "openstack baremetal introspection data save" commands. This bash script relies heavily on the "jq" JSON-parsing utility. A native python implementation would be much cleaner.

    -- output log --
    [ben@bene-laptop introspect]$ INTROSPECT_DIR=logs bash generate-yaml.sh | tee generate-yaml.log
    looking for device names of this form: sd*[a-z]
    looking for devices with this size (GB): 1999
    rejected disk sda because it is the system disk
    0 OSD drives found in node 9d8526d4-84f9-4068-a1d1-a073bb9783c6
    rejected disk sda because it is the system disk
    0 OSD drives found in node ee4e17cb-1c5b-41cf-add2-5cc58fdb038f
    ...
    rejected disk sda because it is the system disk
    rejected disk sdal because of size 500
    36 OSD drives found in node 21e56a0a-d403-426e-aef9-a6c210dbb9c4
    rejected disk sdak because it is the system disk
    rejected disk sdal because of size 500
    36 OSD drives found in node beb01552-af9e-4781-97f7-3c51af7286fc
    ...
    rejected disk sdak because it is the system disk
    rejected disk sdal because of size 500
    36 OSD drives found in node ec4caa56-2786-422b-9664-4fb77ec7e474
    ---
    972 OSD drives stored in logs/osd_drives.csv
    ---
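For illustration only (this is neither the generate-yaml.sh script above nor the discovery tool linked in the next comment), here is a minimal Python sketch of the filtering rules just described. It assumes saved "openstack baremetal introspection data save" JSON files named <node-uuid>.json and the usual ironic-inspector layout (a top-level "root_disk" plus "inventory"/"disks" entries carrying "name", "size", and "wwn"); the file naming, size threshold, and field names are assumptions, not verified against this BZ.

```python
#!/usr/bin/env python3
# Illustrative sketch, not the linked generate-yaml.sh or openstack-osd-discovery tool.
# Filter saved introspection data the way the prototype describes: drop the
# root/system disk, drop disks of the wrong size, and emit node-uuid,device,wwn rows.
# Assumes one <node-uuid>.json file per node in INTROSPECT_DIR with the usual
# ironic-inspector layout (top-level "root_disk" and "inventory"/"disks").
import glob
import json
import os
import sys

INTROSPECT_DIR = os.environ.get("INTROSPECT_DIR", "logs")
OSD_SIZE_GB = 1999  # only disks of exactly this size are eligible OSD devices

total = 0
for path in glob.glob(os.path.join(INTROSPECT_DIR, "*.json")):
    node_uuid = os.path.splitext(os.path.basename(path))[0]  # assumes <node-uuid>.json
    with open(path) as f:
        data = json.load(f)
    root_disk = data.get("root_disk", {}).get("name")
    found = 0
    for disk in data.get("inventory", {}).get("disks", []):
        name = disk.get("name")                 # e.g. /dev/sdb
        size_gb = disk.get("size", 0) // 10**9  # introspection reports bytes
        wwn = disk.get("wwn")
        if name == root_disk:
            print("rejected disk %s because it is the system disk" % name, file=sys.stderr)
            continue
        if size_gb != OSD_SIZE_GB:
            print("rejected disk %s because of size %d" % (name, size_gb), file=sys.stderr)
            continue
        # wwn-<wwid> under /dev/disk/by-id persists across reboots, unlike sd* names
        print("%s,%s,wwn-%s" % (node_uuid, name, wwn))
        found += 1
    print("%d OSD drives found in node %s" % (found, node_uuid), file=sys.stderr)
    total += found
print("%d OSD drives total" % total, file=sys.stderr)
```

Run with INTROSPECT_DIR pointing at the directory of saved introspection JSON and redirect stdout to build the osd_drives.csv table.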
This new python script supersedes the ugly shell hack above:
https://github.com/bengland2/openstack-osd-discovery

Since ceph-ansible is now being used to deploy Ceph with OpenStack, would the implementation of this be in ceph-ansible or OOO? Anyway, ironic has done the introspection; we should be able to determine what devices to use from that, and ceph-ansible should be able to deploy OSDs on those devices if given the appropriate inputs in the inventory file.

- openstack RFE 1438572 depends on ceph RFE 1438590
- ceph RFE 1438590 blocks openstack RFE 1438572

Registered blueprint https://blueprints.launchpad.net/tripleo/+spec/osds-by-rule

Correct, Ben. The issue was already attached to this BZ btw :).

Untestable in the 3.2 timeframe, so targeting z1 with ceph-volume changes coming.

*** Bug 1486537 has been marked as a duplicate of this bug. ***

This is blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1644611
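The thread above notes that ceph-ansible should be able to deploy OSDs on the discovered devices if given the appropriate inputs in the inventory. As a hedged sketch of that last step (not anything attached to this BZ), the following turns node-uuid,device,wwn rows from a discovery pass into per-host host_vars files carrying a devices: list of persistent /dev/disk/by-id names; the CSV path, the host_vars layout, and the UUID-to-hostname mapping are assumptions for this example.

```python
#!/usr/bin/env python3
# Illustrative sketch: convert node-uuid,device,wwn rows (as produced by a
# discovery step like the one above) into per-host host_vars files with a
# "devices:" list of persistent /dev/disk/by-id names.
# The CSV path, the host_vars directory, and the uuid-to-hostname mapping are
# assumptions for this example, not part of the BZ.
import csv
import os
from collections import defaultdict

OSD_CSV = "logs/osd_drives.csv"  # assumed output of the discovery step
HOST_VARS_DIR = "host_vars"      # assumed ansible inventory layout

def hostname_for(node_uuid):
    # Placeholder: a real deployment would map the ironic node UUID to the
    # inventory hostname (e.g. by querying the baremetal service).
    return node_uuid

devices = defaultdict(list)
with open(OSD_CSV) as f:
    for node_uuid, dev, wwn in csv.reader(f):
        devices[hostname_for(node_uuid)].append("/dev/disk/by-id/" + wwn)

os.makedirs(HOST_VARS_DIR, exist_ok=True)
for host, devs in sorted(devices.items()):
    with open(os.path.join(HOST_VARS_DIR, host + ".yml"), "w") as f:
        f.write("---\ndevices:\n")
        for dev in sorted(devs):
            f.write("  - %s\n" % dev)
```

Pointing the device list at /dev/disk/by-id/wwn-* names keeps the OSD device mapping stable across reboots even if the kernel re-enumerates the sd* names.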