Bug 1613933

Summary:

[Docs] [Ceph] The Ceph Guide for OpenStack should have a NodeDataLookup OSD list override example

Product:

Red Hat OpenStack

Reporter:

John Fulton <johfulto>

Component:

documentation

Assignee:

Kim Nylander <knylande>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Laura Marsh <lmarsh>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

13.0 (Queens)

CC:

knylande, srevivo, yrabl

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2018-11-13 23:23:05 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
Heat Environment File which uses NodeDataLookup for Ceph deployment	none
Updated Heat Environment File which uses NodeDataLookup for Ceph deployment	none

Description John Fulton 2018-08-08 15:05:27 UTC

The Deploying an Overcloud with Containerized Red Hat Ceph document [1], section 5.1, covers Mapping the Ceph Storage Node Disk Layout. This section is good, but an additional section should be added that covers scenarios in which a particular node has a disk missing. TripleO supports this feature and it is documented upstream [2] but not downstream.

This bug asks that a new section be added to this document, called something like "Mapping the Disk Layout to Non-Homogeneous Ceph Storage Nodes", which explains what to do in this scenario and provides an example of how to do it.

I will update this bugzilla with content to help the above be written.


[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/deploying_an_overcloud_with_containerized_red_hat_ceph/

[2] https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/node_specific_hieradata.html

Comment 1 John Fulton 2018-08-08 15:07:43 UTC

The closest example we have to this already in our documentation pertains to Nova. We need an example for Ceph too. 

 https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/advanced_overcloud_customization/#sect-Customizing_Hieradata_for_Individual_Nodes

Comment 2 John Fulton 2018-08-08 15:14:07 UTC

Created attachment 1474375 [details]
Heat Environment File which uses NodeDataLookup for Ceph deployment

This attachment includes a Heat environment file I used in the scale lab to deal with a server which has one disk missing. 

All of the servers had a devices list with 35 disks, except for one server that had a disk missing. This environment file overrides the default devices list for only that single node and gives it the list of 34 disks it should use instead of the global list.

Comment 3 John Fulton 2018-08-08 15:16:57 UTC

Created attachment 1474376 [details]
Updated Heat Environment File which uses NodeDataLookup for Ceph deployment

Updating attachment as I accidentally attached an old version which was missing the dedicated_devices list.

Comment 4 John Fulton 2018-08-08 15:45:27 UTC

Proposed content:

By default, all nodes of a role that hosts Ceph OSDs (indicated by the OS::TripleO::Services::CephOSD service in roles_data.yaml), for example CephStorage or ComputeHCI nodes, use the global devices list and dedicated_devices list set in section 5.1, "Mapping the Ceph Storage Node Disk Layout". This assumes that all of these servers have homogeneous hardware. If a subset of them does not, it is possible to tell director that each of those individual servers should use a different devices and dedicated_devices list. This is known as a "node-specific disk configuration".

To pass director a node-specific disk configuration, a Heat environment file, e.g. node-spec-overrides.yaml, must be passed to the `openstack overcloud deploy` command. The file's content must identify each server by its machine unique UUID and provide a list of local variables that override the global variables.

The machine unique UUID may be extracted for each individual server by running `dmidecode -s system-uuid` on that server, or it may be extracted from the Ironic database by running `openstack baremetal introspection data save NODE-ID | jq .extra.system.product.uuid` on the undercloud.
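
For anyone scripting this step, the same lookup can be sketched in Python instead of `jq`. This is only an illustration: it assumes the introspection data has already been saved as JSON, the function name `machine_uuid` is hypothetical, and the sample data is a heavily trimmed stand-in rather than real introspection output.

```python
import json


def machine_uuid(introspection_data: dict) -> str:
    """Return the machine unique UUID from saved introspection data.

    Follows the same path as the jq filter `.extra.system.product.uuid`.
    """
    try:
        return introspection_data["extra"]["system"]["product"]["uuid"]
    except KeyError as exc:
        # The key is absent when the extra hardware data was not collected.
        raise KeyError("extra.system.product.uuid not found") from exc


# Hypothetical, trimmed sample; real introspection data has many more keys.
sample = json.loads(
    '{"extra": {"system": {"product":'
    ' {"uuid": "32E87B4C-C4A7-418E-865B-191684A6883B"}}}}'
)
print(machine_uuid(sample))
```

In practice the dictionary would come from `json.load()` on the output of the `openstack baremetal introspection data save` command shown above.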

Warning: If undercloud.conf does not set inspection_extras = true before the undercloud is installed or upgraded and before introspection is run, then the machine unique UUID will not be in the Ironic database.

Warning: The machine unique UUID is not the Ironic UUID.

A valid node-spec-overrides.yaml file may look like the following:

parameter_defaults:
  NodeDataLookup: |
    {"32E87B4C-C4A7-418E-865B-191684A6883B": {"devices": ["/dev/sdc"]}}
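
When several nodes need overrides, it may be less error-prone to generate the file than to edit the JSON by hand. The following is a minimal sketch under stated assumptions: the helper name `render_node_overrides` is hypothetical, and the two-space YAML nesting is an assumption about the intended layout of the file.

```python
import json


def render_node_overrides(overrides: dict) -> str:
    """Render a node-specific override mapping as an environment file.

    `overrides` maps each machine unique UUID to the variables that
    replace the global ones for that node.
    """
    body = json.dumps(overrides)  # a single line of valid JSON
    # Assumed layout: two YAML header lines, then the JSON as a block scalar.
    return "parameter_defaults:\n  NodeDataLookup: |\n    " + body + "\n"


overrides = {
    "32E87B4C-C4A7-418E-865B-191684A6883B": {"devices": ["/dev/sdc"]},
}
text = render_node_overrides(overrides)
print(text, end="")
```

Writing `text` to node-spec-overrides.yaml would give a file shaped like the example above; because the JSON comes from `json.dumps`, it is valid by construction.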

All lines after the first two must be valid JSON. An easy way to verify that the JSON is valid is to use the `jq` command. For example, temporarily remove the first two lines ("parameter_defaults:" and "NodeDataLookup: |") from the file and run `cat node-spec-overrides.yaml | jq .`. As the node-spec-overrides.yaml file grows, `jq` may also be used to ensure that the embedded JSON is valid. For example, because we know the 'devices' and 'dedicated_devices' lists should be the same length, we can use the following to verify that they are the same length before starting the deployment.

In the above example, the node-spec-c05-h17-h21-h25-6048r.yaml file covers three servers in rack c05, in slots h17, h21, and h25, which are missing disks.
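
The validity and length checks described above can also be sketched in Python, which avoids having to remove the two header lines by hand. This is an illustration rather than the method used in the bug: the function name `check_device_lists` and both sample files are hypothetical, and the sketch assumes the file starts with exactly the two YAML header lines.

```python
import json


def check_device_lists(env_text: str) -> dict:
    """Check the JSON embedded in a node-specific environment file.

    Returns {uuid: (len(devices), len(dedicated_devices))} for every node
    that defines both lists with different lengths. Raises
    json.JSONDecodeError if the embedded JSON is not valid.
    """
    # Skip "parameter_defaults:" and "NodeDataLookup: |"; the rest is JSON.
    json_text = "\n".join(env_text.splitlines()[2:])
    nodes = json.loads(json_text)
    mismatched = {}
    for uuid, settings in nodes.items():
        devices = settings.get("devices", [])
        dedicated = settings.get("dedicated_devices")
        # Only compare when the node overrides dedicated_devices as well.
        if dedicated is not None and len(devices) != len(dedicated):
            mismatched[uuid] = (len(devices), len(dedicated))
    return mismatched


# Hypothetical file contents, for illustration only.
good = """parameter_defaults:
  NodeDataLookup: |
    {"32E87B4C-C4A7-418E-865B-191684A6883B":
      {"devices": ["/dev/sdc", "/dev/sdd"],
       "dedicated_devices": ["/dev/sdb", "/dev/sdb"]}}
"""
bad = good.replace('["/dev/sdb", "/dev/sdb"]', '["/dev/sdb"]')

print(check_device_lists(good))
print(check_device_lists(bad))
```

An empty result means every node that sets both lists has matching lengths; any entry in the result names a node whose lists should be fixed before the deployment is started.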

A more complicated example is available at https://bugzilla.redhat.com/attachment.cgi?id=1474376

After the JSON has been validated, add back the two lines that make it a valid environment YAML file ("parameter_defaults:" and "NodeDataLookup: |") and include the file with `-e` in the deployment command.

Comment 8 Yogev Rabl 2018-11-13 21:27:30 UTC

Verified

Comment 9 Kim Nylander 2018-11-13 23:23:05 UTC

Content published here: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/deploying_an_overcloud_with_containerized_red_hat_ceph/#map_disk_layout_non-homogen_ceph