Bug 1636508 - [RFE] Add class/pool/crush_rule creation during deployment
Summary: [RFE] Add class/pool/crush_rule creation during deployment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.2
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: z4
Target Release: 3.3
Assignee: Dimitri Savineau
QA Contact: Vasishta
Docs Contact: Ranjini M N
URL:
Whiteboard:
Depends On:
Blocks: 1578730 1726135 1793525 1812927 1822705
 
Reported: 2018-10-05 14:57 UTC by Randy Martinez
Modified: 2023-09-07 19:25 UTC
CC: 25 users

Fixed In Version: RHEL: ceph-ansible-3.2.39-1.el7cp Ubuntu: ceph-ansible_3.2.39-2redhat1
Doc Type: Enhancement
Doc Text:
.The new `device_class` Ansible configuration option
With the `device_class` feature, you can reduce post-deployment configuration by describing the desired layout in the `group_vars/osds.yml` file. This feature offers multi-backend support without having to comment out sections of the configuration after deploying Red Hat Ceph Storage.
Clone Of:
: 1812927
Environment:
Last Closed: 2020-04-06 08:27:04 UTC
Embargoed:
rmandyam: needinfo+


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-ansible pull 4703 0 None closed ceph-osd: add device class to crush rules 2021-01-28 17:43:47 UTC
Github ceph ceph-ansible pull 4743 0 None closed ceph-osd: add device class to crush rules (bp #4703) 2021-01-28 17:43:47 UTC
Red Hat Product Errata RHBA-2020:1320 0 None None None 2020-04-06 08:27:45 UTC

Description Randy Martinez 2018-10-05 14:57:02 UTC
RFE:

We would like to see ceph-ansible take advantage of device classes, which are new in RHCS 3. Once the OSDs have been deployed and their classes are reported, CRUSH rules should be created that `step take` a specific class, and pools should then be created with those rules, so that a multi-backend deployment requires no additional post-deployment configuration. This will bring the new feature to completeness in ceph-ansible and ultimately enhance the end-user experience.
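
For reference, the post-deployment steps this RFE asks ceph-ansible to automate look roughly like the following on a cluster that already has OSDs up (the rule name, pool name, and PG counts below are purely illustrative, not proposed defaults):
....
# List the device classes reported by the deployed OSDs
ceph osd crush class ls

# Create a replicated CRUSH rule restricted to a single device class
# (rule name, root, failure domain, and class are illustrative)
ceph osd crush rule create-replicated replicated_hdd_ruleset default host hdd

# Create a pool bound to that rule
ceph osd pool create rbd 128 128 replicated replicated_hdd_ruleset
....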

Comment 3 Randy Martinez 2018-10-06 06:06:21 UTC
= Documentation: How to deploy a multi-backend (hdd|ssd) cluster with ceph-ansible

.Edit group_vars/osds.yml with the desired layout:
....
create_rbd_pools: true
rbd:
  name: "rbd"
  pg_num: 128
  pgp_num: 128
  rule_name: "replicated_hdd_ruleset"
  type: "replicated"
  device_class: "hdd"
rbd_ssd:
  name: "rbd_ssd"
  pg_num: 128
  pgp_num: 128
  rule_name: "replicated_ssd_ruleset"
  type: "replicated"
  device_class: "ssd"
rbd_osd_erasure:
  name: "rbd_osd_erasure"
  pg_num: 128
  pgp_num: 128
  rule_name: ""
  type: "erasure"
  erasure_profile: ""
  device_class: "hdd"
pools:
  - "{{ rbd }}"
  - "{{ rbd_osd }}"
  - "{{ rbd_osd_erasure }}"

crush_rule_config: true
crush_rule_hdd:
  name: replicated_hdd_ruleset
  root: default
  type: host
  device_class: hdd
  default: false
crush_rule_ssd:
  name: replicated_ssd_ruleset
  root: default
  type: rack
  device_class: ssd
  default: true
crush_rules:
  - "{{ crush_rule_hdd }}"
  - "{{ crush_rule_ssd }}"
create_crush_tree: true
....
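
Once the playbook has run, a quick sanity check of the generated rules can be done on a monitor node; a minimal sketch, assuming the rule names above:
....
# List all CRUSH rules known to the cluster
ceph osd crush rule ls

# Inspect a single rule; a class-aware rule takes from the class shadow
# root (for example "default~hdd") rather than from "default" directly
ceph osd crush rule dump replicated_hdd_ruleset
....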

.Sample inventory file to assign CRUSH locations (root/rack/host):
....
[mons]
osd[4:6]

[osds]
osd1 osd_crush_location="{ 'root': 'default', 'rack': 'rack1', 'host': 'osd1' }"
osd2 osd_crush_location="{ 'root': 'default', 'rack': 'rack1', 'host': 'osd2' }"
osd3 osd_crush_location="{ 'root': 'default', 'rack': 'rack2', 'host': 'osd3' }"
osd4 osd_crush_location="{ 'root': 'default', 'rack': 'rack2', 'host': 'osd4' }"
osd5 devices="['/dev/sda', '/dev/sdb']" osd_crush_location="{ 'root': 'default', 'rack': 'rack3', 'host': 'osd5' }"
osd6 devices="['/dev/sda', '/dev/sdb']" osd_crush_location="{ 'root': 'default', 'rack': 'rack3', 'host': 'osd6' }"

[mgrs]
osd[4:6]
....
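
With the group_vars and inventory in place, deployment is the usual playbook run; a sketch, assuming the inventory file is named `hosts` and the sample playbooks have been copied to `site.yml` / `site-docker.yml`:
....
# Bare-metal deployment
ansible-playbook -vv -i hosts site.yml

# Containerized deployment (ceph-ansible 3.x)
ansible-playbook -vv -i hosts site-docker.yml
....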


.The resulting OSD tree will look like this:
....
TYPE NAME

root default                               
     rack rack1                             
        host osd1                          
             osd.0
             osd.10
        host osd2                          
             osd.3 
             osd.7  
             osd.12 
     rack rack2                             
        host osd3                          
             osd.1
             osd.6 
             osd.11  
        host osd4                          
             osd.4
             osd.9
             osd.13
     rack rack3                             
         host osd5                          
             osd.2
             osd.8
         host osd6                          
             osd.14
             osd.15
....
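
On the live cluster, a comparable view can be pulled with the standard tree commands; `--show-shadow` also lists the per-class shadow roots (such as `default~hdd`) that class-aware rules select from:
....
# Plain CRUSH tree, as abbreviated above
ceph osd tree

# Include the per-device-class shadow trees
ceph osd crush tree --show-shadow
....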

.Validation of pools
....
# for i in $(rados lspools); do echo "pool: $i"; ceph osd pool get "$i" crush_rule; done

pool: rbd
crush_rule: replicated_hdd_ruleset
pool: rbd_ssd
crush_rule: replicated_ssd_ruleset
pool: rbd_osd_erasure
crush_rule: erasure-code
....
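
To confirm which OSDs back each class (and therefore each ruleset), the device-class listing commands can be used as well; a small sketch:
....
# List the OSDs assigned to a given device class
ceph osd crush class ls-osd hdd
ceph osd crush class ls-osd ssd
....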

Comment 6 Giridhar Ramaraju 2019-08-05 13:10:40 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting it to the appropriate QE Associate.

Regards,
Giri

Comment 26 Vasishta 2020-03-27 14:06:33 UTC
Hi Dimitri,

Need your help to understand the scope of verification of this BZ.

1) I don't see any option to create pools as mentioned in the summary / Comment 3. Though it has not come up for discussion in later comments, I don't see anywhere that pool creation has been excluded from the scope of this BZ.

2) class - Does it get created by default on the Ceph side?
I tried to create a CRUSH rule for ssd; the playbook initially failed because no class called ssd existed (I did not have an SSD device on any node). (A manual way to check or assign classes is sketched after this list.)

3) Even though I had specified "osd_scenario="collocated"  osd_crush_location="{ 'rack': 'added_in3x', 'host': 'mag_117' }"" in the inventory, the OSDs were added under the "magna117" host; I'm not sure why osd_crush_location is not being enforced properly on the nodes.
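
Regarding point 2: device classes are normally assigned automatically when an OSD is created, based on the detected media type, so a class such as ssd only appears once at least one OSD reports it. A manual check/override, for illustration only (the OSD id below is arbitrary):
....
# Show which device classes currently exist
ceph osd crush class ls

# Force a class onto a specific OSD (any existing class must be removed first)
ceph osd crush rm-device-class osd.2
ceph osd crush set-device-class ssd osd.2
....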

I'll be attaching the complete logs (logs of 2 runs) and other details like the inventory file, all.yml, osds.yml, and versions.


Please help me understand the scope of verification of this BZ.

Regards,
Vasishta Shastry
QE, Ceph

Comment 30 errata-xmlrpc 2020-04-06 08:27:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1320

