Bug 1657883 - change default partition size for db when using dedicated scenario
Summary: change default partition size for db when using dedicated scenario
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 3.3
Assignee: Guillaume Abrioux
QA Contact: Yogesh Mane
URL:
Whiteboard:
Depends On:
Blocks: 1629656 1641792
 
Reported: 2018-12-10 16:26 UTC by Vasu Kulkarni
Modified: 2019-08-15 15:06 UTC
CC: 21 users

Fixed In Version: RHEL: ceph-ansible-3.2.17-1.el7cp Ubuntu: ceph-ansible_3.2.17-2redhat1
Doc Type: Known Issue
Doc Text:
.When using dedicated devices for BlueStore, the default sizes for _block.db_ and _block.wal_ might be too small
By default, `ceph-ansible` does not override the default values of `bluestore block db size` and `bluestore block wal size`, which are 1 GB and 576 MB respectively. These sizes might be too small when using dedicated devices with BlueStore. To work around this issue, set `bluestore_block_db_size` or `bluestore_block_wal_size`, or both, in `ceph.conf` using `ceph_conf_overrides` to override the default values.
Clone Of:
Environment:
Last Closed: 2019-07-09 10:54:31 UTC
Embargoed:



Description Vasu Kulkarni 2018-12-10 16:26:29 UTC
Description of problem:

When configuring the default DB size using the dedicated scenario, only 1 GB is used for the DB partition, which could be insufficient. Ideally it should be >4% of the OSD size, as defined in the documentation.


nvme0n1      259:1    0 745.2G  0 disk  
├─nvme0n1p1  259:2    0     1G  0 part  
├─nvme0n1p2  259:3    0   576M  0 part 

Also, the name "dedicated scenario" looks more like a filestore setting; we should change this to db and wal, as we specify in ceph-volume. (I can raise a separate BZ for this if required.)

Comment 3 John Harrigan 2018-12-10 18:48:20 UTC
The group_vars/all.yml file contains this

  all.yml:osd_objectstore: bluestore
  all.yml:#    bluestore block db size: 14336000000
  all.yml:#    bluestore block wal size: 2048000000

so I would have expected these values (14GB and 2GB) to be used.
The ceph.conf file on the deployed cluster OSD nodes contains no additional
information on block db or block wal size.

The cluster was deployed using these software versions:
RHEL 7.6
ansible 2.6.7
ceph version 12.2.8-49.el7cp
ceph-ansible.noarch 3.2.0-0.1.rc8.el7cp

Comment 4 John Harrigan 2018-12-10 19:36:25 UTC
(In reply to John Harrigan from comment #3)

> The group_vars/all.yml file contains this
> 
>   all.yml:osd_objectstore: bluestore
>   all.yml:#    bluestore block db size: 14336000000
>   all.yml:#    bluestore block wal size: 2048000000
> 
> so I would have expected these values (14GB and 2GB) to be used.
> The ceph.conf file on the deployed cluster OSD nodes contain no additional
> information on block db or block wal size.
> 
> The cluster was deployed using these software versions:
> RHEL 7.6
> ansible 2.6.7
> ceph version 12.2.8-49.el7cp
> ceph-ansible.noarch 3.2.0-0.1.rc8.el7cp

My bad.
The excerpt above from 'all.yml' was edited on this cluster. It is not reflective of
the contents of all.yml.sample.

The issue does still remain, however, that the default partition sizes for the DB (1 GB)
and WAL (576 MB) are small.

Comment 5 seb 2018-12-11 08:52:26 UTC
Send the whole group_vars.
This BZ definitely lacks info.

Again, the minimum required info is the group_vars, inventory files, and ansible play logs.

Why do we have to ask for these each time again and again?
I guess at some point we will stop responding to BZ with no info like this one.

Comment 6 Ram Raja 2018-12-11 16:41:44 UTC
(In reply to Vasu Kulkarni from comment #0)
> Description of problem:
> 
> When configuring default db size using dedicated scenario, 

Dedicated scenario? I guess you're talking about 'osd_scenario: non-collocated'
and 'osd_objectstore: bluestore' in this BZ

> only 1GB is used
> for DB size which could be insufficient, Ideally it should use >4%
> percentage of OSD size as defined in document

Which document? Can you please share the link to the doc and mention the
exact section of the doc?

> 
> 
> nvme0n1      259:1    0 745.2G  0 disk  
> ├─nvme0n1p1  259:2    0     1G  0 part  
> ├─nvme0n1p2  259:3    0   576M  0 part 
> 
> Also the name dedicated scenario looks more of a filestore setting, we
> should change this to db and wal as we specify in ceph-volume( I can raise a
> separate bz for this if required)

I don't follow here. Which ceph-ansible option do you want to rename? Can you be
explicit?

Also please provide the details requested in Comment 5.

Comment 8 Vasu Kulkarni 2018-12-11 17:50:00 UTC
Sorry for not providing all the info here. I was just talking about the default values that get set when using the non-collocated scenario:

osd-objectstore: bluestore
osd_scenario: non-collocated
osd_objectstore: bluestore
dmcrypt: true
devices:
  - /dev/sdb
  - /dev/sdc
dedicated_devices:
  - /dev/nvme0n1
  - /dev/nvme0n1


The partitions which get created by default are 1 GB (DB) and 576 MB (WAL), as shown below:

nvme0n1      259:1    0 745.2G  0 disk  
├─nvme0n1p1  259:2    0     1G  0 part  
├─nvme0n1p2  259:3    0   576M  0 part 


What we are asking is: let's not default to 1 GB for each dedicated db/wal, and instead use more of the NVMe space per OSD.

I will clear the needinfo when I update the group vars here.

Comment 9 Vasu Kulkarni 2018-12-11 18:09:37 UTC
Here is OSD.yml; I hope the config below helps. We are just talking about the default value that gets set for DB/WAL - maybe somewhere here: https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/tasks/scenarios/non-collocated.yml#L74

osd_scenario: non-collocated
devices:
  - /dev/sdb
  - /dev/sdc
  - /dev/sdd
  - /dev/sde
  - /dev/sdf
  - /dev/sdg
  - /dev/sdh
  - /dev/sdi
  - /dev/sdj
  - /dev/sdk
  - /dev/sdl
  - /dev/sdm
  - /dev/sdn
  - /dev/sdo
  - /dev/sdp
  - /dev/sdq
  - /dev/sdr
  - /dev/sds
  - /dev/sdt
  - /dev/sdu
  - /dev/sdv
  - /dev/sdw
  - /dev/sdx
  - /dev/sdy
  - /dev/sdz
dedicated_devices:
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme0n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1
  - /dev/nvme1n1


all.yml 
----
fetch_directory: ~/ceph-ansible-keys
ceph_origin: distro 
ceph_repository: rhcs
ceph_stable: true
ceph_stable_release: luminous
ceph_stable_rh_storage: true
upgrade_ceph_packages: True
ceph_rhcs_version: 3
journal_size: 1024
monitor_address_block: 172.16.0.0/16
osd_auto_discovery: false
public_network: 172.16.0.0/16
cluster_network: 172.17.0.0/16


The inventory file is a bunch of [osds], [rgws], etc. groups; nothing special there.

Comment 10 Guillaume Abrioux 2018-12-11 19:12:57 UTC
@Vasu

The ceph-disk prepare subcommand in the CLI doesn't allow specifying the block.db/block.wal sizes.
The only way is to set `bluestore block db size` and/or `bluestore block wal size` in ceph.conf with the `ceph_conf_overrides` in ceph-ansible.
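
For illustration, a minimal group_vars sketch of that override (the `osd` section placement and the byte values below are assumptions for illustration, not recommendations):

  ceph_conf_overrides:
    osd:
      bluestore block db size: 42949672960   # 40 GB in bytes (illustrative)
      bluestore block wal size: 2147483648   # 2 GB in bytes (illustrative)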

Comment 11 Vasu Kulkarni 2018-12-11 19:33:30 UTC
@Guillaume

So what the BZ is asking is to set the default values to something useful instead of the default 1 GB; 50 GB makes more sense if we don't want to use the >4% rule.

Comment 12 John Brier 2018-12-11 19:49:17 UTC
A request to add this to the Release Notes has been made. Please fill out the Doc Text using the cause, consequence, workaround, result format and we will ensure it gets in the Release Notes.

Comment 13 Ram Raja 2018-12-12 14:35:36 UTC
(In reply to Vasu Kulkarni from comment #11)
> @Guillaume
> 
> So what the bz is asking is to set the default values to something useful
> instead of default 1gb, a 50GB makes more sense if we dont want to use the
> >4% rule.

ceph-disk sets
* the default size for bluestore's block DB as 1GB
  See this commit,
  https://github.com/ceph/ceph/commit/2a5cd5dc1e17eef0

* the default size of bluestore's block WAL as 576MB
  See this commit,
  https://github.com/ceph/ceph/commit/7a5051af2f5fcc15ddcef348c

I can see that upstream documentation says this about block DB size requirement,
"It is recommended that the block.db size isn’t smaller than 4% of block. For example,
if the block size is 1TB, then block.db shouldn’t be less than 40GB."
http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/#sizing
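
As a rough worked illustration of that guideline (the data-disk size below is assumed, not taken from this cluster):

  # 4% guideline, assuming 1 TB data disks (hypothetical):
  #   block.db >= 0.04 * 1 TB = 40 GB per OSD
  #   12 OSDs sharing one NVMe (as in comment 9) -> 12 * 40 GB = 480 GB on a ~745 GB NVMe
  # compared with the ceph-disk defaults of 1 GB (DB) and 576 MB (WAL)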

How did you come up with a minimum 50GB for block DB size?

Is there a minimum size requirement for block WAL?

Are the minimum size requirements for block DB and block WAL mentioned in the
downstream documentation? If not, we should do that. And we need to specify how
to set the block DB and block WAL sizes through ceph-ansible (using
ceph_conf_overrides variable), when deploying bluestore OSD devices with ceph-disk.

Comment 16 Sébastien Han 2018-12-12 16:40:14 UTC
Drew, I'm confused, is 3.3 a thing? I thought we would jump from 3.2 to 4.0?
Am I missing something?

Comment 17 John Brier 2018-12-12 17:05:55 UTC
(In reply to Ram Raja from comment #13)
> Are the minimum size requirements for block DB and block WAL mentioned in the
> downstream documentation? If not, we should do that. And we need to specify
> how
> to set the block DB and block WAL sizes through ceph-ansible (using
> ceph_conf_overrides variable), when deploying bluestore OSD devices with
> ceph-disk.

I searched 'site:redhat.com' for `bluestore_block_db_size` and/or `bluestore_block_wal_size` and only got bugzilla hits, no doc hits.

We describe the WAL and DB devices in this Bluestore section:

https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/administration_guide/#bluestore-devices


No sizing recommendations though.

Comment 18 Vasu Kulkarni 2018-12-12 17:36:32 UTC
(In reply to Ram Raja from comment #13)
> (In reply to Vasu Kulkarni from comment #11)
> > @Guillaume
> > 
> > So what the bz is asking is to set the default values to something useful
> > instead of default 1gb, a 50GB makes more sense if we dont want to use the
> > >4% rule.
> 
> ceph-disk sets
> * the default size for bluestore's block DB as 1GB
>   See this commit,
>   https://github.com/ceph/ceph/commit/2a5cd5dc1e17eef0
> 
> * the default size of bluestore's block WAL as 576MB
>   See this commit,
>   https://github.com/ceph/ceph/commit/7a5051af2f5fcc15ddcef348c
> 
> I can see that upstream documentation says this about block DB size
> requirement,
> "It is recommended that the block.db size isn’t smaller than 4% of block.
> For example,
> if the block size is 1TB, then block.db shouldn’t be less than 40GB."
> http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/#sizing
> 
> How did you come up with a minimum 50GB for block DB size?
I was just saying that instead of the current 1 GB default value, it could be 50 GB, which would help a bit, but the ideal situation
is to use a good calculation.

> 
> Is there a minimum size requirement for block WAL?
Neha/Josh should help you here.

> 
> Are the minimum size requirements for block DB and block WAL mentioned in the
> downstream documentation? If not, we should do that. And we need to specify
> how
> to set the block DB and block WAL sizes through ceph-ansible (using
> ceph_conf_overrides variable), when deploying bluestore OSD devices with
> ceph-disk.

Based on other replies, I think it's not worth the fixing + testing cycle here. Once the docs are
updated I would rather close this as fixed and focus my energy on better things :)

Also marking this as low priority.

Comment 19 John Brier 2018-12-12 18:07:17 UTC
If you want the docs updated please open a BZ on component Documentation and provide the minimum requirements/recommendations you want added and/or any calculations you want added.

Comment 20 Vasu Kulkarni 2018-12-12 18:21:37 UTC
@john,

For the doc text you will have to contact Josh or Neha; they can tell you the right values to use. Since I made this a docs BZ, I am bumping up the priority.

Comment 21 John Harrigan 2018-12-12 18:36:29 UTC
The doc changes discussed here should be coordinated with this related BZ

https://bugzilla.redhat.com/show_bug.cgi?id=1622597

Comment 22 Ram Raja 2018-12-13 10:21:29 UTC
Josh,

When ceph-ansible deploys bluestore OSDs using ceph-disk, where the block DB and WAL partitions
are created on a dedicated device [1], default partitions of 1 GB [2] and 576 MB [3] are created
for the DB and WAL respectively. The default DB partition size does not look ideal. As per the
documentation, it's recommended that it be at least 4% of the block size [4]. Is there a minimum
recommendation for the WAL size?

Once we know the minimum requirements, we can document how to set these DB and WAL partition sizes
via ceph-ansible.


[1] http://docs.ceph.com/ceph-ansible/stable-3.2/osds/scenarios.html#non-collocated

[2] https://github.com/ceph/ceph/commit/2a5cd5dc1e17eef0

[3] https://github.com/ceph/ceph/commit/7a5051af2f5fcc15ddcef348c

[4] http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/#sizing

Comment 23 Josh Durgin 2018-12-13 18:33:20 UTC
(In reply to Ram Raja from comment #22)
> Josh,
> 
> When ceph-ansible deploys bluestore OSDs using ceph disk, where block DB and
> WAL partitions
> are created on a dedicated device [1], default partitions of 1GB [2]  and
> 576MB [3] are created
> for DB and WAL respectively. The default DB partition size does not look
> ideal. As per the documentation,
> it's recommended that it be at least 4% of the block size [4]. Is there a
> minimum recommendation for
> WAL size?
> 
> Once we know the minimum requirements, we can document how we can set these
> DB and WAL partition sizes
> via ceph-ansible.
> 
> 
> [1] http://docs.ceph.com/ceph-ansible/stable-3.2/osds/scenarios.html#non-collocated
> 
> [2] https://github.com/ceph/ceph/commit/2a5cd5dc1e17eef0
> 
> [3] https://github.com/ceph/ceph/commit/7a5051af2f5fcc15ddcef348c
> 
> [4] http://docs.ceph.com/docs/luminous/rados/configuration/bluestore-config-ref/#sizing

WAL is included in the DB if there is no separate partition - this is what we recommend for 2-device-type deployments (3 device types are much less common).

So for the common case of HDD + SSD, the sizing guideline for the DB partition is all we need.

If a deployment does have an even faster device (e.g. persistent memory) for the WAL, then a 3rd partition makes sense, and the sizing guidelines for the WAL are similar to filestore journals - there is no minimum; how much space is appropriate depends on the throughput of the slower devices. Generally 5-10 GB is plenty.
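
Following that guidance, a hedged group_vars sketch (values are illustrative only, applied via `ceph_conf_overrides` as described in comment 10):

  ceph_conf_overrides:
    osd:
      # HDD + SSD (2 device types): size only the DB; the WAL lives inside it.
      bluestore block db size: 42949672960    # e.g. ~4% of a 1 TB data disk
      # Only with a 3rd, faster WAL device (e.g. persistent memory), also set:
      # bluestore block wal size: 5368709120  # ~5 GB, per the 5-10 GB guidance above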

