Description of problem: While running RGW system tests with a large number of evenly sized objects, we noticed uneven distribution and usage of disk across OSDs, even though all the disks are of the same weight and size. Since we used the default CRUSH settings when setting up the cluster, I dug into which other parameters might be set wrong and noticed that the CRUSH profile is set to firefly by default instead of optimal. I believe optimal should be set for fresh installs; upgrade cases will be tricky and are better left to the admin.

I am not sure the uneven distribution is related only to the firefly profile, but as soon as I set the profile to optimal, a large number of objects were reported as misplaced:

    318169/515498 objects misplaced (61.721%)
    v8650: 1216 pgs: 377 active+recovery_wait+degraded, 110 active+remapped+wait_backfill, 5 active+remapped+backfilling, 724 active+clean; 407 GB data, 1238 GB used, 2774 GB / 4013 GB avail; 25000 kB/s wr, 60 op/s; 5107/515498 objects degraded (0.991%); 318169/515498 objects misplaced (61.721%); 339 MB/s, 88 objects/s recovering

With the default firefly CRUSH settings, I have seen one of the OSDs fill up much sooner than the others, which could cause a disk-full condition while the cluster still has room, unless the CRUSH tunables are reset, which in turn causes a lot of data movement.
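To make the symptom concrete, here is a small sketch that quantifies the imbalance. The misplaced fraction is taken directly from the status line above; the per-OSD utilization figures are hypothetical stand-ins for the %USE column of `ceph osd df`, not values from this cluster:

```python
# Quantify the imbalance described above.
from statistics import mean, pstdev

# Misplaced fraction, straight from the "ceph -s" output above.
misplaced, total = 318169, 515498
print(f"misplaced: {misplaced / total:.3%}")  # matches the reported 61.721%

# Hypothetical %USE values for equally weighted disks (illustration only):
# one OSD gets disproportionately more data and will fill up first.
osd_use = [24.0, 26.5, 23.8, 41.2, 25.1, 24.9]

print(f"mean use:  {mean(osd_use):.1f}%")
print(f"stddev:    {pstdev(osd_use):.1f}")
print(f"spread:    {max(osd_use) - min(osd_use):.1f} points")
```

A spread this wide on equally weighted OSDs is the signal that placement, not capacity, is the problem.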
We are not changing the crush tunables with ceph-ansible. Does that mean we should? I'd think this should reside in Ceph itself.
Sage recently set this to hammer in https://github.com/ceph/ceph/pull/14959, so this will be in Luminous / RHCEPH 3.0.
Hmm, I'd actually like to set the luminous (3.0) defaults to jewel tunables; that's the last disruptive tunable we added (chooseleaf_stable) that requires lots of data movement to adjust/fix. We should confirm that the RHEL kernel has that support backported, though, and probably document which kernel it is. For jewel downstream it's pretty similar: we don't care so much about old userspace clients connecting to a new cluster, but we do want to make sure RHEL clients can connect. In that case I'd suggest changing it downstream, though, and not modifying upstream jewel this late in its lifecycle.
Sage updated the default tunables again to jewel. https://github.com/ceph/ceph/pull/15370 Ilya, what should we document here for RHCEPH 3.0's RHEL kernel version requirements?
Default values:

    [cephuser@ceph-jenkins-build-run236-node8-rgw ~]$ sudo ceph osd crush show-tunables
    {
        "choose_local_tries": 0,
        "choose_local_fallback_tries": 0,
        "choose_total_tries": 50,
        "chooseleaf_descend_once": 1,
        "chooseleaf_vary_r": 1,
        "chooseleaf_stable": 1,
        "straw_calc_version": 1,
        "allowed_bucket_algs": 54,
        "profile": "jewel",
        "optimal_tunables": 1,
        "legacy_tunables": 0,
        "minimum_required_version": "jewel",
        "require_feature_tunables": 1,
        "require_feature_tunables2": 1,
        "has_v2_rules": 0,
        "require_feature_tunables3": 1,
        "has_v3_rules": 0,
        "has_v4_buckets": 1,
        "require_feature_tunables5": 1,
        "has_v5_rules": 0
    }
    [cephuser@ceph-jenkins-build-run236-node8-rgw ~]$ sudo ceph --version
    ceph version 12.2.0-2.el7cp (3137b4f525c5dcc2a34fef5b0f6bcf4477312db9) luminous (rc)
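For anyone reading the output above: `allowed_bucket_algs` is a bitmask over the CRUSH bucket algorithm IDs defined in Ceph's crush.h (1 = uniform, 2 = list, 3 = tree, 4 = straw, 5 = straw2). A small sketch to decode it:

```python
# Decode the allowed_bucket_algs bitmask reported by
# `ceph osd crush show-tunables`. Algorithm IDs per Ceph's crush.h.
ALGS = {1: "uniform", 2: "list", 3: "tree", 4: "straw", 5: "straw2"}

def decode_bucket_algs(mask: int) -> list[str]:
    """Return the bucket algorithms whose bit is set in the mask."""
    return [name for alg_id, name in ALGS.items() if mask & (1 << alg_id)]

print(decode_bucket_algs(54))  # ['uniform', 'list', 'straw', 'straw2']
```

So the jewel profile's value of 54 permits every bucket type except tree, and the presence of straw2 is what the `has_v4_buckets` flag refers to.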
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387