Bug 1903973
| Summary: | [Azure][ROKS] Set SSD tuning (tuneFastDeviceClass) as default for OSD devices in Azure/ROKS platform | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Sahina Bose <sabose> |
| Component: | ocs-operator | Assignee: | Pulkit Kundra <pkundra> |
| Status: | CLOSED ERRATA | QA Contact: | Yuli Persky <ypersky> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.6 | CC: | ebenahar, madam, mbukatov, muagarwa, nberry, ocs-bugs, owasserm, pkundra, ratamir, shberry, sostapov |
| Target Milestone: | --- | Keywords: | AutomationBackLog |
| Target Release: | OCS 4.7.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | 4.7.0-701.ci | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1909793 (view as bug list) | Environment: | |
| Last Closed: | 2021-05-19 09:16:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1909793, 1925004 | | |
Description
Sahina Bose
2020-12-03 09:23:28 UTC
Pulkit, is this addressed by https://github.com/openshift/ocs-operator/pull/955 ? On the 4.6.2 OCS Azure cluster:
(yulidir) [ypersky@qpas ocs-ci]$ oc rsh rook-ceph-tools-6fdd868f75-259zs
sh-4.4# ceph config dump
WHO MASK LEVEL OPTION VALUE RO
global basic log_file *
global advanced mon_allow_pool_delete true
global advanced mon_cluster_log_file
global advanced mon_pg_warn_min_per_osd 0
global advanced osd_pool_default_pg_autoscale_mode on
global advanced rbd_default_features 3
mgr advanced mgr/balancer/active true
mgr advanced mgr/balancer/mode upmap
mgr. advanced mgr/prometheus/rbd_stats_pools ocs-storagecluster-cephblockpool *
mgr.a advanced mgr/dashboard/a/server_addr 10.128.2.11 *
mgr.a advanced mgr/prometheus/a/server_addr 10.128.2.11 *
osd dev bluestore_cache_size 3221225472
osd advanced bluestore_compression_max_blob_size 65536
osd advanced bluestore_compression_min_blob_size 8192
osd advanced bluestore_deferred_batch_ops 16
osd dev bluestore_max_blob_size 65536
osd advanced bluestore_min_alloc_size 4000 *
osd advanced bluestore_prefer_deferred_size 0
osd advanced bluestore_throttle_cost_per_io 4000
osd advanced osd_delete_sleep 0.000000
osd advanced osd_op_num_shards 8 *
osd advanced osd_op_num_threads_per_shard 2 *
osd advanced osd_recovery_sleep 0.000000
osd advanced osd_snap_trim_sleep 0.000000
mds.ocs-storagecluster-cephfilesystem-a basic mds_cache_memory_limit 4294967296
mds.ocs-storagecluster-cephfilesystem-b basic mds_cache_memory_limit 4294967296
sh-4.4#
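A check like the one above can be scripted rather than read by eye. The following is a minimal sketch in Python, assuming the MASK column is empty in every row (as in the dump above) so that each row splits into WHO, LEVEL, OPTION, an optional VALUE and an optional trailing `*`. The function names and the short SAMPLE text are hypothetical and exist only for illustration; the option names are the eight bluestore options listed in the dump.

```python
# Bluestore option names taken from the "osd" rows of the dump above.
EXPECTED_OSD_TUNINGS = (
    "bluestore_cache_size",
    "bluestore_compression_max_blob_size",
    "bluestore_compression_min_blob_size",
    "bluestore_deferred_batch_ops",
    "bluestore_max_blob_size",
    "bluestore_min_alloc_size",
    "bluestore_prefer_deferred_size",
    "bluestore_throttle_cost_per_io",
)

# Option levels used in the LEVEL column of `ceph config dump`.
LEVELS = {"basic", "advanced", "dev"}

# Shortened sample in the same flattened layout as the dump above.
SAMPLE = """\
sh-4.4# ceph config dump
WHO MASK LEVEL OPTION VALUE RO
global advanced mon_allow_pool_delete true
osd dev bluestore_cache_size 3221225472
osd advanced bluestore_min_alloc_size 4000 *
mds.ocs-storagecluster-cephfilesystem-a basic mds_cache_memory_limit 4294967296
"""


def parse_config_dump(text):
    """Parse whitespace-flattened `ceph config dump` rows.

    Assumption: MASK is empty, so a row is WHO LEVEL OPTION [VALUE...] [*].
    Prompt lines and the header are skipped because their second token is
    not a known level.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or tokens[1] not in LEVELS:
            continue
        who, level, option, *rest = tokens
        read_only = bool(rest) and rest[-1] == "*"
        if read_only:
            rest = rest[:-1]
        rows.append({
            "who": who,
            "level": level,
            "option": option,
            "value": " ".join(rest),
            "ro": read_only,
        })
    return rows


def osd_options(rows):
    """Return option -> value for rows that apply to all OSDs."""
    return {r["option"]: r["value"] for r in rows if r["who"] == "osd"}


def missing_tunings(rows, expected):
    """Return the expected option names that have no "osd" row."""
    found = osd_options(rows)
    return sorted(name for name in expected if name not in found)


if __name__ == "__main__":
    for name in missing_tunings(parse_config_dump(SAMPLE), EXPECTED_OSD_TUNINGS):
        print("missing from dump:", name)
```

Feeding the full `ceph config dump` text to `parse_config_dump` and then calling `missing_tunings` gives the list of tunings that are absent from the central configuration.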
It looks like the fix is not applied in 4.7 (please correct me if I am wrong). I did exactly the same as in comment 8 and did not get the same output. In more detail:
1) I used Azure cluster musoni-30 ( https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/1646/ )
2) The OCS version:
(myenv) [ypersky@ypersky auth]$ oc -n openshift-storage get csv
NAME DISPLAY VERSION REPLACES PHASE
ocs-operator.v4.7.0-324.ci OpenShift Container Storage 4.7.0-324.ci Succeeded
(myenv) [ypersky@ypersky auth]$
3) Output of ceph config dump:
(myenv) [ypersky@ypersky auth]$ oc rsh rook-ceph-tools-76bc89666b-xtdj5
sh-4.4#
sh-4.4# ceph config dump
WHO MASK LEVEL OPTION VALUE RO
global basic log_file /var/log/ceph/$cluster-$name.log *
global basic log_to_file true
global advanced mon_allow_pool_delete true
global advanced mon_cluster_log_file
global advanced mon_pg_warn_min_per_osd 0
global advanced osd_pool_default_pg_autoscale_mode on
global advanced osd_scrub_auto_repair true
global advanced rbd_default_features 3
mgr advanced mgr/balancer/active true
mgr advanced mgr/balancer/mode upmap
mgr. advanced mgr/prometheus/rbd_stats_pools ocs-storagecluster-cephblockpool *
mgr.a advanced mgr/prometheus/a/server_addr 10.131.0.38 *
mds.ocs-storagecluster-cephfilesystem-a basic mds_cache_memory_limit 4294967296
mds.ocs-storagecluster-cephfilesystem-b basic mds_cache_memory_limit 4294967296
sh-4.4#
My conclusion: since the following tunings
osd dev bluestore_cache_size 3221225472
osd advanced bluestore_compression_max_blob_size 65536
osd advanced bluestore_compression_min_blob_size 8192
osd advanced bluestore_deferred_batch_ops 16
osd dev bluestore_max_blob_size 65536
osd advanced bluestore_min_alloc_size 4000 *
osd advanced bluestore_prefer_deferred_size 0
osd advanced bluestore_throttle_cost_per_io 4000
do not appear in the "ceph config dump" output, this version (4.7.0-324.ci) does not include the fix.
=> Reopening the bug and changing the status to Assigned.
@Pulkit Kundra, I've verified with 3).
1) Run oc rsh rook-ceph-tools-76bc89666b-xtdj5
2) Run each of the following commands:
ceph config show osd.0
ceph config show osd.1
ceph config show osd.2
In each of the outputs the following parameters appeared:
sh-4.4# ceph config show osd.0
NAME VALUE SOURCE OVERRIDES IGNORES
bluestore_cache_size 3221225472 cmdline
bluestore_compression_max_blob_size 65536 cmdline
bluestore_compression_min_blob_size 8912 cmdline
bluestore_deferred_batch_ops 16 cmdline
bluestore_max_blob_size 65536 cmdline
bluestore_min_alloc_size 4096 cmdline
bluestore_prefer_deferred_size 0 cmdline
bluestore_throttle_cost_per_io 4000 cmdline
sh-4.4# ceph config show osd.1
NAME VALUE SOURCE OVERRIDES IGNORES
bluestore_cache_size 3221225472 cmdline
bluestore_compression_max_blob_size 65536 cmdline
bluestore_compression_min_blob_size 8912 cmdline
bluestore_deferred_batch_ops 16 cmdline
bluestore_max_blob_size 65536 cmdline
bluestore_min_alloc_size 4096 cmdline
bluestore_prefer_deferred_size 0 cmdline
bluestore_throttle_cost_per_io 4000 cmdline
crush_location root=default host=ocs-deviceset-1-data-05flj5 region=eastus zone=eastus-2 cmdline
sh-4.4# ceph config show osd.2
NAME VALUE SOURCE OVERRIDES IGNORES
bluestore_cache_size 3221225472 cmdline
bluestore_compression_max_blob_size 65536 cmdline
bluestore_compression_min_blob_size 8912 cmdline
bluestore_deferred_batch_ops 16 cmdline
bluestore_max_blob_size 65536 cmdline
bluestore_min_alloc_size 4096 cmdline
bluestore_prefer_deferred_size 0 cmdline
bluestore_throttle_cost_per_io 4000 cmdline
=> Changing the bug status to "verified".
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041
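For reference, the verification in this bug used `ceph config show osd.N`, which reports a daemon's running configuration together with the source of each value, while `ceph config dump` lists the monitors' central configuration database. Options passed on the OSD command line therefore appear only in the former. The following is a minimal sketch in Python of checking that a set of options is present with source `cmdline`. It assumes the OVERRIDES and IGNORES columns are empty, as in the outputs quoted in this bug, so the last token of each row is the source. The function names and the shortened SAMPLE_SHOW text are hypothetical.

```python
import re

# Shortened sample in the same flattened layout as the osd.1 output in this bug.
SAMPLE_SHOW = """\
sh-4.4# ceph config show osd.1
NAME VALUE SOURCE OVERRIDES IGNORES
bluestore_cache_size 3221225472 cmdline
bluestore_min_alloc_size 4096 cmdline
crush_location root=default host=ocs-deviceset-1-data-05flj5 region=eastus zone=eastus-2 cmdline
"""

# Option names are lower-case identifiers; this filters out the header row
# and shell prompt lines.
OPTION_NAME = re.compile(r"[a-z0-9_/]+")


def parse_config_show(text):
    """Parse flattened `ceph config show` rows into name -> (value, source).

    Assumption: OVERRIDES and IGNORES are empty, so the last token is SOURCE
    and everything between the name and the source is the value.
    """
    options = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or not OPTION_NAME.fullmatch(tokens[0]):
            continue
        name, *value, source = tokens
        options[name] = (" ".join(value), source)
    return options


def not_from_cmdline(options, expected):
    """Return expected option names that are absent or not set via cmdline."""
    return sorted(
        name for name in expected
        if options.get(name, ("", ""))[1] != "cmdline"
    )


if __name__ == "__main__":
    opts = parse_config_show(SAMPLE_SHOW)
    wanted = ["bluestore_cache_size", "bluestore_min_alloc_size"]
    print("not set via cmdline:", not_from_cmdline(opts, wanted))
```

Running the same check against the output for each OSD would reproduce the manual comparison done for osd.0, osd.1 and osd.2.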