Bug 1924949
| Summary: | [RFE] Allow setting OSD weight using crush reweight | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Neha Ojha <nojha> |
| Component: | ocs-operator | Assignee: | Shachar Sharon <ssharon> |
| Status: | CLOSED ERRATA | QA Contact: | Shrivaibavi Raghaventhiran <sraghave> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 4.6 | CC: | bkunal, danken, ebenahar, jarrpa, madam, muagarwa, ocs-bugs, olakra, owasserm, ratamir, rcyriac, sostapov, ssharon |
| Target Milestone: | --- | Keywords: | AutomationBackLog, FutureFeature |
| Target Release: | OCS 4.8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Enhancement | |
| Doc Text: |
.Ability to provide explicit initial-weight to specific OSD
Customers with an unbalanced cluster, where some OSDs reside on shared devices or have other physical properties that require fine-grained tuning of their load, can now provide an initial weight for an OSD. When the initial weight is lower than the actual physical capacity, the overall load on that OSD should be lower than on other OSDs in the cluster with the same capacity.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-08-03 18:15:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Neha Ojha
2021-02-04 00:18:14 UTC
Asking after internal discussion.
This will likely not only involve ocs-operator but also rook work.
Raz - can you provide a QE ACK?
Acked
This is critically important for a strategic customer. Please prioritize for 4.8.
This has been done as https://issues.redhat.com/browse/KNIP-1616
@muagarwa Please review the revised doc text and share feedback.
LGTM, thanks
Test Environment:
-------------------
GS configuration :
-----------------
* Platform - BM
* Replica 2, compression enabled
* Root OSD weight 0.167 TiB
* Primary affinity for root disks 0
* RBD only enabled
* Total 6 osds in cluster (3 - master root disk, 3 - worker root disk)
Versions:
----------
OCP - 4.8.0-fc.8
OCS - ocs-operator.v4.8.0-450.ci
Observations:
------------------
* Set primary affinity to 0 and initial weight to 167GiB on the root disks at deployment time, in storagecluster.yaml, for each of the storageDeviceSets
* Ran I/O and filled the cluster until it reached nearfull, with one OSD full
* The root-disk OSDs are filled less than the full-disk OSDs, and
the proportion of PGs assigned matches the weight, as expected
HENCE MOVING THE BZ TO VERIFIED STATE
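The test notes give the root-disk weight both as 0.167TiB and as 167GiB, which are not quite the same value. The following is a minimal sketch of the conversion, assuming the usual Ceph convention that a CRUSH weight of 1.0 corresponds to 1 TiB; the function names are hypothetical and exist only for this illustration.

```python
# Assumption: CRUSH weight 1.0 == 1 TiB (the usual Ceph convention).
GIB_PER_TIB = 1024


def gib_to_crush_weight(size_gib: float) -> float:
    """Convert a capacity in GiB to a CRUSH weight expressed in TiB."""
    return size_gib / GIB_PER_TIB


def crush_weight_to_gib(weight: float) -> float:
    """Convert a CRUSH weight expressed in TiB back to GiB."""
    return weight * GIB_PER_TIB


if __name__ == "__main__":
    # 167 GiB expressed as a CRUSH weight
    print(f"167 GiB   -> weight {gib_to_crush_weight(167):.5f}")
    # 0.167 TiB expressed in GiB
    print(f"0.167 TiB -> {crush_weight_to_gib(0.167):.1f} GiB")
```

Under that assumption, 167 GiB is a weight of about 0.163, while 0.167 TiB is about 171 GiB. The weight of 0.16699 reported for the root-disk OSDs in the console output suggests the configured value was 0.167 TiB and that "167GiB" is an approximation.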
Console Output:
-----------------
sh-4.4# ceph -s
  cluster:
    id:     601ba532-40f7-419e-bb30-0b6c995354aa
    health: HEALTH_ERR
            1 backfillfull osd(s)
            1 full osd(s)
            1 nearfull osd(s)
            1 pool(s) full
            1/3 mons down, quorum a,c
  services:
    mon: 3 daemons, quorum a,c (age 13m), out of quorum: b
    mgr: a(active, since 5d)
    osd: 6 osds: 6 up (since 29h), 6 in (since 5d)
  data:
    pools:   1 pools, 256 pgs
    objects: 766.29k objects, 2.9 TiB
    usage:   5.8 TiB used, 1.7 TiB / 7.5 TiB avail
    pgs:     256 active+clean
sh-4.4# ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META      AVAIL    %USE   VAR   PGS  STATUS
 1  hdd    0.16699  1.00000   335 GiB  151 GiB  150 GiB   28 KiB  1024 MiB  183 GiB  45.23  0.59   13  up
 4  hdd    2.18120  1.00000   2.2 TiB  1.9 TiB  1.9 TiB  170 KiB   3.4 GiB  334 GiB  85.06  1.11  164  up
 0  hdd    0.16699  1.00000   335 GiB  105 GiB  104 GiB   16 KiB  1024 MiB  230 GiB  31.31  0.41    9  up
 3  hdd    2.18120  1.00000   2.2 TiB  1.8 TiB  1.8 TiB  171 KiB   3.2 GiB  355 GiB  84.11  1.09  163  up
 5  hdd    2.18120  1.00000   2.2 TiB  1.7 TiB  1.7 TiB  171 KiB   3.0 GiB  452 GiB  79.74  1.04  154  up
 2  hdd    0.16699  1.00000   335 GiB  105 GiB  104 GiB   64 KiB  1024 MiB  230 GiB  31.39  0.41    9  up
           TOTAL              7.5 TiB  5.8 TiB  5.8 TiB  622 KiB    13 GiB  1.7 TiB  76.85
MIN/MAX VAR: 0.41/1.11  STDDEV: 29.63
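The observation that the PG distribution follows the weights can be checked against the `ceph osd df` output above. The sketch below is a simplification that assumes PG replicas are spread strictly in proportion to CRUSH weight; real CRUSH placement is pseudo-random and also subject to failure-domain constraints, so only a rough match is expected. The weights and PG counts are copied from the output; the helper name `expected_pgs` is hypothetical.

```python
# (CRUSH weight, observed PGS) per OSD id, copied from the `ceph osd df` output.
osds = {
    0: (0.16699, 9),
    1: (0.16699, 13),
    2: (0.16699, 9),
    3: (2.18120, 163),
    4: (2.18120, 164),
    5: (2.18120, 154),
}


def expected_pgs(osds):
    """Split the total PG replica count across OSDs in proportion to weight.

    Assumption: an idealized, purely weight-proportional distribution.
    """
    total_weight = sum(weight for weight, _ in osds.values())
    total_pgs = sum(pgs for _, pgs in osds.values())
    return {
        osd_id: total_pgs * weight / total_weight
        for osd_id, (weight, _) in osds.items()
    }


if __name__ == "__main__":
    expected = expected_pgs(osds)
    for osd_id in sorted(osds):
        weight, observed = osds[osd_id]
        print(
            f"osd.{osd_id}: weight={weight:.5f} "
            f"expected={expected[osd_id]:6.1f} observed={observed}"
        )
```

The PGS column sums to 512, which is consistent with 256 PGs at replica 2. Under the proportional assumption, each root-disk OSD would hold roughly 12 PG replicas and each full-disk OSD roughly 159, against observed values of 9 to 13 and 154 to 164.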
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:3003