Bug 1715577 - [Consulting] Ceph Balancer not working with EC/upmap configuration
Summary: [Consulting] Ceph Balancer not working with EC/upmap configuration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 3.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 3.3
Assignee: Neha Ojha
QA Contact: Manohar Murthy
Docs Contact: Aron Gunn
URL:
Whiteboard:
Duplicates: 1721231
Depends On:
Blocks: 1725227 1726135
Reported: 2019-05-30 17:03 UTC by tbrekke
Modified: 2019-08-21 15:11 UTC (History)

Fixed In Version: RHEL: ceph-12.2.12-39.el7cp Ubuntu: ceph_12.2.12-36redhat1xenial
Doc Type: Bug Fix
Doc Text:
.The Ceph Balancer now works with erasure-coded pools
The `maybe_remove_pg_upmaps` method is meant to cancel invalid placement group items done by the `upmap` balancer, but this method incorrectly canceled valid placement group items when using erasure-coded pools. This caused a utilization imbalance on the OSDs. With this release, the `maybe_remove_pg_upmaps` method is less aggressive and does not invalidate valid placement group items, and as a result, the `upmap` balancer works with erasure-coded pools.
Clone Of:
Environment:
Last Closed: 2019-08-21 15:11:09 UTC
Embargoed:


Attachments
osd_dump (428.83 KB, text/plain)
2019-05-30 17:03 UTC, tbrekke
osd_map (707.86 KB, application/octet-stream)
2019-05-30 17:04 UTC, tbrekke


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 4267482 | 0 | Troubleshoot | None | Ceph - Mgr balancer upmap mode does not work with EC pools | 2019-07-05 23:18:08 UTC
Red Hat Product Errata | RHSA-2019:2538 | 0 | None | None | None | 2019-08-21 15:11:26 UTC

Internal Links: 1734583

Description tbrekke 2019-05-30 17:03:45 UTC
Created attachment 1575307
osd_dump

Description of problem:

After 356 PG changes from the balancer, no more changes are being applied, even though the cluster is nowhere near balanced.

MIN/MAX VAR: 0.68/1.35  STDDEV: 4.93
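
For context, the line above has the shape of the summary printed by `ceph osd df`, where VAR is an OSD's utilization divided by the cluster average. The sketch below only illustrates how such figures relate to per-OSD utilization. It assumes equal-sized OSDs with a reweight of 1.0, so the average is a simple mean and the deviation is unweighted. The function name `utilization_spread` and the sample values are hypothetical and are not taken from this cluster.

```python
import math
from typing import Dict, Tuple


def utilization_spread(util_percent: Dict[int, float]) -> Tuple[float, float, float]:
    """Return (min VAR, max VAR, STDDEV) for a map of OSD id -> utilization %.

    Assumption: every OSD has the same size and a reweight of 1.0, so the
    cluster average is the plain mean of the per-OSD utilizations.
    """
    values = list(util_percent.values())
    avg = sum(values) / len(values)
    # VAR: each OSD's utilization relative to the average (1.0 means "exactly average").
    ratios = [u / avg for u in values]
    # STDDEV: population standard deviation of utilization, in percentage points.
    stddev = math.sqrt(sum((u - avg) ** 2 for u in values) / len(values))
    return min(ratios), max(ratios), stddev


# Hypothetical three-OSD cluster at 40%, 50%, and 60% full.
lo, hi, sd = utilization_spread({0: 40.0, 1: 50.0, 2: 60.0})
print(f"MIN/MAX VAR: {lo:.2f}/{hi:.2f}  STDDEV: {sd:.2f}")
# prints "MIN/MAX VAR: 0.80/1.20  STDDEV: 8.16"
```

Read this way, a MIN/MAX VAR of 0.68/1.35 means the emptiest OSD sits at roughly two thirds of the average utilization and the fullest at about one third above it.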

Even when trying to run a plan manually, the changes do not get applied.

$ ceph balancer eval
current cluster score 0.029978 (lower is better)
$ ceph balancer optimize test
$ ceph balancer show test
# starting osdmap epoch 304384
# starting crush version 2999
# mode upmap
ceph osd pg-upmap-items 462.17 1568 1569 2477 2482 2190 2199 1071 1085 32 37
ceph osd pg-upmap-items 462.50 2260 2268 2293 2279 2312 2313 1475 1465 1661 1664 1318 1313 1154 1174 1971 1983 1957 1951
ceph osd pg-upmap-items 462.60 1345 1348 1297 1289 1955 1951 2389 2397 2312 2311 102 101
ceph osd pg-upmap-items 462.b4 1422 1430 1794 1801 2072 2062 1759 1751 2312 2313 2215 2216 2367 2385 1367 1360
ceph osd pg-upmap-items 462.e4 1378 1376 2022 2026 2495 2482 1635 1627 2312 2307
ceph osd pg-upmap-items 462.fb 2252 2240 2312 2311 1328 1308 1584 1580 1524 1509
ceph osd pg-upmap-items 462.128 1825 1828 2175 2182 2135 2136 1528 1536 1560 1569 1373 1360
ceph osd pg-upmap-items 462.156 2135 2136 32 37 1064 1055 1743 1730 1713 1721 1991 1999
ceph osd pg-upmap-items 462.167 1657 1658 1581 1577 2145 2151 2135 2134
ceph osd pg-upmap-items 462.172 2176 2182 2135 2136 1202 1210
$ ceph balancer execute test

No change applied
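
Each `ceph osd pg-upmap-items` line in the plan above names a PG followed by OSD IDs in (from, to) pairs, meaning "map this PG from OSD `from` to OSD `to`". As a reading aid, here is a minimal sketch that splits one such line into its pairs. The helper name `parse_pg_upmap_items` is hypothetical and is not part of Ceph or of the balancer module.

```python
from typing import List, Tuple


def parse_pg_upmap_items(command: str) -> Tuple[str, List[Tuple[int, int]]]:
    """Split a `ceph osd pg-upmap-items` command into (pgid, [(from_osd, to_osd), ...])."""
    tokens = command.split()
    if tokens[:3] != ["ceph", "osd", "pg-upmap-items"] or len(tokens) < 6:
        raise ValueError(f"not a pg-upmap-items command: {command!r}")
    pgid = tokens[3]
    osds = [int(t) for t in tokens[4:]]
    if len(osds) % 2 != 0:
        raise ValueError("OSD IDs must come in (from, to) pairs")
    # Even positions are source OSDs, odd positions are their replacements.
    return pgid, list(zip(osds[0::2], osds[1::2]))


# Last line of the plan shown in this report.
pgid, pairs = parse_pg_upmap_items(
    "ceph osd pg-upmap-items 462.172 2176 2182 2135 2136 1202 1210"
)
print(pgid, pairs)
# prints "462.172 [(2176, 2182), (2135, 2136), (1202, 1210)]"
```

Parsed this way, the plan for PG 462.172 asks for three remaps, for example moving the shard on OSD 2176 to OSD 2182.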

Comment 1 tbrekke 2019-05-30 17:04:56 UTC
Created attachment 1575308
osd_map

osdmap added

Comment 14 Boris Ranto 2019-06-18 12:16:07 UTC
*** Bug 1721231 has been marked as a duplicate of this bug. ***

Comment 42 errata-xmlrpc 2019-08-21 15:11:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538

