Bug 1355723
Summary: Pools list dashboard page provides incorrect storage utilization/capacity data

Product: [Red Hat Storage] Red Hat Storage Console
Component: Ceph
Ceph sub component: configuration
Status: CLOSED EOL
Severity: high
Priority: unspecified
Version: 2
Target Milestone: ---
Target Release: 3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Reporter: Martin Bukatovic <mbukatov>
Assignee: anmol babu <anbabu>
QA Contact: Martin Kudlej <mkudlej>
Docs Contact:
CC: anbabu, mkudlej, nthomas, rghatvis
Fixed In Version: rhscon-ceph-0.0.38-1.el7scon.x86_64
Doc Type: Known Issue
Doc Text:

Pools list in Console displays incorrect storage utilization and capacity data

Pool utilization values are not calculated by Ceph appropriately if there are multiple CRUSH hierarchies. As a result of this:

* Pool utilization values on the dashboard, clusters view, and pool listing page are displayed incorrectly.
* No alerts will be sent if the actual pool utilization surpasses the configured thresholds.
* False alerts might be generated for pool utilization.

This issue occurs only when the user creates multiple storage profiles for a cluster, which in turn creates multiple CRUSH hierarchies. To avoid this problem, include all the OSDs in a single storage profile.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1360230
Bug Blocks: 1346350
Attachments:
Description
Martin Bukatovic
2016-07-12 10:52:02 UTC
Created attachment 1178853 [details]
screenshot 1: pools list before cluster sync
Created attachment 1178856 [details]
screenshot 2: pools list after cluster sync
Created attachment 1178859 [details]
screenshot 3: pool utilization event detail page
(In reply to Martin Bukatovic from comment #0)
> Description of problem
> ======================
>
> Pools page which provides a list of pools shows information about maximal
> available storage capacity for each pool. But this information is valid only
> when there is no problem with the cluster.
>
> E.g. when (some) OSDs are removed from the cluster map (e.g. in case of some
> problem), the pool item in the Pools list provides incorrect storage
> capacity information.
>
> Version-Release
> ===============
>
> On RHSC 2.0 server:
>
> rhscon-ceph-0.0.27-1.el7scon.x86_64
> rhscon-core-0.0.28-1.el7scon.x86_64
> rhscon-core-selinux-0.0.28-1.el7scon.noarch
> rhscon-ui-0.0.42-1.el7scon.noarch
> ceph-ansible-1.0.5-23.el7scon.noarch
> ceph-installer-1.0.12-3.el7scon.noarch
>
> On Ceph Storage nodes:
>
> rhscon-agent-0.0.13-1.el7scon.noarch
> ceph-osd-10.2.2-5.el7cp.x86_64
>
> How reproducible
> ================
>
> 100 %
>
> Steps to Reproduce
> ==================
>
> 1. Install RHSC 2.0 following the documentation.
>
> 2. Accept a few nodes for the ceph cluster.
>
> 3. Create a new ceph cluster named 'alpha'.
>
> 4. Create 2 RBDs (along with a new backing pool each time) in the cluster.
>
> 5. Check the CRUSH cluster map, make sure it's ok and then make a backup of it:
>
> ~~~
> # ceph --cluster alpha osd getcrushmap -o ceph-crushmap.ok.compiled
> # crushtool -d ceph-crushmap.ok.compiled -o ceph-crushmap.ok
> ~~~
>
> 7. Edit the CRUSH cluster map so that there are no OSDs in the cluster
> hierarchy which is used by the backing pools of the RBDs created in step 4:
>
> ~~~
> # cp ceph-crushmap.ok ceph-crushmap.err01
> # sed -i '/.*item\ osd\.[0-9]\+\ weight\ [0-9\.]\+$/d' ceph-crushmap.err01
> $ sed -i 's/weight\ [0-9\.]\+$/weight 0.000/' ceph-crushmap.err01
> ~~~
>
> So that for example:
>
> ~~~
> # diff ceph-crushmap.ok ceph-crushmap.err01
> 65c65
> < # weight 0.010
> ---
> > # weight 0.000
> 68d67
> < item osd.1 weight 0.010
> 72c71
> < # weight 0.010
> ---
> > # weight 0.000
> 75d73
> < item osd.2 weight 0.010
> 79c77
> < # weight 0.010
> ---
> > # weight 0.000
> 82d79
> < item osd.0 weight 0.010
> 86c83
> < # weight 0.010
> ---
> > # weight 0.000
> 89d85
> < item osd.3 weight 0.010
> 93c89
> < # weight 0.040
> ---
> > # weight 0.000
> 96,99c92,95
> < item mbukatov-usm1-node2.os1.phx2.redhat.com-general weight 0.010
> < item mbukatov-usm1-node3.os1.phx2.redhat.com-general weight 0.010
> < item mbukatov-usm1-node1.os1.phx2.redhat.com-general weight 0.010
> < item mbukatov-usm1-node4.os1.phx2.redhat.com-general weight 0.010
> ---
> > item mbukatov-usm1-node2.os1.phx2.redhat.com-general weight 0.000
> > item mbukatov-usm1-node3.os1.phx2.redhat.com-general weight 0.000
> > item mbukatov-usm1-node1.os1.phx2.redhat.com-general weight 0.000
> > item mbukatov-usm1-node4.os1.phx2.redhat.com-general weight 0.000
> ~~~
>
> 8. Compile the new broken CRUSH map and push it into the cluster:
>
> ~~~
> # crushtool -c ceph-crushmap.err01 -o ceph-crushmap.err01.compiled
> # ceph --cluster alpha osd setcrushmap -i ceph-crushmap.err01.compiled
> ~~~
>
> At this point, `ceph --cluster alpha status` should report something like:
>
> * health HEALTH_WARN
> * recovery ... objects misplaced (100.000%)
>
> 9. Check the output of the `ceph df` command. Since you haven't loaded any
> data into the RBDs, ceph should report that both pools are empty (0 for
> %USED and 0 for MAX AVAIL).
>
> 10. Make sure the sync of the cluster state happened before you go on.
>
> You can either wait for the RHSC 2.0 UI to sync the new cluster state
> automatically. But it means waiting for at least 24 hours (which is
> the current default value of clustersSyncInterval).
>
> Or you can force the sync by restarting the skyring service:
>
> ~~~
> systemctl restart skyring
> ~~~
>
> 11. Check the Pools list page.
>
> Actual results
> ==============
>
> While `ceph df` provides this information:
>
> ~~~
> # ceph --cluster alpha df
> GLOBAL:
>     SIZE       AVAIL      RAW USED     %RAW USED
>     40915M     40760M         155M          0.38
> POOLS:
>     NAME         ID     USED     %USED     MAX AVAIL     OBJECTS
>     rbd_pool     1      7486         0             0          16
>     def_pool     3       114         0             0           4
> ~~~

Here:

* USED --> used space of the pool
* MAX AVAIL --> maximum size available to the pool, i.e. TOTAL - USED

So the total is not provided by the ceph CLI, and USM calculates TOTAL as
USED + MAX AVAIL. Also note that USM executes a JSON-formatted API call to get
the pool stats (`ceph df -f json`), and this output does not provide the
percentage value, so we need to calculate it ourselves, which is done as
USED / TOTAL * 100, where TOTAL is USED + MAX AVAIL.
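As an illustration of the calculation described above, here is a minimal sketch of the stated logic (not the actual rhscon-ceph/skyring code), showing why a pool whose CRUSH hierarchy has lost all its OSDs comes out as 100% used:

~~~
# Minimal sketch of the calculation described above -- an illustration of the
# stated logic, not the actual rhscon-ceph/skyring code. `ceph df -f json`
# reports per-pool used bytes and max available bytes but no percentage, so
# the console derives both the total and the percentage itself.

def pool_utilization(used, max_avail):
    """TOTAL = USED + MAX AVAIL, %USED = USED / TOTAL * 100."""
    total = used + max_avail
    if total == 0:
        return 0, 0.0
    return total, used / total * 100.0

# Healthy pool: 7486 bytes used, ~10 GiB available -> effectively 0% used.
print(pool_utilization(7486, 10 * 1024**3))

# Pool whose CRUSH hierarchy has no OSDs left: ceph reports MAX AVAIL = 0,
# the derived total collapses to USED, and the UI shows
# "7.3 KB of 7.3 KB used, 100.0%".
print(pool_utilization(7486, 0))
~~~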
> The *Pools* page showed different storage capacity data for both pools:
>
> * rbd_pool: 100.0%, 7.3 KB of 7.3 KB used
> * def_pool: 100.0%, 114.0 B of 114.0 B used
>
> (see screenshot #2)
>
> When we ignore the wrong units there for now (there is another BZ for that,
> see https://bugzilla.redhat.com/show_bug.cgi?id=1340747#c4), the data
> presented there doesn't match the data reported by `ceph df`:
>
> * '%USED' should not be reported as 100% by the UI (ceph df still shows 0%,
>   moreover both pools are essentially empty).
> * MAX AVAIL is not shown in the pool list by the UI, and the
>   `7.3 KB of 7.3 KB used` statement is misleading in this case.
>
> The related event states:
>
> > Pool utilization for pool rbd_pool on alpha cluster has moved to CRITICAL
>
> which, while technically true, doesn't cover the actual issue in any way,
> and the admin is again forced to use the ceph command line tools to debug
> the problem.
>
> Expected results
> ================
>
> The storage capacity information should not conflict with the information
> provided by the `ceph df` command.
>
> The statement "7.3 KB of 7.3 KB used" would be better replaced by presenting
> the 'USED' and 'MAX AVAIL' values instead.
>
> Related events should better describe the problem. At least the information
> about the zero value of 'MAX AVAIL' should be conveyed.

I have explained the logic above and it is clear that USM is doing the
calculation as expected, so I don't treat this as an issue. We can discuss it
in the bug scrub meeting and take a call.

> Additional info
> ===============
>
> Besides the problem with interpreting the storage capacity data, the sync
> interval itself makes things harder to understand as well:
>
> * See screenshot #1, which I created after I did all steps from the "Steps
>   to Reproduce" section but without restarting skyring, so that the cluster
>   state is still not synced.
> * And compare that with screenshot #2, which shows the Pools list page after
>   the skyring restart (which triggered the sync).

We now pick the utilization from the normal CLI command `ceph df` instead of
its JSON form. So the percentage used and the used size are now the same as
what the CLI returns, and the total is calculated as follows (as per sjust's
suggestions):

    POOL TOTAL SIZE = MAX AVAIL / (.01 * (100 - %USED))

where MAX AVAIL and %USED are taken directly from the `ceph df` CLI output.
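To make the revised derivation concrete, here is a small sketch (illustrative only, not the shipped rhscon-ceph code) of the formula above applied to values taken from the plain `ceph df` output:

~~~
# Illustrative sketch of the revised derivation above -- not the shipped
# rhscon-ceph code. MAX AVAIL and %USED are read from the plain `ceph df`
# output and the pool total is recovered from them.

def pool_total(max_avail, percent_used):
    """POOL TOTAL SIZE = MAX AVAIL / (.01 * (100 - %USED))"""
    remaining = 0.01 * (100.0 - percent_used)
    if remaining <= 0:
        # Degenerate case (%USED == 100): the formula cannot recover a total.
        return float("nan")
    return max_avail / remaining

# Example: MAX AVAIL = 3 GiB and %USED = 70 -> total = 3 GiB / 0.30 = 10 GiB,
# consistent with 7 GiB used out of a 10 GiB pool.
print(pool_total(3 * 1024**3, 70.0) / 1024**3)   # 10.0
~~~

Note that this only recovers a meaningful total if %USED is relative to the pool's own capacity; the verification comments below show where that assumption breaks.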
As you can see on the screenshots, the utilization charts are not correct.

My configuration:

* 1 cluster
* 2 storage profiles: default - 2 OSDs, user defined - 2 OSDs
* 2 pools - double replication each; one on each storage profile
  (pool1 on default, pool2 on user_defined)
* each OSD has 10 GB

Please check the screenshots.

Tested with:

ceph-ansible-1.0.5-31.el7scon.noarch
ceph-installer-1.0.14-1.el7scon.noarch
rhscon-ceph-0.0.39-1.el7scon.x86_64
rhscon-core-0.0.39-1.el7scon.x86_64
rhscon-core-selinux-0.0.39-1.el7scon.noarch
rhscon-ui-0.0.51-1.el7scon.noarch

Created attachment 1186442 [details]
build 39 list of pools
As you can see, used is bigger than the total size and the % utilization is not correct.
There are 7 objects of 1 GB each in the double-replicated pool.
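For orientation, a rough back-of-the-envelope estimate of what this setup should report (ideal numbers only, ignoring filesystem overhead and full ratios; the real `ceph df` values will differ somewhat):

~~~
# Rough estimate for the configuration above: the pool's storage profile has
# 2 OSDs of 10 GB, and the pool holds 7 objects of 1 GB at 2x replication.
# Ideal numbers only -- ignores filesystem overhead and full ratios.

raw_capacity = 2 * 10              # 20 GB raw in the pool's hierarchy
raw_used = 7 * 1 * 2               # 14 GB raw consumed by the replicas
usable_total = raw_capacity / 2    # ~10 GB usable for the pool at 2x
used = 7 * 1                       # 7 GB of user data

print(raw_used, "GB raw of", raw_capacity, "GB raw")        # 14 of 20
print(used, "GB used of ~", usable_total, "GB,",
      round(100.0 * used / usable_total), "% utilization")  # 7 of ~10, 70 %
~~~

So the pool should show roughly 7 GB used out of about 10 GB (around 70%); any listing where used exceeds the total indicates the derived total is wrong.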
Created attachment 1186443 [details]
build 39 - main dashboard
Created attachment 1186444 [details]
build 39 - cluster dashboard
Created attachment 1186445 [details]
build 39 - cluster list
%USED doesn't show how much of the space for the pool is used, but how much of
the space of the cluster is used in the pool. With that said, if I have
several hierarchies, the number is not correct. (At least not correct for
calculating the total size of the pool from it.)

Looks good to me.

This product is EOL now.
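To illustrate the point above with hypothetical numbers (two storage profiles of 2 x 10 GB OSDs, so 40 GB raw cluster-wide, with the pool confined to one 20 GB hierarchy): if %USED is effectively relative to the whole cluster rather than the pool's own hierarchy, the total recovered by `MAX AVAIL / (.01 * (100 - %USED))` comes out below the used size:

~~~
# Hypothetical numbers matching the reported setup: 40 GB raw cluster-wide,
# pool confined to a 20 GB hierarchy, 7 GB of user data at 2x replication,
# so roughly MAX AVAIL = (20 - 14) / 2 = 3 GB.

max_avail = 3.0
used = 7.0

# %USED relative to the pool's own hierarchy -- what the formula assumes:
pct_pool = 100.0 * used / (used + max_avail)       # 70.0
print(max_avail / (0.01 * (100 - pct_pool)))       # 10.0 GB total, consistent

# %USED effectively relative to the whole 40 GB cluster (multiple
# hierarchies; the exact numerator depends on the Ceph version, but either
# way the percentage is far too small):
pct_cluster = 100.0 * used / 40.0                  # 17.5
print(max_avail / (0.01 * (100 - pct_cluster)))    # ~3.6 GB "total", i.e.
                                                   # less than the 7 GB used
~~~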