Bug 1899587

Summary: [External] RGW usage metrics shown on Object Service Dashboard is incorrect
Product: OpenShift Container Platform
Reporter: Rachael <rgeorge>
Component: Console Storage Plugin
Assignee: Bipul Adhikari <badhikar>
Status: CLOSED ERRATA
QA Contact: Rachael <rgeorge>
Severity: high
Docs Contact:
Priority: high
Version: 4.6
CC: aos-bugs, asachan, assingh, nthomas, pcuzner, uchapaga
Target Milestone: ---
Keywords: PrioBumpQA
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 22:34:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1929048
Bug Blocks: 1964400

Description Rachael 2020-11-19 15:50:41 UTC
Description of problem:
On an OCS external mode cluster where NooBaa is configured to use pv-pool for the backingstore, the RGW usage metrics shown in the Capacity Breakdown card on the Object Service dashboard are incorrect. The used space for RGW shown in the dashboard is less than the used space for that pool on the RHCS cluster.

When the IO on the NooBaa OBCs was increased while the IO on the RGW OBCs was kept almost constant, the used space reported for RGW kept decreasing. This appears to be an error in how these values are calculated.

# ceph df |grep rgw
    .rgw.root                      3     2.1 KiB           5     960 KiB         0       1.5 TiB 
    default.rgw.control            4         0 B           8         0 B         0       1.5 TiB 
    default.rgw.meta               5      15 KiB          63     9.4 MiB         0       1.5 TiB 
    default.rgw.log                6     3.4 KiB         206       6 MiB         0       1.5 TiB 
    default.rgw.buckets.index      8       976 B         165       976 B         0       1.5 TiB 
    default.rgw.buckets.data       9     4.1 GiB       4.16k      12 GiB      0.26       1.5 TiB 
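
To make the cluster-side numbers easier to compare with the dashboard, the pool lines above can be converted to bytes. The following is a minimal sketch, not part of the original report. It assumes the columns are pool name, ID, STORED, OBJECTS, USED, %USED and MAX AVAIL, in that order, which matches the pasted output but is not stated in it. The helper names (to_bytes, parse_pool_line, rgw_totals) are hypothetical.

```python
# Sketch only: parses the `ceph df | grep rgw` lines pasted in this report.
# ASSUMPTION: columns are POOL, ID, STORED, OBJECTS, USED, %USED, MAX AVAIL.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

CEPH_DF_RGW = """\
.rgw.root                      3     2.1 KiB           5     960 KiB         0       1.5 TiB
default.rgw.control            4         0 B           8         0 B         0       1.5 TiB
default.rgw.meta               5      15 KiB          63     9.4 MiB         0       1.5 TiB
default.rgw.log                6     3.4 KiB         206       6 MiB         0       1.5 TiB
default.rgw.buckets.index      8       976 B         165       976 B         0       1.5 TiB
default.rgw.buckets.data       9     4.1 GiB       4.16k      12 GiB      0.26       1.5 TiB
"""


def to_bytes(value, unit):
    """Convert a humanized size such as ('4.1', 'GiB') to a byte count."""
    return float(value) * UNITS[unit]


def parse_pool_line(line):
    """Return the pool name with its STORED and USED sizes in bytes."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"unexpected column count: {line!r}")
    return {
        "pool": fields[0],
        "stored": to_bytes(fields[2], fields[3]),
        "used": to_bytes(fields[5], fields[6]),
    }


def rgw_totals(text):
    """Sum STORED and USED across every non-empty pool line in the text."""
    pools = [parse_pool_line(l) for l in text.splitlines() if l.strip()]
    return sum(p["stored"] for p in pools), sum(p["used"] for p in pools)


if __name__ == "__main__":
    stored, used = rgw_totals(CEPH_DF_RGW)
    print(f"RGW pools: stored={stored / 2**30:.2f} GiB, used={used / 2**30:.2f} GiB")
```

For default.rgw.buckets.data the two columns differ by roughly a factor of three (4.1 GiB against 12 GiB), which would be consistent with 3x replication, although the report does not state the replication factor. Which of the two columns the dashboard is expected to match is likewise not stated here.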


$ oc describe backingstore noobaa-default-backing-store
Name:         noobaa-default-backing-store
Namespace:    openshift-storage
Labels:       app=noobaa
Annotations:  <none>
API Version:  noobaa.io/v1alpha1
Kind:         BackingStore
Metadata:
  Creation Timestamp:  2020-11-19T09:23:24Z
  Finalizers:
    noobaa.io/finalizer
  Generation:  1
  Managed Fields:
    API Version:  noobaa.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
        f:labels:
          .:
          f:app:
        f:ownerReferences:
      f:spec:
        .:
        f:pvPool:
          .:
          f:numVolumes:
          f:resources:
            .:
            f:requests:
              .:
              f:storage:
          f:secret:
        f:type:
      f:status:
        .:
        f:conditions:
        f:mode:
          .:
          f:modeCode:
          f:timeStamp:
        f:phase:
    Manager:    noobaa-operator
    Operation:  Update
    Time:       2020-11-19T12:51:10Z
  Owner References:
    API Version:           noobaa.io/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  NooBaa
    Name:                  noobaa
    UID:                   e832c40c-4c61-45a2-800d-e7d93261c76c
  Resource Version:        161472
  Self Link:               /apis/noobaa.io/v1alpha1/namespaces/openshift-storage/backingstores/noobaa-default-backing-store
  UID:                     3bdcfbad-74e3-4844-8640-ae5d59543685
Spec:
  Pv Pool:
    Num Volumes:  1
    Resources:
      Requests:
        Storage:  50Gi
    Secret:
  Type:  pv-pool


Screenshots of the metrics and the Object Service dashboards can be found here: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/rgw_usage/


Version-Release number of selected component (if applicable):
OCP: 4.6.0-0.nightly-2020-11-18-154058
OCS: ocs-operator.v4.6.0-160.ci

How reproducible: 2/2


Steps to Reproduce:
1. Deploy an external mode OCS cluster without RGW; this configures NooBaa to use pv-pool for its backingstore.
2. After the deployment, update the rook-ceph-external-cluster-details secret with the RGW details so that OCS can use the RGW on the RHCS cluster (https://bugzilla.redhat.com/show_bug.cgi?id=1865825).
3. Create NooBaa and RGW OBCs and run IO on them.
4. Compare the RGW used space shown on the Object Service dashboard with the usage on the RHCS cluster.

Actual results:
The usage shown on the dashboard is less than the usage on the RHCS cluster for that RGW pool.

Expected results:
The used space shown on the dashboard should equal the used space on the RHCS cluster.
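
Because both the dashboard and the ceph CLI display rounded, humanized values, a check for "equal" is probably best done with a tolerance. The sketch below illustrates one way to do that. The function name usage_matches and the 5% default tolerance are assumptions for illustration, not values taken from this report.

```python
def usage_matches(dashboard_bytes, cluster_bytes, rel_tol=0.05):
    """Return True when the dashboard reading is within rel_tol of the cluster reading.

    rel_tol is a hypothetical allowance for rounding in the displayed values.
    """
    if cluster_bytes == 0:
        return dashboard_bytes == 0
    return abs(dashboard_bytes - cluster_bytes) / cluster_bytes <= rel_tol


if __name__ == "__main__":
    cluster = 4.1 * 2**30  # STORED for default.rgw.buckets.data from this report
    print(usage_matches(4.0 * 2**30, cluster))  # hypothetical dashboard reading
    print(usage_matches(2.0 * 2**30, cluster))  # hypothetical dashboard reading
```

The dashboard readings in the usage lines are made-up inputs; only the 4.1 GiB cluster figure comes from the ceph df output in the description.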

Comment 7 Nishanth Thomas 2021-02-04 09:24:54 UTC
This BZ is high priority, but I don't think it should block the OCP release. Try to fix it before the final freeze of 4.7; otherwise we will push it into a batch update. Un-setting the blocker flag.

Comment 12 errata-xmlrpc 2021-07-27 22:34:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438