Bug 2181535

Summary: [GSS] Object storage in degraded state
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation Reporter: Manjunatha <mmanjuna>
Component: Multi-Cloud Object Gateway    Assignee: Utkarsh Srivastava <usrivast>
Status: CLOSED ERRATA QA Contact: Tiffany Nguyen <tunguyen>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.11    CC: bhull, bkunal, bskopova, dzaken, jalbo, jquinn, kbg, kelwhite, kramdoss, mduasope, nbecker, ocs-bugs, odf-bz-bot, pollenbu, tdesala, tunguyen, usrivast
Target Milestone: ---   
Target Release: ODF 4.13.0   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 4.13.0-172 Doc Type: Bug Fix
Doc Text:
Previously, non-optimized database-related flows on deletions caused Multicloud Object Gateway CPU usage to spike and performance to degrade in mass-delete scenarios, for example when reclaiming a deleted object bucket claim (OBC). With this fix, the indexes used by the bucket reclaimer process are optimized, a new index is added to the database to speed up the database cleaner flows, and the bucket reclaimer is changed to work on batches of objects.
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-06-21 15:25:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2154341, 2186482    

Description Manjunatha 2023-03-24 13:00:36 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
Object storage is in a degraded state and the customer is unable to access the buckets using the "s3" command.
When I checked, noobaa-default-bucket-class is in the Rejected state. It uses the backingstore noobaa-pv-backing-store, which is in the "ALL_NODES_OFFLINE" state. This backingstore is created on an RBD PV, and that PV looks healthy.
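
For reference, the state above can be confirmed with checks along these lines (a sketch, assuming the default openshift-storage namespace; resource names are taken from the state described above):

# Check the phase of the bucket class and backingstores
$ oc get bucketclass,backingstore -n openshift-storage

# Inspect why the backingstore reports ALL_NODES_OFFLINE
$ oc describe backingstore noobaa-pv-backing-store -n openshift-storage

# Confirm the RBD-backed PVCs behind the backingstore are Bound
$ oc get pvc -n openshift-storage | grep noobaa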

History of the issue: the cluster became full (above 80%), so we deleted unwanted PVs to free up space. After this, the issue with object storage started.

Version of all relevant components (if applicable):
odf-operator.v4.11.5

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes, we are unable to access object storage.

Is there any workaround available to the best of your knowledge?
No

Is this issue reproducible?
Not sure

Additional info:
Latest ODF must-gather is in supportshell
path: /cases/03468361/0050-must-gather-odf-24032023.tar.gz.gz
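
(For completeness, a data set like this is typically collected with the ODF must-gather image matching the installed version, roughly along these lines for 4.11; the exact image tag here is an assumption:)

$ oc adm must-gather --image=registry.redhat.io/odf4/ocs-must-gather-rhel8:v4.11 --dest-dir=odf-must-gather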

Comment 13 pollenbu 2023-03-28 08:29:26 UTC
Is there any update to this case?

Comment 39 Tiffany Nguyen 2023-05-12 17:27:29 UTC
Verified with ODF 4.13 build 4.13.0-186: increased the noobaa pod resources, then uploaded and listed 1M objects without any issues.
Deleting the OBC and backingstore also completed successfully.
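
For illustration, the upload/list check follows the usual OBC access pattern, roughly as sketched below (the OBC name test-obc, its namespace test-ns, and the single test object are placeholders; the actual run used 1M objects):

# S3 endpoint exposed by MCG
$ S3_ENDPOINT="https://$(oc get route s3 -n openshift-storage -o jsonpath='{.spec.host}')"

# Credentials and bucket name provisioned for the OBC
$ export AWS_ACCESS_KEY_ID=$(oc get secret test-obc -n test-ns -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
$ export AWS_SECRET_ACCESS_KEY=$(oc get secret test-obc -n test-ns -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
$ BUCKET=$(oc get cm test-obc -n test-ns -o jsonpath='{.data.BUCKET_NAME}')

# Upload and list through the MCG endpoint
$ aws s3 --endpoint-url "$S3_ENDPOINT" --no-verify-ssl cp ./testfile "s3://$BUCKET/testfile"
$ aws s3 --endpoint-url "$S3_ENDPOINT" --no-verify-ssl ls "s3://$BUCKET/"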

$ oc get storagecluster -n openshift-storage ocs-storagecluster -oyaml | yq '.spec.resources'

mgr:
  limits:
    cpu: "3"
    memory: 3Gi
  requests:
    cpu: "3"
    memory: 3Gi
noobaa-core:
  limits:
    cpu: "3"
    memory: 4Gi
  requests:
    cpu: "3"
    memory: 4Gi
noobaa-db:
  limits:
    cpu: "3"
    memory: 4Gi
  requests:
    cpu: "3"
    memory: 4Gi
noobaa-endpoint:
  limits:
    cpu: "3"
    memory: 4Gi
  requests:
    cpu: "3"
    memory: 4Gi
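
For reference, a sketch of how resource settings like the above can be applied (shown for noobaa-db only; equivalent entries apply to noobaa-core and noobaa-endpoint):

$ oc patch storagecluster ocs-storagecluster -n openshift-storage --type merge -p '
  {"spec": {"resources": {
    "noobaa-db": {
      "limits":   {"cpu": "3", "memory": "4Gi"},
      "requests": {"cpu": "3", "memory": "4Gi"}
    }
  }}}'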

Comment 42 errata-xmlrpc 2023-06-21 15:25:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742