Bug 2168814 - rook-ceph-operator is unable to talk with radosgw
Summary: rook-ceph-operator is unable to talk with radosgw
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph
Version: 4.10
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Matt Benjamin (redhat)
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-02-10 05:51 UTC by Hector Vido
Modified: 2023-08-09 16:37 UTC
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-03-16 16:26:35 UTC
Embargoed:



Description Hector Vido 2023-02-10 05:51:49 UTC
Description of problem (please be as detailed as possible and provide log snippets):

A bare-metal OpenShift cluster using external Ceph shut off after high temperatures were detected.
Only 4 of the 29 machines stayed on.

Version of all relevant components (if applicable):

OpenShift: 4.10.45
ODF: 4.10.9
Ceph: 16.2.8-85

Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?

Yes, the customer can't write any data into buckets, and buckets are the main storage for this data lake.

Is there any workaround available to the best of your knowledge?

Unfortunately, no.

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

1 - very simple (machines just shut off)

Is this issue reproducible?

We don't know; it happened after the outage.

We just turned the machines back on.

Can this issue be reproduced from the UI?

Yes, we see a 500 error in the Ceph dashboard when we try to edit some buckets.


Actual results:

Buckets can't be created from OpenShift, and the operator logs keep complaining about "failed to fetch user" and "nosuchbucket".
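
To separate an RGW-side failure from an operator-side one, a minimal
boto3 probe along these lines can be run against the external gateway
(the endpoint URL, credentials, and bucket name below are placeholders):

    import boto3
    from botocore.config import Config

    # Placeholders: point at the external RGW endpoint with the keys of
    # the object store user the operator says it "failed to fetch".
    s3 = boto3.client(
        "s3",
        endpoint_url="http://rgw.example.com:8080",
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
        config=Config(signature_version="s3v4"),
    )

    # If this also fails (500 or NoSuchBucket), the fault is in radosgw
    # itself rather than in the operator's reconcile loop.
    s3.create_bucket(Bucket="probe-bucket")
    print([b["Name"] for b in s3.list_buckets()["Buckets"]])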

Expected results:

Buckets can be created from OpenShift.

Additional info:

Comment 25 Hector Vido 2023-02-10 21:43:29 UTC
Sorry, I said it was not a bug, but it could be.

We didn't change anything on our side, and this Ceph cluster had been working together with OpenShift for more or less 4 months.
Maybe something happened that we missed.

Best,

Comment 26 Travis Nielsen 2023-02-28 15:18:19 UTC
Not sure there's anything to be done here, but moving to the rgw component to confirm whether bucket metadata could somehow have been updated to cause this.
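
(For reference, a rough way to check whether a bucket's metadata record
is still intact; this sketch assumes radosgw-admin is available, e.g.
from the Ceph toolbox, and the bucket name is a placeholder:)

    import json
    import subprocess

    def bucket_metadata(name):
        # Dump RGW's metadata record for the bucket; a missing or stale
        # bucket entry here would be consistent with clients getting
        # "nosuchbucket" while the bucket still shows up in listings.
        out = subprocess.check_output(
            ["radosgw-admin", "metadata", "get", "bucket:" + name])
        return json.loads(out)

    print(json.dumps(bucket_metadata("probe-bucket"), indent=2))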

