Bug 2133547 - [GSS] External OCS - unable to create user rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user
Summary: [GSS] External OCS - unable to create user rook-ceph-object-user-ocs-external...
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.8
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Nimrod Becker
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-10-10 20:02 UTC by kelwhite
Modified: 2023-08-09 16:49 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-01-19 07:24:26 UTC
Embargoed:
brgardne: needinfo-
paarora: needinfo-



Comment 11 Nimrod Becker 2022-11-15 08:05:09 UTC
Per last comment, moving to Rook

Comment 48 Parth Arora 2022-12-14 13:03:11 UTC
After adding the secret, I see that the `CephObjectStore` is created, but the `CephObjectStoreUser` still fails to reconcile.

In the rook logs, I see : `ObjectStore resource not ready in namespace "openshift-storage", retrying in "10s". failed to detect if object store "" is initialized: CephObjectStore "" could not be found`

I think it is because the `CephObjectStoreUser` is missing the store name in its spec:
`spec:
  displayName: my display name`

Please add it like this:
oc edit cephobjectStoreUser
`spec:
  store: ocs-external-storagecluster-cephobjectstore
  displayName: my display name`
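
To illustrate the check described above, here is a minimal Python sketch (not Rook code) that flags a CephObjectStoreUser spec with no `store`. The function name `has_store` and the plain-dict representation of the spec are assumptions made for illustration.

```python
def has_store(spec: dict) -> bool:
    """Return True if the spec names a non-empty object store."""
    store = spec.get("store")
    return isinstance(store, str) and store.strip() != ""


# The spec as quoted in this comment: no `store` field.
broken = {"displayName": "my display name"}

# The spec after the suggested edit.
fixed = {
    "store": "ocs-external-storagecluster-cephobjectstore",
    "displayName": "my display name",
}

print(has_store(broken))
print(has_store(fixed))
```

A missing or empty `store` would be consistent with the empty object store name (`""`) in the Rook log message quoted above.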

Adding @jiffin for keeping me honest

Comment 51 Parth Arora 2022-12-14 15:26:27 UTC
Making comment #48 public, since it might help with further troubleshooting.

Comment 57 Blaine Gardner 2022-12-15 20:36:41 UTC
The issue appears to be with the configuration of the noobaa-default-backing-store. It is using the old CephObjectStoreUser secret name (rook-ceph-object-user--noobaa-ceph-objectstore-user) from when the `store` wasn't specified. It should be using the new value: rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user. 
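
The two secret names differ only in the store segment. A minimal Python sketch, assuming the name follows the pattern `rook-ceph-object-user-<store>-<user>` (inferred from the two names quoted in this comment, not taken from Rook's source), shows how an empty `store` yields the double hyphen. The helper name `object_user_secret_name` is hypothetical.

```python
def object_user_secret_name(store: str, user: str) -> str:
    # Assumed pattern, inferred from the secret names quoted in this bug.
    return f"rook-ceph-object-user-{store}-{user}"


user = "noobaa-ceph-objectstore-user"

# Name produced while `store` was not specified (empty string).
old_name = object_user_secret_name("", user)

# Name produced once `store` is set on the CephObjectStoreUser.
new_name = object_user_secret_name(
    "ocs-external-storagecluster-cephobjectstore", user
)

print(old_name)
print(new_name)
```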

I see several log items like this in the noobaa operator logs:
  2022-12-14T14:43:25.036263434Z time="2022-12-14T14:43:25Z" level=info msg="✅ Exists: BackingStore \"noobaa-default-backing-store\"\n"
  2022-12-14T14:43:25.036263434Z time="2022-12-14T14:43:25Z" level=info msg="Backing store noobaa-default-backing-store already exists. skipping ReconcileCloudCredentials" func=ReconcileDefaultBackingStore sys=openshift-storage/noobaa

As a workaround for the customer, try deleting the noobaa-default-backing-store. The Noobaa operator should create it again once it realizes the backing store no longer exists. If it doesn't, delete the noobaa operator pod, and it should be fixed soon.

This does expose what I would consider a bug in noobaa that it still creates the default backing store even if the CephObjectStoreUser isn't in Ready state. I will move this BZ to the noobaa team. Since a workaround exists (delete the default backing store), this doesn't seem like a blocker to me.

Comment 59 Blaine Gardner 2022-12-15 22:43:38 UTC
Noobaa is having trouble with the default bucket class, but I can't quite tell why. It is at least looking at the correct secret now, so the issue I was seeing earlier is fixed. Below you can see that the secret exists, but noobaa-default-backing-store-noobaa-noobaa isn't found.

time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: Secret \"rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user\"\n"
time="2022-12-15T21:17:26Z" level=info msg="❌ Not Found:  \"noobaa-default-backing-store-noobaa-noobaa\"\n"


Collecting another ODF/OCS must-gather will help us figure out if the issue is still with Noobaa or if it's a further issue with Rook. I also copied a relevant chunk of Noobaa logs here if it'll help the Noobaa team debug more quickly.



time="2022-12-15T21:17:24Z" level=info msg="Start BucketClass Reconcile..." bucketclass=openshift-storage/noobaa-default-bucket-class
time="2022-12-15T21:17:24Z" level=info msg="✅ Exists: NooBaa \"noobaa\"\n"
time="2022-12-15T21:17:24Z" level=info msg="✅ Exists: BucketClass \"noobaa-default-bucket-class\"\n"
time="2022-12-15T21:17:24Z" level=info msg="SetPhase: Verifying" bucketclass=openshift-storage/noobaa-default-bucket-class
time="2022-12-15T21:17:24Z" level=info msg="✅ Exists: BackingStore \"noobaa-default-backing-store\"\n"
time="2022-12-15T21:17:24Z" level=info msg="SetPhase: temporary error during phase \"Verifying\"" bucketclass=openshift-storage/noobaa-default-bucket-class
time="2022-12-15T21:17:24Z" level=warning msg="⏳ Temporary Error: NooBaa BackingStore \"noobaa-default-backing-store\" is not yet ready" bucketclass=openshift-storage/noobaa-default-bucket-class
time="2022-12-15T21:17:24Z" level=info msg="UpdateStatus: Done" bucketclass=openshift-storage/noobaa-default-bucket-class
time="2022-12-15T21:17:26Z" level=info msg="Start BackingStore Reconcile ..." backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: NooBaa \"noobaa\"\n"
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: BackingStore \"noobaa-default-backing-store\"\n"
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: Secret \"rook-ceph-object-user-ocs-external-storagecluster-cephobjectstore-noobaa-ceph-objectstore-user\"\n"
time="2022-12-15T21:17:26Z" level=info msg="❌ Not Found:  \"noobaa-default-backing-store-noobaa-noobaa\"\n"
time="2022-12-15T21:17:26Z" level=info msg="SetPhase: Verifying" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="SetPhase: Connecting" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: NooBaa \"noobaa\"\n"
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: Service \"noobaa-mgmt\"\n"
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: Secret \"noobaa-operator\"\n"
time="2022-12-15T21:17:26Z" level=info msg="✅ Exists: Secret \"noobaa-admin\"\n"
time="2022-12-15T21:17:26Z" level=info msg="✈️  RPC: system.read_system() Request: <nil>"
time="2022-12-15T21:17:26Z" level=info msg="✅ RPC: system.read_system() Response OK: took 24.9ms"
time="2022-12-15T21:17:26Z" level=warning msg="using existing pool but connection mismatch &{Name:noobaa-default-backing-store EndpointType:S3_COMPATIBLE Endpoint:http://rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc:8080 Identity:NCMG52UUPB7ZM4TN66M6 Secret:Gg3Dt11M2b0rw124WIKsfmrSyhB6x1LqiXD8gx1X AuthMethod:AWS_V4} pool &{Name:noobaa-default-backing-store ResourceType:CLOUD Mode:OPTIMAL Region: PoolNodeType:BLOCK_STORE_S3 Undeletable:IN_USE CloudInfo:0xc000e3a380 MongoInfo:<nil> HostInfo:<nil> Hosts:<nil>} &{EndpointType:S3_COMPATIBLE Endpoint:http://10.20.55.72:8080 TargetBucket:nb.1642022501736.apps.ocp4.kohlerco.com Identity: NodeName: CreatedBy:operator Host: AuthMethod:AWS_V4}" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="SetPhase: Creating" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="✈️  RPC: account.check_external_connection() Request: {Name:noobaa-default-backing-store EndpointType:S3_COMPATIBLE Endpoint:http://rook-ceph-rgw-ocs-external-storagecluster-cephobjectstore.openshift-storage.svc:8080 Identity:NCMG52UUPB7ZM4TN66M6 Secret:Gg3Dt11M2b0rw124WIKsfmrSyhB6x1LqiXD8gx1X AuthMethod:AWS_V4}"
time="2022-12-15T21:17:26Z" level=error msg="⚠️  RPC: account.check_external_connection() Response Error: Code=CONNECTION_ALREADY_EXIST Message=Connection name already exists: noobaa-default-backing-store"
time="2022-12-15T21:17:26Z" level=info msg="SetPhase: temporary error during phase \"Creating\"" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=warning msg="⏳ Temporary Error: Connection name already exists: noobaa-default-backing-store" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="UpdateStatus: Done" backingstore=openshift-storage/noobaa-default-backing-store
time="2022-12-15T21:17:26Z" level=info msg="RPC: Ping (0xc00180a1e0) &{RPC:0xc00009fc20 Address:wss://noobaa-mgmt.openshift-storage.svc.cluster.local:443/rpc/ State:connected WS:0xc000368e00 PendingRequests:map[] NextRequestID:499 Lock:{state:0 sema:0} ReconnectDelay:0s cancelPings:0x63b440}"
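
For triage, a log chunk like the one above can be reduced to its warning and error lines. The following is a minimal Python sketch, assuming each line carries `level=<word>` and a double-quoted `msg="..."` field as in the excerpt; the helper name `problems` is hypothetical and this is not part of any Noobaa tooling.

```python
import re

# Match level=<word> and msg="<text, allowing escaped quotes>".
LEVEL_RE = re.compile(r'\blevel=(\w+)')
MSG_RE = re.compile(r'\bmsg="((?:[^"\\]|\\.)*)"')


def problems(lines):
    """Yield (level, msg) for lines logged at warning or error level."""
    for line in lines:
        level = LEVEL_RE.search(line)
        msg = MSG_RE.search(line)
        if level and msg and level.group(1) in {"warning", "error"}:
            yield level.group(1), msg.group(1)


# Three lines copied from the excerpt above.
sample = [
    'time="2022-12-15T21:17:26Z" level=info msg="SetPhase: Creating" backingstore=openshift-storage/noobaa-default-backing-store',
    'time="2022-12-15T21:17:26Z" level=error msg="⚠️  RPC: account.check_external_connection() Response Error: Code=CONNECTION_ALREADY_EXIST Message=Connection name already exists: noobaa-default-backing-store"',
    'time="2022-12-15T21:17:26Z" level=warning msg="⏳ Temporary Error: Connection name already exists: noobaa-default-backing-store" backingstore=openshift-storage/noobaa-default-backing-store',
]

found = list(problems(sample))
for level, msg in found:
    print(level, msg)
```

On the three sample lines this keeps the `CONNECTION_ALREADY_EXIST` error and the matching temporary-error warning and drops the info line.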

Comment 61 Blaine Gardner 2022-12-16 16:33:17 UTC
I took a look into the must-gather, and everything seems to be working with Rook. Something is not set up correctly with Noobaa, and I'm not sure what else to suggest to try to get Noobaa into a healthy state. Unfortunately, I don't think I'd be able to help on a troubleshooting session. I was hoping that a Noobaa developer might have been assigned to and looked at this BZ already given that most operate in Israel Standard Time.

You can try increasing the Noobaa operator log verbosity level and checking those logs again. That may suggest if there is an underlying issue that could possibly be fixed. That is the best next step I can suggest.

Comment 63 kelwhite 2022-12-16 17:57:07 UTC
How do you increase noobaa operator logging verbosity? We don't have this documented anywhere from what I can find.

Comment 65 Blaine Gardner 2022-12-16 18:57:26 UTC
I'm sorry Kelson. I have never worked with Noobaa. Please direct Noobaa-related needinfos to the assignee.

