Bug 2216139 - [GSS] Unable to recreate noobaa once it's deleted [NEEDINFO]
Summary: [GSS] Unable to recreate noobaa once it's deleted
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Blaine Gardner
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-06-20 09:11 UTC by amansan
Modified: 2023-08-09 17:03 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
tnielsen: needinfo? (mduasope)
brgardne: needinfo? (amanzane)
brgardne: needinfo? (rafrojas)


Attachments

Description amansan 2023-06-20 09:11:22 UTC
Description of problem (please be as detailed as possible and provide log snippets):

Due to an inconsistency in noobaa-db-pg-0, the customer finally agreed to rebuild noobaa to make sure we have a stable configuration.

Version of all relevant components (if applicable):

ODF 4.10

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Yes, noobaa is not rebuilt and the customer needs it to configure their applications.


Is there any workaround available to the best of your knowledge?

No


Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

3

Is this issue reproducible?

At the customer site.

Actual results:

Noobaa is not rebuilt

Expected results:

Noobaa working
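
For reference, a minimal sketch of how "working" could be verified after the rebuild, assuming the default openshift-storage namespace (adjust if the customer uses a different one):

  # NooBaa CR should report a Ready phase and its pods should be Running
  oc get noobaa -n openshift-storage
  oc get pods -n openshift-storage | grep noobaa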

Additional info:

Comment 13 Blaine Gardner 2023-07-11 21:51:29 UTC
@amanzane please collect an OCP must-gather. I can see that the configmaps have deletion timestamps, but they are not being deleted by the OpenShift system. I don't see any logs from OpenShift (like the kubelet) that would indicate what might be going wrong there.
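
For reference, a rough sketch of commands that would capture that data, assuming the affected configmaps live in openshift-storage (adjust the namespace if needed):

  # Default OCP must-gather
  oc adm must-gather

  # List configmaps that are stuck with a deletion timestamp, together with their finalizers
  oc get configmap -n openshift-storage -o json \
    | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | "\(.metadata.name)  finalizers=\(.metadata.finalizers)"'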

I don't believe there is an RBAC issue. If that were the case, Rook would be reporting an error related to permissions. Instead, it is reporting a timeout waiting for the configmap to be deleted.
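
A quick way to confirm which error Rook is actually reporting (operator deployment name and namespace assumed to be the ODF defaults):

  # A permission problem would show up as "forbidden"; the current symptom should show up as a timeout
  oc logs -n openshift-storage deploy/rook-ceph-operator | grep -iE 'timed out|timeout|forbidden'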

It's possible that this is an OpenShift bug of some kind, or that the OCP cluster is in a degraded state.
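
To rule out the degraded-cluster theory, a basic health check along these lines should be enough:

  # Any ClusterOperator that is Degraded or not Available, or any NotReady node, points at a cluster-level problem
  oc get clusteroperators
  oc get nodes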

Comment 18 Blaine Gardner 2023-07-19 21:19:27 UTC
I think just `oc adm node-logs` will be sufficient. It should contain logs for kubelet and other host processes.

https://access.redhat.com/documentation/en-us/openshift_container_platform/4.13/html/support/gathering-cluster-data#querying-cluster-node-journal-logs_gathering-cluster-data
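
For example (following the doc above; the node name is a placeholder):

  # Kubelet journal logs from all control-plane nodes
  oc adm node-logs --role=master -u kubelet

  # Kubelet journal logs from a single node
  oc adm node-logs <node-name> -u kubelet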

