Bug 2210980

Summary: [GSS][cephfs] MDS pods frequently going to CLBO state
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Component: ceph
Sub Component: CephFS
Status: CLOSED WONTFIX
Severity: high
Priority: high
Version: 4.10
Reporter: Karun Josy <kjosy>
Assignee: Dhairya Parmar <dparmar>
QA Contact: Elad <ebenahar>
CC: bniver, dparmar, gfarnum, hnallurv, muagarwa, mvardhan, ocs-bugs, odf-bz-bot, sostapov, vshankar
Last Closed: 2023-06-21 14:20:46 UTC
Type: Bug

Description Karun Josy 2023-05-30 08:13:36 UTC
# Description of problem (please be as detailed as possible and provide log snippets):

MDS pods are stuck in a constant CrashLoopBackOff (CLBO) restart loop.
After each restart, the MDS state cycles through (one way to observe this is sketched below):
state up:active
state up:reconnect
state up:rejoin
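
A quick way to watch the flapping, as a generic sketch (not taken from the case data; it assumes the default openshift-storage namespace and that the rook-ceph-tools deployment is available):

$ oc get pods -n openshift-storage -l app=rook-ceph-mds -w   # watch the MDS pods restart / go CLBO
$ oc rsh -n openshift-storage deploy/rook-ceph-tools
sh-4.4$ ceph fs status        # current MDS state (up:active / up:reconnect / up:rejoin)
sh-4.4$ ceph health detail    # any MDS-related health warnings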


The problem started after upgrading from OCS 4.8 to ODF 4.9.



# Version of all relevant components (if applicable):
OCP 4.11
ODF 4.9
ceph version 16.2.0-152.el8cp


# Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?
Yes, CephFS PVs are down.
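
As a sketch of how the impact could be confirmed (assumes the default ocs-storagecluster-cephfs storage class name; adjust for the cluster):

$ oc get pvc -A | grep ocs-storagecluster-cephfs         # CephFS-backed claims
$ oc get events -A --field-selector reason=FailedMount   # mount failures on consumer pods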

# Is there any workaround available to the best of your knowledge?
No


# Additional info:
More details will be added in subsequent comments.