Bug 2314998 - [ODF on ROSA HCP] MDSCacheUsageHigh not found with active node drained
Summary: [ODF on ROSA HCP] MDSCacheUsageHigh not found with active node drained
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Mudit Agarwal
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-09-26 20:06 UTC by Daniel Osypenko
Modified: 2024-10-17 10:33 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:




Links
Red Hat Issue Tracker OCSBZM-9293 (Priority: None, Status: None; last updated 2024-10-03 11:33:03 UTC)

Description Daniel Osypenko 2024-09-26 20:06:56 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
During execution of test_mds_cache_alert_with_active_node_drain, we ran metadata I/O against CephFS with the following steps:
    1. Create PVC with Cephfs, access mode RWX
    2. Create dc pod with Fedora image
    3. Copy helper_scripts/meta_data_io.py to Fedora dc pod
    4. Run meta_data_io.py on the Fedora pod
The script can be found at https://github.com/red-hat-storage/ocs-ci/blob/e4bcbb284280862d03b7f6b5ab2b40e2727482f3/ocs_ci/templates/workloads/helper_scripts/meta_data_io.py (a rough sketch of this style of workload follows below).
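
For context, the workload is metadata-heavy rather than data-heavy. The following is a minimal sketch of that pattern, assuming a CephFS mount at /mnt/cephfs inside the pod; the directory layout, counts, and function names are illustrative only and not taken from meta_data_io.py:

# Illustrative metadata-heavy workload: create and re-stat many small
# files so the MDS must cache a large number of inodes and dentries.
import os

BASE_DIR = "/mnt/cephfs/meta-io"   # assumed CephFS mount path in the pod
NUM_DIRS = 1000
FILES_PER_DIR = 100

def generate_metadata_load():
    for d in range(NUM_DIRS):
        dir_path = os.path.join(BASE_DIR, f"dir-{d}")
        os.makedirs(dir_path, exist_ok=True)
        for f in range(FILES_PER_DIR):
            file_path = os.path.join(dir_path, f"file-{f}")
            with open(file_path, "w") as fh:
                fh.write("x")       # tiny payload; the metadata is the point
            os.stat(file_path)      # touch the dentry/inode again

if __name__ == "__main__":
    # The real helper keeps generating load until MDS cache pressure builds.
    generate_metadata_load()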

This script triggers high MDS cache usage in the scenario where the standby-replay MDS is scaled down, but the alert does not fire when the node hosting the active MDS is drained. This suggests the problem is specific to disruption of the active MDS node. A sketch of how a test can check whether the alert is active follows.
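
As a minimal sketch (assuming direct access to the cluster's Prometheus route and a bearer token; ocs-ci has its own helpers for this, and the URL and token below are placeholders):

# Ask Prometheus whether the MDSCacheUsageHigh alert is currently active.
import requests

PROM_URL = "https://prometheus-k8s-openshift-monitoring.apps.example.com"  # assumed route
TOKEN = "<bearer-token>"  # placeholder; obtain from the monitoring stack

def mds_cache_alert_active():
    resp = requests.get(
        f"{PROM_URL}/api/v1/alerts",
        headers={"Authorization": f"Bearer {TOKEN}"},
        verify=False,  # test clusters often use self-signed certificates
    )
    resp.raise_for_status()
    alerts = resp.json()["data"]["alerts"]
    return any(a["labels"].get("alertname") == "MDSCacheUsageHigh" for a in alerts)

print(mds_cache_alert_active())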

Version of all relevant components (if applicable):
OC version:
Client Version: 4.16.11
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: 4.16.12
Kubernetes Version: v1.29.8+f10c92d

OCS version:
ocs-operator.v4.16.2-rhodf              OpenShift Container Storage        4.16.2-rhodf   ocs-operator.v4.16.1-rhodf              Succeeded

ODF operator full version: 4.16.2-4

Cluster version:
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.16.12   True        False         12h     Error while reconciling 4.16.12: the cluster operator insights is not available


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Potentially.

Is there any workaround available to the best of your knowledge?
no

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
1/1

Can this issue be reproduced from the UI?
no

If this is a regression, please provide more details to justify this:
Not a regression; this is a new deployment (Tech Preview).

Steps to Reproduce:
1. Deploy a ROSA HCP cluster with ODF and run test_mds_cache_alert_with_active_node_drain (a sketch of the node-drain disruption step is shown below)
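
The disruption step in this test drains the node hosting the active MDS. A rough sketch of that step, assuming a typical Rook/ODF deployment (the namespace, label selector, and pod-selection logic below are simplifications; the test itself determines the active MDS from the Ceph side):

# Find the node hosting an MDS pod and drain it with `oc adm drain`.
import subprocess

NAMESPACE = "openshift-storage"

def get_mds_node():
    # Simplification: picks the first rook-ceph-mds pod; the real test
    # identifies the *active* daemon (e.g. from `ceph fs status`).
    out = subprocess.check_output(
        ["oc", "-n", NAMESPACE, "get", "pods",
         "-l", "app=rook-ceph-mds",
         "-o", "jsonpath={.items[0].spec.nodeName}"],
        text=True,
    )
    return out.strip()

def drain_node(node):
    subprocess.run(
        ["oc", "adm", "drain", node,
         "--ignore-daemonsets", "--delete-emptydir-data", "--force"],
        check=True,
    )

if __name__ == "__main__":
    drain_node(get_mds_node())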


Actual results:
The MDSCacheUsageHigh alert did not fire and was not found in the active alerts.

Expected results:
The MDSCacheUsageHigh alert fires when its conditions are met.

Additional info:
A cluster to capture the necessary data will be created upon request to QE.

