2297022 – ODF nodes require a manual reboot after compute vmotion is performed due to a disk change order on the nodes

Bug 2297022 - ODF nodes require a manual reboot after compute vmotion is performed due to a disk change order on the nodes [NEEDINFO]

Keywords:
Status:	NEW
Alias:	None
Product:	Red Hat OpenShift Data Foundation
Classification:	Red Hat Storage
Component:	ceph
Sub Component:
Version:	4.14
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Kotresh HR
QA Contact:	Elad
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2024-07-10 05:40 UTC by palshure
Modified:	2024-09-10 12:48 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
Flags:	vshankar: needinfo? (palshure)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	OCSBZM-8664	0	None	None	None	2024-07-16 13:32:24 UTC

Description palshure 2024-07-10 05:40:13 UTC

Description of problem (please be as detailed as possible and provide log
snippets):
Case 1:
Whether compute vMotion is performed manually or automatically, ODF becomes unhealthy afterwards and the disk order on the nodes is observed to have changed. Rebooting the nodes resolves the issue.

Case 2:
Powering off the VM, performing compute vMotion, and then powering the VM back on (all manual steps) does not trigger the issue, but this is not an optimal procedure.

Version of all relevant components (if applicable):
4.14.6

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Compute vMotion has to be disabled on the storage nodes.

Is there any workaround available to the best of your knowledge?
Power the nodes off before performing compute vMotion, then power them back on (see Case 2).

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue reproduce from the UI?
n/a

If this is a regression, please provide more details to justify this:
n/a

Steps to Reproduce:
See Case 1 in the description above: perform compute vMotion (manually or automatically) on a powered-on ODF node.

Actual results:
ODF nodes require a manual reboot after compute vMotion is performed because the disk order on the nodes changes.

Expected results:
ODF should recover automatically after compute vMotion.

Additional info:
The disk.EnableUUID parameter is enabled, and the OSD disks deployed via LSO (Local Storage Operator) are persistent.
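
One way to capture the disk order change as evidence for this bug would be to record, on each node, which kernel device name every persistent link resolves to before the vMotion and again afterwards, and then compare the two. The following is only a minimal sketch and not something that was run on the affected cluster: it assumes the nodes expose persistent links under /dev/disk/by-id (which is expected when disk.EnableUUID is enabled, but was not verified here), the function names are hypothetical, and the sample values are made up for illustration.

```python
import os


def snapshot_by_id(by_id_dir="/dev/disk/by-id"):
    """Map each persistent link name to the kernel device it currently resolves to.

    Assumption: by_id_dir exists on the node and contains udev-managed symlinks.
    """
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        # realpath follows the symlink, e.g. .../by-id/wwn-... -> /dev/sdb
        mapping[name] = os.path.basename(os.path.realpath(link))
    return mapping


def changed_devices(before, after):
    """Return {link_name: (old_device, new_device)} for links whose target differs.

    A link that is missing from one snapshot is reported with None on that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical snapshots, for illustration only (not taken from the affected nodes).
before_vmotion = {"wwn-0x5000c500a1b2c3d4": "sdb", "wwn-0x5000c500a1b2c3d5": "sdc"}
after_vmotion = {"wwn-0x5000c500a1b2c3d4": "sdc", "wwn-0x5000c500a1b2c3d5": "sdb"}

for link, (old, new) in changed_devices(before_vmotion, after_vmotion).items():
    print(f"{link}: {old} -> {new}")
```

On a real node, snapshot_by_id() would be called once before and once after the vMotion, and the two results passed to changed_devices(). An empty result would suggest that the kernel device names did not move, in which case the disk order change would have to be confirmed by other means.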