Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2120768

Summary:	[RFE] configurable scheduling back to a compute node after instance ha fences it
Product:	Red Hat OpenStack	Reporter:	Jeremy <jmelvin>
Component:	openstack-tripleo-common	Assignee:	Nobody <nobody>
Status:	CLOSED MIGRATED	QA Contact:	David Rosenfeld <drosenfe>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	18.0 (Zed)	CC:	bgibizer, dasmith, eglynn, jhakimra, kchamart, lmiccini, mburns, sbauza, sgordon, slinaber, spower, vromanso
Target Milestone:	---	Keywords:	FutureFeature, RFE
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2024-01-05 15:04:02 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Jeremy 2022-08-23 17:22:29 UTC

Description of problem:

The customer does not like that Instance HA automatically starts nova-compute after a compute node comes back up from fencing. Their reasoning is that most of the failures they see involve memory going bad. If a DIMM fails and the compute node comes back up without it, and therefore with less RAM, the node still isn't ready for use. If VMs start getting scheduled there again after fencing, they still have to manually disable the compute service and migrate the VMs off in order to fix the hardware.

We are wondering whether there is a way to have the admin confirm that the compute node is good before scheduling to that node is allowed to resume. We tried disabling the compute unfence trigger, since the docs say that is what unfences the node when it comes back up, but that didn't work. Manually disabling the compute service doesn't seem like a good option either, since the admin may not know exactly when fencing happens.