Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2242123

Summary: Document workaround to block nova-compute startup if symptom of host rename has been detected
Product: Red Hat OpenStack
Reporter: Joanne O'Flynn <joflynn>
Component: documentation
Assignee: Joanne O'Flynn <joflynn>
Status: CLOSED CURRENTRELEASE
QA Contact: RHOS Documentation Team <rhos-docs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 16.2 (Train)
CC: alifshit
Target Milestone: z6
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-08 09:52:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2240825
Bug Blocks:

Description Joanne O'Flynn 2023-10-04 13:42:48 UTC
All of Nova's resource tracking (and adjacent tracking of things, like the records we write to various resources in Cinder, Neutron, and Placement) presupposes permanent, unchanging host names.

Sometimes intentionally, but more often than not by accident, the host name changes. Nova has no safeguards in place for that case: it starts up as normal, and when instances are created on or migrated to that host, resource tracking breaks.

There is a new safeguard in place as of OSP 18 (upstream Antelope) [1], and it is in the process of being backported to 17.1 [2], but backporting it to 16.2 is not realistic.

In its stead, implement a much simpler downstream-only workaround that aborts nova-compute startup if there are libvirt domains on the node but no compute node record exists for it. Such a situation can only arise if something went horribly wrong, most likely a compute host rename, so aborting startup makes sense.
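
For illustration only, the condition described above can be sketched in a few lines of Python. This is not the downstream patch: `check_for_host_rename`, `HostRenameSuspected`, and the two callables passed in are hypothetical names invented for this sketch, and the host and instance names are made-up sample data. The real check would have to query libvirt and the Nova database rather than in-memory values.

```python
class HostRenameSuspected(RuntimeError):
    """Local guests exist, but no compute node record matches this host."""


def check_for_host_rename(hostname, list_domain_names, compute_node_exists):
    """Abort startup if guests exist locally but the host has no record.

    Both callables are hypothetical stand-ins, not Nova or libvirt APIs:
      list_domain_names()           -> iterable of guest names on this node
      compute_node_exists(hostname) -> True if a record exists for hostname
    """
    domains = list(list_domain_names())
    if domains and not compute_node_exists(hostname):
        raise HostRenameSuspected(
            f"{len(domains)} libvirt domain(s) found on {hostname!r}, but no "
            "compute node record exists; refusing to start"
        )
    return domains


# Illustrative data only: the host was renamed from compute-0 to compute-0-new.
known_compute_nodes = {"compute-0"}
local_domains = ["instance-00000001", "instance-00000002"]

try:
    check_for_host_rename(
        "compute-0-new",
        lambda: local_domains,
        lambda host: host in known_compute_nodes,
    )
except HostRenameSuspected as exc:
    print(f"startup aborted: {exc}")
```

Note that in this sketch a freshly deployed compute host passes the check, because it has neither domains nor a record; only the combination of existing domains and a missing record is treated as a symptom of a rename.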

Version-Release number of selected component (if applicable):

16.2

How reproducible:

100%

Steps to Reproduce:
1. Rename a compute host (this normally happens by accident).
2. Restart nova-compute.
3. Do instance operations (create, migrate, etc.) on the renamed compute host.

Actual results:

Resource tracking explodes and everything is on fire.

Expected results:

Not fires?

Additional info:

[1] https://review.opendev.org/c/openstack/nova/+/863920/17
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2192710

Comment 2 Joanne O'Flynn 2023-10-24 12:47:28 UTC
*** Bug 2245880 has been marked as a duplicate of this bug. ***

Comment 8 Joanne O'Flynn 2023-11-08 09:52:31 UTC
Published KBA:

https://access.redhat.com/articles/7040980