Bug 2036652

Summary: [HOTFIX] [OSP 16.2] - option for disabling FIX on instance name sanity check
Product: Red Hat OpenStack
Reporter: Riccardo Bruzzone <rbruzzon>
Component: openstack-nova
Assignee: Artom Lifshitz <alifshit>
Status: CLOSED ERRATA
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: high
Docs Contact:
Priority: high
Version: 16.2 (Train)
CC: alifshit, astupnik, dasmith, dhill, eglynn, jamsmith, jelynch, jhakimra, jparker, kchamart, mgeary, sbauza, sgordon, spower, tvignaud, vromanso
Target Milestone: z2
Keywords: Patch, Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: openstack-nova-20.6.2-2.20220112164912.8906553.el8ost
Doc Type: Bug Fix
Doc Text:
RHOSP does not support the use of a fully qualified domain name (FQDN) as the instance display name in a boot server request. The instance display name is passed from the boot server request to the `instance.hostname` field. A recent update now sanitizes the `instance.hostname` field. The sanitization steps include replacing periods with dashes, a replacement that makes it impossible to continue using the unsupported FQDN instance display names. This update provides a temporary workaround for customers who use an FQDN as the instance display name in a boot server request. It limits the scope of the sanitization to cases where the instance display name ends with a period followed by one or more numeric digits. If you use an FQDN as the instance display name in a boot server request, modify your workflow before upgrading to RHOSP 17.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-03-23 22:26:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2059557

Description Riccardo Bruzzone 2022-01-03 13:39:47 UTC
Description of problem:

A FIX introduced in OSP 16.2 (nova unconditionally turns all dots in the instance name into dashes, making the instance name a short name) is impacting the tenant VM deployment processes of customers who rely on the inadvertent support of FQDN instance names.
In the case observed, the customer relies on the instance name being set to an FQDN (e.g. vm1-test.bdi.test.com) to assign DNS sub-domains on a per-tenant basis.
 
Version-Release number of selected component (if applicable):

Red Hat OpenStack 16.2 (RHOSP16.2)


How reproducible:

Steps to Reproduce:
1. A new VM is planned with FQDN as instance name: vm1-test.bdi.test.com
2. As an effect of the FIX in OSP 16.2, the VM is created with a short name equal to vm1-test-bdi-test-com (each "." replaced with "-").


Actual results:

The customer's deployment process, which is based on DNS sub-domains, is broken by the behavior change in OSP 16.2.

Before this FIX, when "hostnamectl set-hostname" was invoked via cloud-init with the instance name (vm1-test.bdi.test.com) as its argument, the same VM was instantiated with:
- vm1-test as hostname (short name).
- .bdi.test.com as domain.


As a result of the FIX introduced in 16.2, the cloud-init approach is no longer feasible because the dots are lost.
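
To illustrate why the lost dots matter, here is a minimal Python sketch of the split described above. It is not what hostnamectl or cloud-init run internally; `split_fqdn` is a hypothetical helper that simply cuts a name at its first dot, and the sanitized value is reproduced with a plain string replacement.

```python
from typing import Tuple


def split_fqdn(name: str) -> Tuple[str, str]:
    """Split a name at its first dot into (short hostname, domain)."""
    short, _, domain = name.partition(".")
    return short, domain


# Display name from this report, and the value produced once every
# dot has been turned into a dash.
display_name = "vm1-test.bdi.test.com"
sanitized = display_name.replace(".", "-")

print(split_fqdn(display_name))  # ('vm1-test', 'bdi.test.com')
print(split_fqdn(sanitized))     # ('vm1-test-bdi-test-com', '')
```

With the sanitized value there is no dot left to split on, so the whole string ends up as the short name and the domain part is empty.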


Expected results:

The "instance name sanity check" should be handled as an optional configuration that customers can enable or disable at their discretion.
Customers whose provisioning processes already include their own sanity checks could then disable it.


Additional info:
NA

Comment 3 Artom Lifshitz 2022-01-05 21:45:46 UTC
Recapping here what we discussed during a call with Riccardo, Sean, Sylvain, and myself.

Engineering explained that we don't want to add a config option to guard the fix from BZ 1919855. If such an option were put in place, different clouds would show different values for instance.hostname, depending on whether the deployer has turned on the config option or not. We want to avoid this kind of config-driven API behaviour as much as possible.

Engineering proposed that the customer switch to using the instance display name directly. This value is never sanitized, and corresponds to the name passed to `openstack server create`. It is also exposed to the guest via the metadata API as the "name" key. Riccardo agreed to propose this to the customer.
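
As a rough sketch of that proposal, a guest could read the display name from the metadata document instead of relying on its hostname. The example assumes the usual link-local metadata address and the `openstack/latest/meta_data.json` path; the helper names and the sample payload are hypothetical, and a real document carries many more keys than shown.

```python
import json
import urllib.request

# Usual link-local address of the OpenStack metadata service (assumed
# reachable from inside the guest).
METADATA_URL = "http://169.254.169.254/openstack/latest/meta_data.json"


def display_name_from_metadata(raw_json: str) -> str:
    """Return the unsanitized display name stored under the "name" key."""
    return json.loads(raw_json)["name"]


def fetch_display_name(url: str = METADATA_URL, timeout: float = 5.0) -> str:
    """Fetch the metadata document and extract the display name."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return display_name_from_metadata(response.read().decode("utf-8"))


# Hypothetical, trimmed-down payload for illustration only.
sample = '{"name": "vm1-test.bdi.test.com"}'
print(display_name_from_metadata(sample))  # vm1-test.bdi.test.com
```

A cloud-init script could then pass the returned value to "hostnamectl set-hostname" in place of the sanitized hostname.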

I'll leave this BZ open for now, to track any follow-up actions that might arise, while setting a needinfo on Riccardo to report back on his conversations with the customer.

Cheers!

Comment 13 Riccardo Bruzzone 2022-01-31 09:42:13 UTC
Hi Artom,
From "Comment 9", it looks like only the openstack-nova-api image should be updated.
Could you confirm whether this is the only image to be updated?

BR
Riccardo

Comment 14 Artom Lifshitz 2022-01-31 14:34:01 UTC
Yep, as discussed on IRC, only the api container on all controllers needs to be updated and restarted.

Comment 19 Artom Lifshitz 2022-03-07 17:37:31 UTC
The original fix for this BZ was flawed - it worked for the customer because it effectively turned off all sanitization of periods to dashes - but it is a regression for [1], as it would re-introduce the broken behaviour in PSI. [2] is the fix-on-top-of-the-fix, and we need to include it in 16.2.2. Thus, asking for the blocker flag.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1919855
[2] https://code.engineering.redhat.com/gerrit/c/nova/+/315417
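
Going by the Doc Text of this BZ, the resulting behaviour limits the period-to-dash replacement to display names that end with a period followed by one or more digits. The following Python sketch only illustrates that rule and is not the nova patch itself: `sanitize_hostname_limited` is a hypothetical name, and the sketch assumes that every period is replaced once the condition holds.

```python
import re

# Matches a name that ends with "." followed by one or more digits.
_NUMERIC_SUFFIX = re.compile(r"\.\d+\Z")


def sanitize_hostname_limited(display_name: str) -> str:
    """Replace periods with dashes only for names with a numeric suffix."""
    if _NUMERIC_SUFFIX.search(display_name):
        return display_name.replace(".", "-")
    return display_name


print(sanitize_hostname_limited("vm1-test.bdi.test.com"))  # vm1-test.bdi.test.com
print(sanitize_hostname_limited("build.16.2"))             # build-16-2
```

Under this rule the customer's FQDN-style names pass through unchanged, while names with a numeric suffix are still sanitized.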

Comment 33 errata-xmlrpc 2022-03-23 22:26:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenStack Platform 16.2 (openstack-nova) security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0999