Bug 2124079 - Compute services failed to start when compute node hostname contains underscore(_)
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: OSP Team
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-09-04 14:51 UTC by Rahul Kaushal
Modified: 2022-09-21 17:45 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-21 07:46:28 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1868940 0 high CLOSED oslo_config not accepting undercore in the hostnames 2023-09-15 01:30:16 UTC
Red Hat Issue Tracker NFV-2629 0 None None None 2022-09-08 01:51:34 UTC
Red Hat Issue Tracker OSP-18543 0 None None None 2022-09-04 14:53:13 UTC

Description Rahul Kaushal 2022-09-04 14:51:09 UTC
Description of problem:
Compute services fail to start when the compute node hostname contains an underscore (_).

We deployed RHOSP 16.2.3 with a compute node name that contains an underscore (_) and encountered this issue.

According to the Red Hat bug fixes listed for RHOSP 16.2.3, this bug is fixed, but we still encountered the issue.


Error Snippet
nova-compute service to register", "INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register", "INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register", "INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register", "INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register", "INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register"]}
2022-08-16 13:58:58.903480 | 5254008c-3c73-2b12-cf25-000000009321 |     TIMING | Wait for containers to start for step 5 using paunch | overcloud-compute_sriov-1 | 1:21:10.099288 | 625.79s

PLAY RECAP *********************************************************************
overcloud-cephstorage-0    : ok=292  changed=150  unreachable=0    failed=0    skipped=148  rescued=0    ignored=0
overcloud-cephstorage-1    : ok=276  changed=149  unreachable=0    failed=0    skipped=146  rescued=0    ignored=0
overcloud-cephstorage-2    : ok=278  changed=150  unreachable=0    failed=0    skipped=145  rescued=0    ignored=0
overcloud-compute_sriov-0  : ok=328  changed=188  unreachable=0    failed=1    skipped=145  rescued=0    ignored=0
overcloud-compute_sriov-1  : ok=328  changed=188  unreachable=0    failed=1    skipped=145  rescued=0    ignored=0





Version-Release number of selected component (if applicable):
16.2.3


How reproducible:


Steps to Reproduce:
1. Add a hostname format containing an underscore to the compute node role in the roles data file
#Snippet of roles data file 
###############################################################################
# Role: ComputeSriov                                                          #
###############################################################################
- name: ComputeSriov
  description: |
    Compute SR-IOV Role
  HostnameFormatDefault: '%stackname%-Compute_Sriov-%index%'


2. Run the deployment command
3. The deployment fails with an error




Actual results:
Service fails to start

Expected results:
Services start normally when the compute hostname contains an underscore.

Additional info:
#Nova compute node logs for computenode0
2022-08-16 11:00:22.322 7 CRITICAL nova [req-0f6c83ae-7905-4797-927e-f3cf35a00cdb - - - - -] Unhandled error: oslo_config.cfg.ConfigFileValueError: Value for option live_migration_inbound_addr from LocationInfo(location=<Locations.user: (4, True)>, detail='/etc/nova/nova.conf') is not valid: overcloud-compute_sriov-0.internalapi.localdomain is not a valid host address
2022-08-16 11:00:22.322 7 ERROR nova Traceback (most recent call last):


#Nova compute node logs for computenode1
2022-08-16 11:00:32.525 7 WARNING oslo_config.cfg [req-24321e9c-1265-4840-9396-ef2dd9507c13 - - - - -] Deprecated: Option "dhcp_domain" from group "DEFAULT" is deprecated. Use option "dhcp_domain" from group "api".
2022-08-16 11:00:32.536 7 CRITICAL nova [req-24321e9c-1265-4840-9396-ef2dd9507c13 - - - - -] Unhandled error: oslo_config.cfg.ConfigFileValueError: Value for option live_migration_inbound_addr from LocationInfo(location=<Locations.user: (4, True)>, detail='/etc/nova/nova.conf') is not valid: overcloud-compute_sriov-1.internalapi.localdomain is not a valid host address

Comment 4 Hervé Beraud 2022-09-05 09:17:12 UTC
Hello Rahul,

Which version of oslo.config are you using?

Comment 5 Rahul Kaushal 2022-09-05 09:29:24 UTC
(In reply to Hervé Beraud from comment #4)
> Hello Rahul,
> 

Hi Herve, 

Thanks for your prompt response.
> Which version of oslo.config are you using?
Please find the rpm version below for both compute nodes.

[root@overcloud-compute-sriov-1 ~]# rpm -qa | grep oslo-config
python3-oslo-config-6.11.3-2.20210712154811.9b1ccea.el8ost.noarch
[root@overcloud-compute-sriov-1 ~]#

[root@overcloud-compute-sriov-0 ~]# rpm -qa | grep oslo-config
python3-oslo-config-6.11.3-2.20210712154811.9b1ccea.el8ost.noarch
[root@overcloud-compute-sriov-0 ~]#

Comment 6 Hervé Beraud 2022-09-06 13:18:35 UTC
Hello,

From an oslo point of view, we can't do more than what we have already done (for further details see https://bugzilla.redhat.com/show_bug.cgi?id=1868940). The library has been adapted to handle underscores through a new function. However, adapting the existing function isn't something that we want to do.

The current use case of this BZ is a hostname with an _, which is explicitly disallowed by the RFC. In fact, the class in question that handles this kind of address is named "Hostname", which leads me to believe it should not allow _.
I would argue that our validation is working correctly here. The user attempted to use an invalid hostname and we caught it.
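
To illustrate the rule being discussed, here is a minimal Python sketch of an RFC 952/1123 style label check. It is not the oslo.config implementation; the function name `is_valid_hostname` and the regular expression are illustrative assumptions, and the real "Hostname" type may apply additional rules.

```python
import re

# One DNS label in the RFC 952/1123 sense: letters, digits and hyphens only,
# no leading or trailing hyphen, 1 to 63 characters. "_" is not allowed.
# NOTE: hypothetical helper for illustration, not oslo.config code.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_hostname(name: str) -> bool:
    """Return True if every dot-separated label of `name` is a valid label."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing root dot
    if not name or len(name) > 253:
        return False
    return all(_LABEL.match(label) for label in name.split("."))


# The address rejected in the nova-compute log from the description:
print(is_valid_hostname("overcloud-compute_sriov-0.internalapi.localdomain"))  # False
# The same address with a hyphen instead of the underscore:
print(is_valid_hostname("overcloud-compute-sriov-0.internalapi.localdomain"))  # True
```

Under this check, the only thing that makes the reported address invalid is the underscore in its first label.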

If there is a need for non-hostname DNS names to be set somewhere, we should probably create a new type that does allow _. This is what we did with https://bugzilla.redhat.com/show_bug.cgi?id=1868940, and for it to take effect it would require nova changes (https://review.opendev.org/c/openstack/nova/+/792501). It seems inappropriate to me to allow _ in the hostname type itself.

Instead, I am moving this BZ to the nova team to see what they think about backporting https://review.opendev.org/c/openstack/nova/+/792501 downstream.

Note that these nova changes are more of a feature than a fix, and they also require a requirements update.

Let us know if adopting these nova changes downstream looks sensible.

Comment 7 melanie witt 2022-09-08 01:44:21 UTC
We discussed this on our bug triage call today and we agree with Hervé that the validation is working as expected for a *hostname* -- underscores are not allowed in hostnames.

The configuration parameter shown in comment 0 is meant to be a hostname, judging from its name "HostnameFormatDefault":

- name: ComputeSriov
  description: |
    Compute SR-IOV Role
  HostnameFormatDefault: '%stackname%-Compute_Sriov-%index%'

The BZ referenced [1] added a new type in oslo.config in Train in order to make the use of underscores in domain names possible in general. However, the nova adoption of the new config type was not implemented until Xena and although a backport was proposed to the stable/wallaby branch [2], it was correctly rejected because it involves bumping the oslo.config version requirement (8.6.0) beyond the version upper constraint (8.5.1) for stable/wallaby [3].
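
The constraint conflict described above can be sketched as a simple version comparison. This is only an illustration using the two version numbers quoted in this comment; `parse_version` and `exceeds_upper_constraint` are hypothetical helpers, not part of any OpenStack tooling.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '8.6.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def exceeds_upper_constraint(required: str, upper: str) -> bool:
    """Return True if the required minimum is newer than the upper constraint."""
    return parse_version(required) > parse_version(upper)


# oslo.config requirement of the nova change vs. the stable/wallaby upper constraint
print(exceeds_upper_constraint("8.6.0", "8.5.1"))  # True
```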

If HostnameFormatDefault in the SRIOV context is indeed meant to be a hostname, perhaps there is a documentation improvement that could be done to add a note that underscores are not allowed in the parameter value.
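
As a rough sketch of what such a note would guard against, the following Python snippet expands a HostnameFormatDefault template and reports characters that are not valid in a hostname label. The helper names are hypothetical, and the lowercasing step is an assumption inferred from comment 0, where 'Compute_Sriov' in the template appears as 'compute_sriov' in the logs; this is not how tripleo-heat-templates itself is implemented.

```python
import string
from typing import List

# Characters allowed in a (lowercased) hostname label: a-z, 0-9 and "-".
ALLOWED = set(string.ascii_lowercase + string.digits + "-")


def expand_hostname_format(template: str, stackname: str, index: int) -> str:
    """Substitute %stackname% and %index% and lowercase the result.

    Lowercasing is an assumption based on the hostnames seen in the logs.
    """
    expanded = template.replace("%stackname%", stackname)
    expanded = expanded.replace("%index%", str(index))
    return expanded.lower()


def invalid_hostname_chars(short_name: str) -> List[str]:
    """Return the sorted set of characters not allowed in a hostname label."""
    return sorted({ch for ch in short_name if ch not in ALLOWED})


name = expand_hostname_format("%stackname%-Compute_Sriov-%index%", "overcloud", 0)
print(name)                          # overcloud-compute_sriov-0
print(invalid_hostname_chars(name))  # ['_']
```

Running a check like this against the roles data file before deployment would flag the template from comment 0 without waiting for nova-compute to fail at step 5.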

We are deferring this BZ to the NFV DFG to determine if a SRIOV documentation update would be appropriate and if not, they can feel free to go ahead and close this BZ.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1868940
[2] https://review.opendev.org/c/openstack/nova/+/792501
[3] https://github.com/openstack/requirements/blob/stable/wallaby/upper-constraints.txt#L392

