Description of problem:
Compute services fail to start when the compute node hostname contains an underscore (_). We deployed RHOSP 16.2.3 with compute node names that contain an underscore and encountered this issue. According to the Red Hat bug fixes listed for RHOSP 16.2.3, this bug is fixed, but we still encounter it.

Error snippet:

nova-compute service to register",
"INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register",
"INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register",
"INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register",
"INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register",
"INFO:nova_wait_for_compute_service:Waiting for nova-compute service to register"]}
2022-08-16 13:58:58.903480 | 5254008c-3c73-2b12-cf25-000000009321 | TIMING | Wait for containers to start for step 5 using paunch | overcloud-compute_sriov-1 | 1:21:10.099288 | 625.79s

PLAY RECAP *********************************************************************
overcloud-cephstorage-0   : ok=292 changed=150 unreachable=0 failed=0 skipped=148 rescued=0 ignored=0
overcloud-cephstorage-1   : ok=276 changed=149 unreachable=0 failed=0 skipped=146 rescued=0 ignored=0
overcloud-cephstorage-2   : ok=278 changed=150 unreachable=0 failed=0 skipped=145 rescued=0 ignored=0
overcloud-compute_sriov-0 : ok=328 changed=188 unreachable=0 failed=1 skipped=145 rescued=0 ignored=0
overcloud-compute_sriov-1 : ok=328 changed=188 unreachable=0 failed=1 skipped=145 rescued=0 ignored=0

Version-Release number of selected component (if applicable):
16.2.3

How reproducible:

Steps to Reproduce:
1. Add a hostname format containing an underscore to the compute role in the roles data file.

   # Snippet of roles data file
   ###############################################################################
   # Role: ComputeSriov                                                          #
   ###############################################################################
   - name: ComputeSriov
     description: |
       Compute SR-IOV Role
     HostnameFormatDefault: '%stackname%-Compute_Sriov-%index%'

2. Execute the deployment command.
3. The deployment fails with the error shown above.

Actual results:
The service fails to start.

Expected results:
The service starts normally when the compute hostname contains an underscore.

Additional info:

# Nova compute node logs for computenode0
2022-08-16 11:00:22.322 7 CRITICAL nova [req-0f6c83ae-7905-4797-927e-f3cf35a00cdb - - - - -] Unhandled error: oslo_config.cfg.ConfigFileValueError: Value for option live_migration_inbound_addr from LocationInfo(location=<Locations.user: (4, True)>, detail='/etc/nova/nova.conf') is not valid: overcloud-compute_sriov-0.internalapi.localdomain is not a valid host address
2022-08-16 11:00:22.322 7 ERROR nova Traceback (most recent call last):

# Nova compute node logs for computenode1
2022-08-16 11:00:32.525 7 WARNING oslo_config.cfg [req-24321e9c-1265-4840-9396-ef2dd9507c13 - - - - -] Deprecated: Option "dhcp_domain" from group "DEFAULT" is deprecated. Use option "dhcp_domain" from group "api".
2022-08-16 11:00:32.536 7 CRITICAL nova [req-24321e9c-1265-4840-9396-ef2dd9507c13 - - - - -] Unhandled error: oslo_config.cfg.ConfigFileValueError: Value for option live_migration_inbound_addr from LocationInfo(location=<Locations.user: (4, True)>, detail='/etc/nova/nova.conf') is not valid: overcloud-compute_sriov-1.internalapi.localdomain is not a valid host address
Hello Rahul, Which version of oslo.config are you using?
(In reply to Hervé Beraud from comment #4)
> Hello Rahul,

Hi Herve,
Thanks for your prompt response.

> Which version of oslo.config are you using?

Please find the rpm version below for both compute nodes.

[root@overcloud-compute-sriov-1 ~]# rpm -qa | grep oslo-config
python3-oslo-config-6.11.3-2.20210712154811.9b1ccea.el8ost.noarch
[root@overcloud-compute-sriov-1 ~]#
[root@overcloud-compute-sriov-0 ~]# rpm -qa | grep oslo-config
python3-oslo-config-6.11.3-2.20210712154811.9b1ccea.el8ost.noarch
[root@overcloud-compute-sriov-0 ~]#
Hello,

From an oslo point of view, we can't do more than what we have already done (for further details see https://bugzilla.redhat.com/show_bug.cgi?id=1868940). The library has been adapted to handle underscores through a new function. However, adapting the existing function isn't something that we want to do.

The use case in this BZ is a hostname with an _, which is explicitly disallowed by the RFC. In fact, the class that handles this kind of address is named "Hostname", which leads me to believe it should not allow _. I would argue that our validation is working correctly here: the user attempted to use an invalid hostname and we caught it.

If there is a need for non-hostname DNS names to be set somewhere, we should probably create a new type for that which does allow _. This is what we did with https://bugzilla.redhat.com/show_bug.cgi?id=1868940, and for it to work it would require nova changes (https://review.opendev.org/c/openstack/nova/+/792501), but it seems inappropriate to me to allow it in the hostname type itself.

Instead, I am moving this BZ to the nova team to see what they think about backporting https://review.opendev.org/c/openstack/nova/+/792501 downstream.

Note that these nova changes are more of a feature than a fix, and they also require a requirements update.

Let us know if adopting these nova changes downstream looks sensible.
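To illustrate the distinction drawn in this comment, here is a minimal sketch in plain Python. It is not oslo.config's implementation: the function name `is_valid_name` and the `allow_underscore` flag are hypothetical, and the label rules (letters, digits and hyphens, 1 to 63 characters per label, no leading or trailing hyphen) are an assumed simplification of the usual hostname syntax.

```python
import re

# Hypothetical illustration only; this is NOT the oslo.config code.
# Strict label: letters, digits, hyphens; must start and end alphanumeric.
_LABEL_STRICT = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")
# Relaxed label: same rule, but underscores are also accepted.
_LABEL_UNDERSCORE = re.compile(
    r"^[A-Za-z0-9_]([A-Za-z0-9_-]{0,61}[A-Za-z0-9_])?$"
)


def is_valid_name(name, allow_underscore=False):
    """Return True if every dot-separated label of `name` is acceptable."""
    if not name or len(name) > 253:
        return False
    if name.endswith("."):
        # Tolerate a single trailing dot (fully qualified form).
        name = name[:-1]
    pattern = _LABEL_UNDERSCORE if allow_underscore else _LABEL_STRICT
    return all(pattern.match(label) for label in name.split("."))


fqdn = "overcloud-compute_sriov-0.internalapi.localdomain"
strict_ok = is_valid_name(fqdn)                          # hostname-style check
relaxed_ok = is_valid_name(fqdn, allow_underscore=True)  # domain-style check
```

With the name from the nova log in comment 0, the strict check rejects the value and the relaxed check accepts it, which mirrors why a separate type, rather than a looser hostname type, was the chosen approach.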
We discussed this on our bug triage call today and we agree with Hervé that the validation is working as expected for a *hostname* -- underscores are not allowed in hostnames. The configuration parameter shown in comment 0 is meant to be a hostname, judging from its name "HostnameFormatDefault":

  - name: ComputeSriov
    description: |
      Compute SR-IOV Role
    HostnameFormatDefault: '%stackname%-Compute_Sriov-%index%'

The BZ referenced [1] added a new type in oslo.config in Train in order to make the use of underscores in domain names possible in general. However, the nova adoption of the new config type was not implemented until Xena and although a backport was proposed to the stable/wallaby branch [2], it was correctly rejected because it involves bumping the oslo.config version requirement (8.6.0) beyond the version upper constraint (8.5.1) for stable/wallaby [3].

If HostnameFormatDefault in the SRIOV context is indeed meant to be a hostname, perhaps there is a documentation improvement that could be done to add a note that underscores are not allowed in the parameter value.

We are deferring this BZ to the NFV DFG to determine if a SRIOV documentation update would be appropriate and if not, they can feel free to go ahead and close this BZ.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1868940
[2] https://review.opendev.org/c/openstack/nova/+/792501
[3] https://github.com/openstack/requirements/blob/stable/wallaby/upper-constraints.txt#L392
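Independently of any documentation note, a roles data file could be checked for this problem before deployment. The following is a minimal sketch, assuming the file uses the one-line `HostnameFormatDefault: '...'` form shown in comment 0. The function name `find_underscore_formats` and the `SAMPLE` text are hypothetical, and a real check would more likely parse the YAML than match lines with a regular expression.

```python
import re

# Matches lines such as:
#   HostnameFormatDefault: '%stackname%-Compute_Sriov-%index%'
_FORMAT_LINE = re.compile(
    r"^\s*HostnameFormatDefault:\s*['\"]?([^'\"#\n]+?)['\"]?\s*$"
)


def find_underscore_formats(roles_text):
    """Return (line_number, value) pairs whose hostname format contains '_'."""
    hits = []
    for number, line in enumerate(roles_text.splitlines(), start=1):
        match = _FORMAT_LINE.match(line)
        if match and "_" in match.group(1):
            hits.append((number, match.group(1)))
    return hits


# Hypothetical sample modelled on the snippet in comment 0.
SAMPLE = """\
- name: ComputeSriov
  description: |
    Compute SR-IOV Role
  HostnameFormatDefault: '%stackname%-Compute_Sriov-%index%'
- name: Controller
  HostnameFormatDefault: '%stackname%-controller-%index%'
"""

for line_number, value in find_underscore_formats(SAMPLE):
    print(f"line {line_number}: underscore in hostname format {value!r}")
```

Note that this only inspects the format string itself; a stack name that contains an underscore would also end up in the generated hostname and is not covered by this sketch.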