Currently, a Red Hat OpenStack Platform director 10 overcloud deployment with SR-IOV fails when the compute.yaml file uses NIC IDs (for example, nic1, nic2, nic3, and so on).
As a workaround, you need to use NIC names (for example, ens1f0, ens1f1, ens2f0, and so on) instead of the NIC IDs to ensure the overcloud deployment completes successfully.
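The workaround can be illustrated with a minimal sketch of the NIC template. The attached compute.yaml is not reproduced here, so the surrounding `network_config` key and the `use_dhcp` setting are assumptions; only the interface names (nic4, ens2f0, ens2f1) come from this report.

```yaml
network_config:
  # Before (numbered NIC IDs; the deployment timed out in this report):
  # - type: interface
  #   name: nic4
  #   use_dhcp: false
  # - type: interface
  #   name: nic5
  #   use_dhcp: false

  # After (explicit device names; assumed layout, adjust to your template):
  - type: interface
    name: ens2f0
    use_dhcp: false
  - type: interface
    name: ens2f1
    use_dhcp: false
```

Explicit device names do not depend on interface ordering at deploy time, so they would be unaffected by the reordering seen after the VFs are created.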
Created attachment 1235874: /var/log/messages
Description of problem:
As part of an OSPD10 SR-IOV deployment, I configured the compute yaml file (please find it attached) to use NIC IDs (nic1, nic2, nic3, etc).
The deployment got stuck on step 5 and eventually failed due to a timeout.
I connected to one of the compute nodes and found in "/var/log/messages" that the NIC order is not what it should be: it changed after the VFs were created. nic4 should be mapped to ens2f0 and nic5 should be mapped to ens2f1 (see attached image).
I then tried using NIC names (ens1f0, ens1f1, ens2f0, etc) instead of NIC IDs, and in this case the overcloud deployment finished successfully.
Version-Release number of selected component (if applicable):
OSPD10 - 1 controller, two computes with SR-IOV enabled.
How reproducible:
Always
Steps to Reproduce:
1. deploy ospd with attached yamls.
Actual results:
deployment fails due to timeout
Expected results:
overcloud deploy should finish successfully
Additional info:
Compute hardware, HPE ProLiant DL380 Gen9 server, HPE Ethernet 10Gb 2-port 560SFP sr-iov nic.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2017:1245