Bug 1409097 - OSPD10 + SR-IOV deployment fails while using nic IDs in the compute yaml.
Summary: OSPD10 + SR-IOV deployment fails while using nic IDs in the compute yaml.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 10.0 (Newton)
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: Upstream M3
Target Release: 11.0 (Ocata)
Assignee: Saravanan KR
QA Contact: Yariv
URL:
Whiteboard:
Depends On:
Blocks: 1235009 1416070
Reported: 2016-12-29 12:59 UTC by Ziv Greenberg
Modified: 2017-05-17 19:53 UTC (History)

Fixed In Version: os-net-config-6.0.0-3.el7ost
Doc Type: Known Issue
Doc Text:
Currently, the Red Hat OpenStack Platform director 10 with SRIOV overcloud deployment fails when using the NIC IDs (for example, nic1, nic2, nic3 and so on) in the compute.yaml file. As a workaround, you need to use NIC names (for example, ens1f0, ens1f1, ens2f0, and so on) instead of the NIC IDs to ensure the overcloud deployment completes successfully.
Clone Of:
Clones: 1416070
Environment:
Last Closed: 2017-05-17 19:53:41 UTC


Attachments
/var/log/messages (1.75 MB, text/plain)
2016-12-29 12:59 UTC, Ziv Greenberg
image (40.46 KB, image/png)
2016-12-29 13:01 UTC, Ziv Greenberg
yaml files (4.54 KB, application/x-gzip)
2016-12-29 13:04 UTC, Ziv Greenberg


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1245 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 23:01:50 UTC
OpenStack gerrit 415682 None None None 2017-01-24 14:04:12 UTC
Launchpad 1653097 None None None 2016-12-30 05:00:47 UTC

Description Ziv Greenberg 2016-12-29 12:59:37 UTC
Created attachment 1235874
/var/log/messages

Description of problem:

As part of an OSPD10 SR-IOV deployment, I configured the compute yaml file (attached) to use NIC IDs (nic1, nic2, nic3, etc.).

The deployment got stuck on step 5 and eventually failed due to a timeout.
I connected to one of the compute nodes and found in "/var/log/messages" that the NIC order was incorrect: it had changed after the VFs were created. nic4 should be mapped to ens2f0 and nic5 should be mapped to ens2f1 (see attached image).

I then tried NIC names (ens1f0, ens1f1, ens2f0, etc.) instead of NIC IDs, and in that case the overcloud deployment finished successfully.
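
To illustrate why NIC IDs can shift once VFs exist, here is a minimal Python sketch. It is not os-net-config's actual code, and it does not claim to reproduce the change in the linked review. It assumes NIC IDs are assigned by position in a sorted list of interface names. The function number_nics, the interface eno1, and the VF names ens1f0v0 and ens1f0v1 are hypothetical; only ens1f0 through ens2f1 and the expected nic4/nic5 mapping come from this report.

```python
def number_nics(interface_names, virtual_functions=()):
    """Assign nic1, nic2, ... to interface names in sorted order.

    Names listed in virtual_functions are skipped, so they never
    consume a NIC ID.
    """
    skipped = set(virtual_functions)
    candidates = sorted(n for n in interface_names if n not in skipped)
    return {f"nic{i}": name for i, name in enumerate(candidates, start=1)}


# Hypothetical host: eno1 is an assumption; the ens* names are from the report.
physical = ["eno1", "ens1f0", "ens1f1", "ens2f0", "ens2f1"]
# Hypothetical VF names that appear after VFs are created on ens1f0.
vfs = ["ens1f0v0", "ens1f0v1"]

before = number_nics(physical)          # numbering before VF creation
after = number_nics(physical + vfs)     # VFs are numbered like any other NIC
filtered = number_nics(physical + vfs, virtual_functions=vfs)  # VFs excluded

print("before VFs:  ", before["nic4"], before["nic5"])
print("after VFs:   ", after["nic4"], after["nic5"])
print("VFs excluded:", filtered["nic4"], filtered["nic5"])
```

In this sketch, nic4 and nic5 point at ens2f0 and ens2f1 only while VFs are absent or excluded from the numbering. Explicit NIC names avoid the problem because they do not depend on list position. On a live Linux system, one way to recognize a VF is the physfn link under /sys/class/net/<name>/device, which is present only for virtual functions.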



Version-Release number of selected component (if applicable):
OSPD10 - 1 controller, two computes with SR-IOV enabled.


How reproducible:
Always

Steps to Reproduce:
1. Deploy OSPD with the attached yaml files.


Actual results:
deployment fails due to timeout

Expected results:
overcloud deploy should finish successfully


Additional info:
Compute hardware: HPE ProLiant DL380 Gen9 server, HPE Ethernet 10Gb 2-port 560SFP SR-IOV NIC.

Comment 1 Ziv Greenberg 2016-12-29 13:01:08 UTC
Created attachment 1235875
image

Comment 2 Ziv Greenberg 2016-12-29 13:04:57 UTC
Created attachment 1235876
yaml files

Comment 3 Saravanan KR 2017-01-18 07:25:06 UTC
Review - https://review.openstack.org/#/c/415682/

Comment 7 Ziv Greenberg 2017-05-17 09:48:13 UTC
Hi,

It was verified.

Thanks,
Ziv

Comment 8 errata-xmlrpc 2017-05-17 19:53:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

