Bug 1409097

Summary: OSPD10 + SR-IOV deployment fails while using nic IDs in the compute yaml.
Product: Red Hat OpenStack
Reporter: Ziv Greenberg <zgreenbe>
Component: os-net-config
Assignee: Saravanan KR <skramaja>
Status: CLOSED ERRATA
QA Contact: Yariv <yrachman>
Severity: high
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: achernet, dbecker, dnavale, fbaudin, hbrock, jjung, jschluet, jslagle, mburns, morazi, oblaut, rhel-osp-director-maint, skramaja, slinaber, supadhya, vchundur, yrachman
Target Milestone: Upstream M3
Keywords: Triaged
Target Release: 11.0 (Ocata)   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: os-net-config-6.0.0-3.el7ost
Doc Type: Known Issue
Doc Text:
Currently, the Red Hat OpenStack Platform director 10 with SRIOV overcloud deployment fails when using the NIC IDs (for example, nic1, nic2, nic3 and so on) in the compute.yaml file. As a workaround, you need to use NIC names (for example, ens1f0, ens1f1, ens2f0, and so on) instead of the NIC IDs to ensure the overcloud deployment completes successfully.
Story Points: ---
Clone Of:
: 1416070
Environment:
Last Closed: 2017-05-17 19:53:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1235009, 1416070    
Attachments:
Description          Flags
/var/log/messages    none
image                none
yaml files           none

Description Ziv Greenberg 2016-12-29 12:59:37 UTC
Created attachment 1235874 [details]
/var/log/messages

Description of problem:

As part of an OSPD10 SR-IOV deployment, I configured the compute yaml file (please find it attached) to use NIC IDs (nic1, nic2, nic3, etc).

The deployment got stuck on step 5 and eventually failed due to a timeout.
I connected to one of the compute nodes and found in "/var/log/messages" that the NIC ordering is wrong: it changed after the VFs were created. nic4 should be mapped to ens2f0 and nic5 should be mapped to ens2f1 (see attached image).
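
To illustrate how creating VFs can shift position-based NIC IDs, here is a minimal Python sketch. It is a simplified model and not os-net-config's actual code: it assumes nicN is just the N-th interface in natural sort order. The on-board port eno1 and the VF device names (ens1f0v0, ens1f0v1) are hypothetical; only the ens1f*/ens2f* names and the expected nic4/nic5 mapping come from this report.

```python
import re


def natural_key(name):
    # Split "ens1f0" into ["ens", 1, "f", 0, ""] so numeric parts sort as numbers.
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


def number_nics(interfaces):
    # Assumed rule: nicN is simply the N-th interface in sorted order.
    ordered = sorted(interfaces, key=natural_key)
    return {f"nic{i}": name for i, name in enumerate(ordered, start=1)}


# Physical ports; eno1 is an assumed on-board port, ens* names are from the report.
physical = ["eno1", "ens1f0", "ens1f1", "ens2f0", "ens2f1"]
# Hypothetical device names that appear once VFs are created on ens1f0.
vfs = ["ens1f0v0", "ens1f0v1"]

before = number_nics(physical)
after = number_nics(physical + vfs)

print(before["nic4"], before["nic5"])  # ens2f0 ens2f1
print(after["nic4"], after["nic5"])    # ens1f0v1 ens1f1
```

In this model the two extra devices push ens2f0 and ens2f1 down to nic6 and nic7, so a template that refers to nic4 and nic5 ends up configuring the wrong devices.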

I then tried using NIC names (ens1f0, ens1f1, ens2f0, etc) instead of NIC IDs, and in that case the overcloud deployment finished successfully.
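
The workaround amounts to replacing each nicN alias in the network configuration with a fixed device name. A minimal Python sketch of that substitution follows, assuming the configuration is a list of entries with "type", "name" and optional "members" keys. The helper pin_nic_names, the bridge name br-link and the sample entries are hypothetical; the nic4/nic5 mapping is the one given in this report.

```python
def pin_nic_names(network_config, mapping):
    """Return a copy of network_config with nicN aliases replaced by device names."""
    pinned = []
    for entry in network_config:
        entry = dict(entry)  # shallow copy so the input is left untouched
        if entry.get("name") in mapping:
            entry["name"] = mapping[entry["name"]]
        if "members" in entry:
            # Bridges and bonds nest their interfaces under "members".
            entry["members"] = pin_nic_names(entry["members"], mapping)
        pinned.append(entry)
    return pinned


# Hypothetical sample configuration using NIC IDs.
config = [
    {"type": "ovs_bridge", "name": "br-link",
     "members": [{"type": "interface", "name": "nic4"}]},
    {"type": "interface", "name": "nic5", "use_dhcp": False},
]

# Mapping taken from the report: nic4 -> ens2f0, nic5 -> ens2f1.
pinned = pin_nic_names(config, {"nic4": "ens2f0", "nic5": "ens2f1"})
print(pinned)
```

Because the pinned entries name the devices directly, they do not depend on how many interfaces exist when the NIC IDs are resolved.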



Version-Release number of selected component (if applicable):
OSPD10 - 1 controller, two computes with SR-IOV enabled.


How reproducible:
Always

Steps to Reproduce:
1. Deploy OSPD with the attached yaml files.


Actual results:
deployment fails due to timeout

Expected results:
overcloud deploy should finish successfully


Additional info:
Compute hardware: HPE ProLiant DL380 Gen9 server with an HPE Ethernet 10Gb 2-port 560SFP SR-IOV NIC.

Comment 1 Ziv Greenberg 2016-12-29 13:01:08 UTC
Created attachment 1235875 [details]
image

Comment 2 Ziv Greenberg 2016-12-29 13:04:57 UTC
Created attachment 1235876 [details]
yaml files

Comment 3 Saravanan KR 2017-01-18 07:25:06 UTC
Review - https://review.openstack.org/#/c/415682/

Comment 7 Ziv Greenberg 2017-05-17 09:48:13 UTC
Hi,

It was verified.

Thanks,
Ziv

Comment 8 errata-xmlrpc 2017-05-17 19:53:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245