Bug 1318816

Summary: Sometimes, ixgbe nics will take up to 40 seconds before they are all up and os-net-config will run before all nics are up
Product: Red Hat OpenStack Reporter: David Hill <dhill>
Component: os-net-config Assignee: RHOS Maint <rhos-maint>
Status: CLOSED NOTABUG QA Contact: Shai Revivo <srevivo>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.0 (Kilo) CC: dhill, dsneddon, dtantsur, hbrock, jcoufal, jslagle, mburns, rhel-osp-director-maint, skinjo
Target Milestone: --- Keywords: Reopened, ZStream
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-10-09 21:16:53 UTC Type: Bug

Description David Hill 2016-03-17 22:50:01 UTC
Description of problem:
Sometimes ixgbe NICs take up to 40 seconds before they are all up. When os-net-config runs before every NIC is up, it fails noticeably when trying to map nicX to pXpX.

Version-Release number of selected component (if applicable):


How reproducible:
Random

Steps to Reproduce:
1. Deploy an overcloud using more than 2 ixgbe nics

Actual results:
Fails at network configuration

Expected results:
succeeds

Additional info:
Patch provided by the customer:
--- 20-os-net-config.orig	2016-03-17 18:47:05.634188672 -0400
+++ 20-os-net-config	2016-03-17 18:47:19.111195755 -0400
@@ -51,6 +51,7 @@
 NET_CONFIG=$(os-apply-config --key os_net_config --type raw --key-default '')
 
 if [ -n "$NET_CONFIG" ]; then
+    if [[ $(awk '{printf "%d",$1}' /proc/uptime) -lt 600 ]];then echo "nic delay fix, wait for 60 sec";sleep 60; fi 
     os-net-config -c /etc/os-net-config/config.json -v --detailed-exit-codes
     RETVAL=$?
     if [[ $RETVAL == 2 ]]; then
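
The customer patch above sleeps for a fixed 60 seconds whenever uptime is under 600 seconds. As a rough alternative sketch (not part of the customer patch or of os-net-config), the wait could instead poll sysfs until enough physical NICs report link, and stop early once they do. The function name wait_for_nics, the SYSFS_NET override, and the expected-count argument are all hypothetical and exist only for illustration; the sketch assumes that physical NICs expose a "device" entry and a "carrier" file under /sys/class/net.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): wait until at least EXPECTED
# physical NICs report carrier, or give up after TIMEOUT seconds.
# SYSFS_NET is an assumed override so the function can be tried without hardware.
wait_for_nics() {
    local expected="$1"          # number of NICs that must have link
    local timeout="${2:-60}"     # maximum wait in seconds
    local sysfs="${SYSFS_NET:-/sys/class/net}"
    local deadline=$(( SECONDS + timeout ))
    local dev up

    while :; do
        up=0
        for dev in "$sysfs"/*; do
            # Skip lo and other virtual interfaces, which have no device entry.
            [ -e "$dev/device" ] || continue
            # carrier contains 1 when link is detected; the read fails
            # while the interface is administratively down.
            if [ "$(cat "$dev/carrier" 2>/dev/null)" = "1" ]; then
                up=$(( up + 1 ))
            fi
        done
        [ "$up" -ge "$expected" ] && return 0
        [ "$SECONDS" -ge "$deadline" ] && return 1
        sleep 1
    done
}

# Possible use before the os-net-config call (the count of 4 is an assumption):
#   wait_for_nics 4 60 || echo "timed out waiting for NICs" >&2
```

Note that carrier only shows that the link is physically up. It does not show that the switch port is already forwarding traffic, so this sketch would not hide a delay caused by the switch itself.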

Comment 2 Mike Burns 2016-04-07 21:14:44 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 4 Dan Sneddon 2016-10-14 16:09:05 UTC
(In reply to David Hill from comment #0)
> Description of problem:
> Sometimes, ixgbe nics will take up to 40 seconds before they are all up and
> os-net-config will run before all nics are up and will fail noticeably when
> trying to map nicX to pXpX.

I strongly suspect that this is due to switch configuration. If spanning tree (STP) is enabled and the host-facing ports are not configured to go straight to the forwarding state, then it can take up to a minute before STP will start forwarding traffic on the port.

The customer should have better results if they configure "portfast" or "edge port" or "host port" on the ports attached directly to servers (name will vary depending on switch vendor).

I also don't believe that this bug would affect newer versions of OSP-Director.

Since this is an old ticket, I'm going to close it for now. Please feel free to reopen if the root cause analysis is incorrect.