Bug 1243109 - Discovery fails if multiple interfaces on provisioning network
Summary: Discovery fails if multiple interfaces on provisioning network
Keywords:
Status: CLOSED DUPLICATE of bug 1411696
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 11.0 (Ocata)
Assignee: Dan Sneddon
QA Contact: Shai Revivo
URL:
Whiteboard:
Duplicates: 1244906
Depends On:
Blocks:
Reported: 2015-07-14 19:18 UTC by Dan Sneddon
Modified: 2019-08-15 04:52 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Discovery fails if multiple network interfaces on a node are connected to the Provisioning network. Only one interface can connect to the Provisioning network. This interface cannot be part of a bond.
Clone Of:
Environment:
Last Closed: 2017-12-19 17:42:06 UTC
Target Upstream Version:
Embargoed:



Description Dan Sneddon 2015-07-14 19:18:56 UTC
Description of problem:
If multiple interfaces are attached to the ctlplane, discovery times out randomly.

Version-Release number of selected component (if applicable):
All versions as of 2015-07-14

How reproducible:
100%

Steps to Reproduce:
1. Put both eth0 and eth1 on provisioning net
2. Add MAC address from eth0 to instackenv.json
3. Launch discovery

Actual results:
Both interfaces will request DHCP addresses, and the undercloud responds to both interfaces with an IP address. The interface that the system chooses to communicate with the ctlplane may not be the one we want (the one with the MAC address in instackenv.json). Since the traffic to the undercloud is coming from a different IP address than expected, discovery times out.
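The condition described above can be checked for before launching discovery. The following is a minimal sketch, not part of the director tooling. It assumes instackenv.json has a top-level "nodes" list whose entries carry a "mac" list, and that the MAC addresses attached to the provisioning network have been collected by hand. The function name and the sample addresses are hypothetical.

```python
def unregistered_macs(instackenv, provisioning_macs):
    """Return MACs seen on the provisioning network that are not registered.

    instackenv: dict parsed from instackenv.json (assumed layout:
                {"nodes": [{"mac": ["aa:bb:..."], ...}, ...]}).
    provisioning_macs: iterable of MAC strings attached to the
                       provisioning network (gathered manually).
    """
    # Normalise case so "52:54:00:AA:00:01" matches "52:54:00:aa:00:01".
    registered = {
        mac.lower()
        for node in instackenv.get("nodes", [])
        for mac in node.get("mac", [])
    }
    seen = {mac.lower() for mac in provisioning_macs}
    # Anything left over can still receive a DHCP lease from the undercloud
    # and may be the interface the node ends up using.
    return sorted(seen - registered)


if __name__ == "__main__":
    # Hypothetical sample data: eth0 is registered, eth1 is not.
    env = {"nodes": [{"mac": ["52:54:00:aa:00:01"]}]}
    on_wire = ["52:54:00:aa:00:01", "52:54:00:aa:00:02"]
    for mac in unregistered_macs(env, on_wire):
        print("not in instackenv.json:", mac)
```

Any address this reports belongs to an interface that could win the DHCP exchange instead of the one listed in instackenv.json.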

Expected results:
Discovery should work, even if multiple interfaces are attached to the provisioning network. This will allow situations where we provision on one interface, then place that interface into a bond. It will also help with virt testing, since we can't currently use multiple interfaces and bonding in virt environments.

Additional info:
If we could somehow modify the dnsmasq that ironic-discoverd uses to only respond to known MAC addresses (from instackenv.json), I think that would clear up this behavior. I'm not sure if that is feasible.
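As a rough illustration of that idea, the sketch below turns the MAC addresses in instackenv.json into dnsmasq options. It relies on two options from the dnsmasq documentation: dhcp-host, which marks a hardware address as known, and dhcp-ignore=tag:!known, which tells dnsmasq not to answer unknown clients. Whether the dnsmasq instance started by ironic-discoverd can be given extra options this way is an assumption and has not been tested here. The function name and the instackenv.json layout (a "nodes" list with a "mac" list per node) are also assumptions.

```python
def dnsmasq_allowlist(instackenv):
    """Build dnsmasq option lines that only answer registered MACs.

    instackenv: dict parsed from instackenv.json (assumed layout:
                {"nodes": [{"mac": ["aa:bb:..."], ...}, ...]}).
    Returns a list of strings, one dnsmasq option per entry.
    """
    # Collect every registered MAC once, lower-cased and in stable order.
    macs = sorted(
        {
            mac.lower()
            for node in instackenv.get("nodes", [])
            for mac in node.get("mac", [])
        }
    )
    # A dhcp-host entry makes the client "known" to dnsmasq.
    lines = ["dhcp-host={}".format(mac) for mac in macs]
    # Ignore DHCP requests from any client that is not "known".
    lines.append("dhcp-ignore=tag:!known")
    return lines


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real instackenv.json.
    env = {"nodes": [{"mac": ["52:54:00:aa:00:01"]}]}
    print("\n".join(dnsmasq_allowlist(env)))
```

With such a configuration, the second interface (eth1 in the reproduction steps) would not receive a lease, so the node could only reach the undercloud through the registered interface.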

Comment 4 Dan Sneddon 2015-07-20 18:35:48 UTC
Note that this bug affects virt environments (meaning that we can't test bonding in virt), but it also affects BM environments.

If you have a system with only two 10Gb NICs, you should be able to provision on one interface, then bond both interfaces together and add the other networks as VLANs. Unfortunately, discovery fails in this configuration, because we can't guarantee the system will use the same interface that it booted from (and Ironic expects this).

So we need to fix this bug for baremetal as well as virt, because it widens the range of supported hardware in an HA environment (right now we require at least 3 NICs + IPMI interface).

Comment 5 Mike Burns 2015-07-21 11:58:41 UTC
*** Bug 1244906 has been marked as a duplicate of this bug. ***

Comment 6 Mike Burns 2016-04-07 20:43:53 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 10 Dmitry Tantsur 2016-10-03 08:55:43 UTC
Dan, can you still reproduce this bug? I assumed it was related to the old iPXE ROM. Now that we're shipping the new iPXE ROM from Jan 2016 in OSPd >= 8, this bug may be gone.

Comment 11 Dan Sneddon 2016-10-04 17:29:29 UTC
(In reply to Dmitry Tantsur from comment #10)
> Dan, can you still reproduce this bug? I assumed it was related to the old
> iPXE ROM. Now that we're shipping the new iPXE ROM from Jan 2016 in OSPd >=
> 8, this bug may be gone.

My environment has mitigations in place to ensure that I don't hit this bug. I'll remove those mitigations and test to make sure I am no longer seeing this behavior and will comment here with results.

Comment 12 Dmitry Tantsur 2016-10-17 09:27:56 UTC
If the issue is still there, it's unlikely to be an easy fix. And we have a workaround (meh) in place. So pushing it out of Newton, but let's keep track of it in Ocata for real.

Comment 16 Dmitry Tantsur 2016-10-20 10:02:15 UTC
I have a gut feeling that this is actually related: https://bugs.launchpad.net/puppet-ironic/+bug/1635191. Dan, what do you think?

Comment 17 Dan Sneddon 2017-04-06 01:29:33 UTC
@dtantsur I think it is likely that this bug is related to the Ironic bug. I think we need to retest, since I think it is possible that this issue has been cleared up.

Comment 22 Bob Fournier 2017-12-19 17:42:06 UTC

*** This bug has been marked as a duplicate of bug 1411696 ***

