Bug 1732887 - TripleO net config fails to set control plane network on the proper NIC [NEEDINFO]
Summary: TripleO net config fails to set control plane network on the proper NIC
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: James Slagle
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-24 15:38 UTC by Yogev Rabl
Modified: 2019-09-10 17:31 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-10 17:31:34 UTC
Target Upstream Version:
bfournie: needinfo? (yrabl)


Attachments:
Compute node nic topology (8.57 KB, text/plain)
2019-07-24 15:38 UTC, Yogev Rabl
compute node nic configuration file (9.90 KB, text/plain)
2019-07-24 15:38 UTC, Yogev Rabl

Description Yogev Rabl 2019-07-24 15:38:08 UTC
Created attachment 1593179 [details]
Compute node nic topology

Description of problem:

An overcloud deployment fails when TripleO sets the networks on the wrong NICs on the compute node.
The compute node has multiple connected NICs at its disposal (topology attached). TripleO keeps setting the control plane on the first NIC in the list, which breaks the network connection between the compute node and the undercloud.
Setting the BIOS device names of the NICs in the compute NIC configuration file (attached) gives the same result.

The topology of the overcloud is: 1 controller & 1 compute.
The deployment command is:
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server 10.35.255.6 \
-e /home/stack/virt/config_lvm.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/network/dvr-override.yaml \
-e /home/stack/virt/inject-trust-anchor.yaml \
-e /home/stack/virt/hostnames.yml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e ~/containers-prepare-parameter.yaml \
--log-file overcloud_deployment_100.log

Version-Release number of selected component (if applicable):
Container image tag: 20190715.1
puppet-tripleo-10.4.2-0.20190701160408.ecbec17.el8ost.noarch
openstack-tripleo-puppet-elements-10.3.1-0.20190614132452.79c0c76.el8ost.noarch
python3-tripleoclient-heat-installer-11.4.1-0.20190705110410.14ae053.el8ost.noarch
python3-tripleo-common-10.8.1-0.20190710191707.b6a2d65.el8ost.noarch
python3-tripleoclient-11.4.1-0.20190705110410.14ae053.el8ost.noarch
openstack-tripleo-common-10.8.1-0.20190710191707.b6a2d65.el8ost.noarch
openstack-tripleo-heat-templates-10.6.1-0.20190713150434.2871ce0.el8ost.noarch
openstack-tripleo-common-containers-10.8.1-0.20190710191707.b6a2d65.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Set the compute.yaml file with a specific NIC configuration
2. Deploy the overcloud

Actual results:
Deployment fails.

Expected results:
Deployment is successful; TripleO sets the NICs according to the provided configuration.

Additional info:

Comment 1 Yogev Rabl 2019-07-24 15:38:45 UTC
Created attachment 1593180 [details]
compute node nic configuration file

Comment 2 Bob Fournier 2019-07-24 17:37:56 UTC
It's not clear what you are seeing and what the problem is. Is p7p1 not being used for the NIC on the control plane? If not, which NIC is being used?

Please provide an sosreport when the problem occurs.

Please provide the files used in the deployment, specifically /home/stack/virt/network/network-environment.yaml.

If possible, please provide /etc/os-net-config/config.json on the compute node.
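
For anyone collecting that file, here is a minimal Python sketch of what to look for in it. It assumes the usual os-net-config layout: a top-level "network_config" list whose entries have "type", "name", optional "addresses" (each with "ip_netmask"), optional "use_dhcp", and nested "members" for bridges and bonds. The function name and the sample data are hypothetical and are not taken from the affected node.

```python
import json


def entries_with_addresses(config):
    """Collect (name, type, addresses) for every entry that has an IP or uses DHCP."""
    found = []

    def walk(entries):
        for entry in entries:
            addresses = [a.get("ip_netmask") for a in entry.get("addresses", [])]
            if addresses or entry.get("use_dhcp"):
                found.append((entry.get("name"), entry.get("type"), addresses))
            # Bridges and bonds nest their ports under "members" (assumed layout).
            walk(entry.get("members", []))

    walk(config.get("network_config", []))
    return found


# Hypothetical sample; on a real node, load /etc/os-net-config/config.json instead.
sample = json.loads("""
{
  "network_config": [
    {"type": "interface", "name": "p7p1", "use_dhcp": false,
     "addresses": [{"ip_netmask": "192.0.2.10/24"}]},
    {"type": "ovs_bridge", "name": "br-ex",
     "members": [{"type": "interface", "name": "em3"}]}
  ]
}
""")

for name, kind, addresses in entries_with_addresses(sample):
    print(name, kind, addresses or "dhcp")
```

If the entry carrying the control plane address is not the expected NIC, that would suggest the NIC config assigned to the role differs from the attached one.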

Are you able to successfully introspect this node? If so, please provide the output of "openstack baremetal introspection data save <node>" so we can see the NICs available on the node.
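
As a rough aid for reading the saved introspection data, the Python sketch below lists the NICs it reports. It assumes the JSON has an "inventory" object with an "interfaces" list, and that each interface has "name", "mac_address", "has_carrier" and "ipv4_address" keys. The function name and sample values are hypothetical.

```python
def list_nics(introspection):
    """Return (name, mac, has_carrier, ipv4) tuples sorted by interface name."""
    interfaces = introspection.get("inventory", {}).get("interfaces", [])
    rows = [
        (
            nic.get("name"),
            nic.get("mac_address"),
            nic.get("has_carrier"),
            nic.get("ipv4_address"),
        )
        for nic in interfaces
    ]
    # Sort by name so the listing is stable across runs.
    return sorted(rows, key=lambda row: row[0] or "")


# Hypothetical sample; real data would come from json.load() on the saved file.
sample_introspection = {
    "inventory": {
        "interfaces": [
            {"name": "p7p1", "mac_address": "52:54:00:00:00:02",
             "has_carrier": True, "ipv4_address": "192.0.2.20"},
            {"name": "em3", "mac_address": "52:54:00:00:00:01",
             "has_carrier": True, "ipv4_address": None},
        ]
    }
}

for row in list_nics(sample_introspection):
    print(row)
```

Comparing the interfaces that report a carrier against the names used in the NIC config can help show whether the expected NIC is even a candidate.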

Comment 3 Yogev Rabl 2019-07-24 18:55:50 UTC
(In reply to Bob Fournier from comment #2)
> It's not clear what you are seeing and what the problem is.  Is p7p1 not
> being used for the nic on the control plane? If not, which nic is being used?
> 
> Please provide an sosreport when the problem occurs.
> 
> Please provide the files used in the deployment, specifically
> /home/stack/virt/network/network-environment.yaml.
> 
> If possible, please provide /etc/os-net-config/config.json on the compute
> node.
> 
> Are you able to successfully introspect this node?  If so, please provide the
> output of "openstack baremetal introspection data save <node>" so we can see
> the nics available on the node.

The NIC that is being used is em3. It is not easy to get /etc/os-net-config/config.json from the compute node because we have no way to reach it.

I was able to introspect the node successfully, and the first attached file is what you asked for.

Comment 4 Dan Sneddon 2019-07-26 18:04:13 UTC
(In reply to Yogev Rabl from comment #3)
> The NIC that is being used is em3. It is not easy to get
> /etc/os-net-config/config.json from the compute node because we have no way
> to reach it.
> 
> I was able to introspect the node successfully, and the first attached file
> is what you asked for.

I do not see any reason why em3 would be given an IP address via os-net-config if the NIC config provided is actually what is being assigned to the role.

Can you please upload a copy of the /home/stack/virt/network/network-environment.yaml file and /home/stack/virt/nodes_data.yaml file?

Comment 5 Bob Fournier 2019-08-21 19:10:54 UTC
Yogev - can we get the info Dan requested in Comment 4?

Comment 6 Bob Fournier 2019-09-10 17:31:34 UTC
Closing this for now. Please reopen with the requested info if it occurs again.

