Bug 1290568 - local_interface=bond0.120 in undercloud.conf create broken network configuration
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: unspecified
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assigned To: RHOS Maint
QA Contact: Ofer Blaut
Keywords: Triaged
Depends On: 1283812
Blocks: 1261979
 
Reported: 2015-12-10 15:06 EST by Gonéri Le Bouder
Modified: 2016-04-07 17:43 EDT (History)

See Also:
Fixed In Version: os-net-config-0.1.6-1.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1283812
Environment:
Last Closed: 2016-04-07 17:43:39 EDT
Type: Bug


External Trackers
Tracker           ID      Priority  Status  Summary  Last Updated
OpenStack gerrit  248246  None      None    None     2016-02-25 13:50 EST

Description Gonéri Le Bouder 2015-12-10 15:06:22 EST
+++ This bug was initially created as a clone of Bug #1283812 +++

Description of problem:

My bond0.120 interface has a VLAN=120 line in the /etc/sysconfig/network-scripts/ifcfg-bond0.120 file. I can bring it up and down manually with ifup/ifdown.

If I set local_interface=bond0.120 in my undercloud.conf, instack-undercloud uses the config.json.template ( https://github.com/openstack/instack-undercloud/blob/master/elements/undercloud-stack-config/config.json.template ) to generate /etc/os-net-config/config.json as if bond0.120 were a standard interface.
os-net-config then runs to set up the br-ctlplane bridge. In doing so, it regenerates the /etc/sysconfig/network-scripts/ifcfg-bond0.120 file but omits the VLAN=120 line.

I'm not sure this can easily be fixed, but a notice or an error message would be really useful.


This is my undercloud.conf:
[DEFAULT]
local_ip = 192.168.120.1/24
local_interface=bond0.120
masquerade_network = 192.168.120.0/24
dhcp_start = 192.168.120.5
dhcp_end = 192.168.120.24
network_cidr = 192.168.120.0/24
network_gateway = 192.168.120.1
inspection_iprange = 192.168.120.100,192.168.120.120
[auth]

instack-undercloud will generate:
[root@fv2sah network-scripts]# cat /etc/os-net-config/config.json 
{"network_config": [{"type": "ovs_bridge", "ovs_extra": ["br-set-external-id br-ctlplane bridge-id br-ctlplane"], "name": "br-ctlplane", "members": [{"type": "interface", "name": "bond0.120", "primary": "true"}], "addresses": [{"ip_netmask": "192.168.120.1/24"}]}]}


The bridge member entry is wrong: bond0.120 is a VLAN device, not a plain interface.
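
The request above for "a notice or an error message" could be approximated by a small pre-flight check on the generated config.json. The following is only a sketch: the function name find_vlan_members and the rule that a name of the form <parent>.<number> denotes a VLAN device are assumptions made for illustration, not part of instack-undercloud or os-net-config.

```python
import json
import re

# Assumption: a name such as "bond0.120" (<parent>.<number>) denotes an 802.1q VLAN device.
VLAN_NAME = re.compile(r"^[^.]+\.\d{1,4}$")


def find_vlan_members(config_text):
    """Return (bridge, member) name pairs where a plain 'interface' member looks like a VLAN."""
    config = json.loads(config_text)
    suspects = []
    for obj in config.get("network_config", []):
        for member in obj.get("members", []):
            is_plain = member.get("type") == "interface"
            if is_plain and VLAN_NAME.match(member.get("name", "")):
                suspects.append((obj.get("name"), member["name"]))
    return suspects


# Trimmed version of the config.json shown in this report.
SAMPLE = (
    '{"network_config": [{"type": "ovs_bridge", "name": "br-ctlplane", '
    '"members": [{"type": "interface", "name": "bond0.120", "primary": "true"}]}]}'
)

for bridge, member in find_vlan_members(SAMPLE):
    print("warning: member %s of %s looks like a VLAN device" % (member, bridge))
```

Run against the configuration in this report, the check flags bond0.120 as a member of br-ctlplane; a member named bond0 or eth0 would pass silently.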


The regenerated /etc/sysconfig/network-scripts/ifcfg-bond0.120:
# This file is autogenerated by os-net-config
DEVICE=bond0.120
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ctlplane
BOOTPROTO=none

As a result, OVS ends up with a broken br-ctlplane:

[root@fv2sah stack]# ovs-vsctl show
1532e349-c0e3-47b9-839e-d84470094823
    Bridge br-ctlplane
        Port "bond0.120"
            Interface "bond0.120"
                error: "could not open network device bond0.120 (No such device)"
        Port br-ctlplane
            Interface br-ctlplane
                type: internal
    ovs_version: "2.4.0"
[root@fv2sah stack]# ifup bond0.120
ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device bond0.120 does not seem to be present, delaying initialization.
ovs-vsctl: Error detected while setting up 'bond0.120'.  See ovs-vswitchd log for details.


If I add the missing VLAN definition to the /etc/sysconfig/network-scripts/ifcfg-bond0.120 file:
VLAN=yes
I can now run ifup bond0.120, and OVS now sees the interface:

[root@fv2sah network-scripts]# ovs-vsctl show
1532e349-c0e3-47b9-839e-d84470094823
    Bridge br-ctlplane
        Port "bond0.120"
            Interface "bond0.120"
        Port br-ctlplane
            Interface br-ctlplane
                type: internal
    ovs_version: "2.4.0"
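
Since the failure is silent, a quick way to confirm whether a regenerated file has lost its VLAN flag is to parse the ifcfg file and look for VLAN=yes. This is a minimal sketch; parse_ifcfg and has_vlan_flag are hypothetical helper names, and the parser only handles simple KEY=VALUE lines.

```python
def parse_ifcfg(text):
    """Parse simple KEY=VALUE lines of an ifcfg file, skipping comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def has_vlan_flag(text):
    """True when the file declares VLAN=yes."""
    return parse_ifcfg(text).get("VLAN", "").lower() == "yes"


# Content of the regenerated file as quoted in this report.
REGENERATED = """# This file is autogenerated by os-net-config
DEVICE=bond0.120
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ctlplane
BOOTPROTO=none
"""

if not has_vlan_flag(REGENERATED):
    print("ifcfg-bond0.120 has no VLAN=yes line")
```

On a real system the text would be read from /etc/sysconfig/network-scripts/ifcfg-bond0.120 instead of the inline sample.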

--- Additional comment from Gonéri Le Bouder on 2015-11-19 21:13:14 EST ---

This is the correct os-net-config configuration:
{"network_config": [{"type": "ovs_bridge", "ovs_extra": ["br-set-external-id br-ctlplane bridge-id br-ctlplane"], "name": "br-ctlplane", "members": [{"type": "vlan", "name": "bond0", "vlan_id": 120}], "addresses": [{"ip_netmask": "192.168.120.1/24"}]}]}

--- Additional comment from Gonéri Le Bouder on 2015-11-20 13:58:42 EST ---

OK, it's more complicated than expected. The syntax above creates an internal OVS VLAN, which is not what we want. We just need VLAN=true in /etc/sysconfig/network-scripts/ifcfg-bond0.120, and that's something os-net-config cannot do (yet?).
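
To make the desired behaviour concrete, here is a rough sketch of an ifcfg renderer that keeps the OVS port settings and appends the VLAN flag when the device name looks like a VLAN. It is not the os-net-config implementation and not the upstream patch; render_ovs_port_ifcfg and the name-matching rule are assumptions, and the flag is written as VLAN=yes to match the manual workaround earlier in this report.

```python
import re

# Assumption: <parent>.<number> (for example bond0.120) names an 802.1q VLAN device.
VLAN_NAME = re.compile(r"^[^.]+\.\d{1,4}$")


def render_ovs_port_ifcfg(name, bridge):
    """Render ifcfg content for an OVS port, adding VLAN=yes for VLAN-style names."""
    lines = [
        "# Illustrative output only, not generated by os-net-config",
        "DEVICE=%s" % name,
        "ONBOOT=yes",
        "HOTPLUG=no",
        "NM_CONTROLLED=no",
        "DEVICETYPE=ovs",
        "TYPE=OVSPort",
        "OVS_BRIDGE=%s" % bridge,
        "BOOTPROTO=none",
    ]
    if VLAN_NAME.match(name):
        # Without this line the VLAN device is never created and OVS cannot open it.
        lines.append("VLAN=yes")
    return "\n".join(lines) + "\n"


print(render_ovs_port_ifcfg("bond0.120", "br-ctlplane"))
```

The point of the sketch is that the VLAN flag and the OVS port settings have to live in the same file; the bridge member itself stays a regular port rather than an internal OVS VLAN.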

--- Additional comment from Gonéri Le Bouder on 2015-11-20 16:01:53 EST ---

I've just submitted https://review.openstack.org/248246 upstream to fix the issue.
Comment 2 Gonéri Le Bouder 2016-01-18 11:28:17 EST
Hello,

It's really common to see 802.1q VLANs in use, and this bug makes the situation really painful. Users can spend days trying to understand why they get disconnected in the middle of a bunch of scripts. The problem is very subtle and there is absolutely no error message. Worse, if you don't have console access to the server, there is no way to get the connection back.
Comment 3 arkady kanevsky 2016-01-18 13:01:09 EST
Does this happen for bonding by OVS and for Linux bridging?
Comment 4 Gonéri Le Bouder 2016-01-18 13:29:50 EST
Hello Arkady,

Yes, if the VLAN is configured on a bonding, the issue remains.

This is not exactly a blocker for JetStream because you deploy the undercloud in a VM plugged directly into the correct VLAN.
Comment 6 Hugh Brock 2016-02-05 07:23:34 EST
Dan Prince, I just assigned this to you. There is an upstream fix merged; can you make sure it gets backported to stable so we can ship it? Thanks.
Comment 8 Gonéri Le Bouder 2016-03-21 18:09:49 EDT
I've just pushed a documentation update for this: https://review.openstack.org/295542
Comment 10 Ofer Blaut 2016-04-04 10:08:43 EDT
Tested code is included in os-net-config-0.2.2-1.el7ost.noarch
Comment 12 errata-xmlrpc 2016-04-07 17:43:39 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0604.html
