+++ This bug was initially created as a clone of Bug #1283812 +++

Description of problem:

My bond0.120 interface has a VLAN=120 line in the /etc/sysconfig/network-scripts/ifcfg-bond0.120 file. I can ifup/down it manually.

If I declare the line local_interface=bond0.120 in my undercloud.conf, instack-undercloud will use the config.json.template ( https://github.com/openstack/instack-undercloud/blob/master/elements/undercloud-stack-config/config.json.template ) to generate the /etc/os-net-config/config.json as if it was a standard interface.

os-net-config will come later to prepare the br-ctlplane file. To do so, it will regenerate the /etc/sysconfig/network-scripts/ifcfg-bond0.120 file but will miss the VLAN=120 line.

I'm not sure this can easily be fixed, but a notice or an error message would be really useful.

This is my undercloud.conf:

[DEFAULT]
local_ip = 192.168.120.1/24
local_interface=bond0.120
masquerade_network = 192.168.120.0/24
dhcp_start = 192.168.120.5
dhcp_end = 192.168.120.24
network_cidr = 192.168.120.0/24
network_gateway = 192.168.120.1
inspection_iprange = 192.168.120.100,192.168.120.120
[auth]

instack-undercloud will generate:

[root@fv2sah network-scripts]# cat /etc/os-net-config/config.json
{"network_config": [{"type": "ovs_bridge", "ovs_extra": ["br-set-external-id br-ctlplane bridge-id br-ctlplane"], "name": "br-ctlplane", "members": [{"type": "interface", "name": "bond0.120", "primary": "true"}], "addresses": [{"ip_netmask": "192.168.120.1/24"}]}]}

The bridge member entry is wrong. bond0.120 is a vlan.
The regenerated ifcfg-bond0.120:

/etc/sysconfig/network-scripts/ifcfg-bond0.120
# This file is autogenerated by os-net-config
DEVICE=bond0.120
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ctlplane
BOOTPROTO=none

As a result OVS will generate a bogus br-ctlplane:

[root@fv2sah stack]# ovs-vsctl show
1532e349-c0e3-47b9-839e-d84470094823
    Bridge br-ctlplane
        Port "bond0.120"
            Interface "bond0.120"
                error: "could not open network device bond0.120 (No such device)"
        Port br-ctlplane
            Interface br-ctlplane
                type: internal
    ovs_version: "2.4.0"

[root@fv2sah stack]# ifup bond0.120
ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device bond0.120 does not seem to be present, delaying initialization.
ovs-vsctl: Error detected while setting up 'bond0.120'. See ovs-vswitchd log for details.

If I add the missing VLAN definition to the /etc/sysconfig/network-scripts/ifcfg-bond0.120 file:

VLAN=yes

I can now do an ifup bond0.120 and OVS now sees the interface:

[root@fv2sah network-scripts]# ovs-vsctl show
1532e349-c0e3-47b9-839e-d84470094823
    Bridge br-ctlplane
        Port "bond0.120"
            Interface "bond0.120"
        Port br-ctlplane
            Interface br-ctlplane
                type: internal
    ovs_version: "2.4.0"

--- Additional comment from Gonéri Le Bouder on 2015-11-19 21:13:14 EST ---

This is the correct os-net-config configuration:

{"network_config": [{"type": "ovs_bridge", "ovs_extra": ["br-set-external-id br-ctlplane bridge-id br-ctlplane"], "name": "br-ctlplane", "members": [{"type": "vlan", "name": "bond0", "vlan_id": 120}], "addresses": [{"ip_netmask": "192.168.120.1/24"}]}]}

--- Additional comment from Gonéri Le Bouder on 2015-11-20 13:58:42 EST ---

OK, it's more complicated than expected. The syntax above allows the creation of an internal OVS VLAN. That's not what we want. We just need the VLAN=true in /etc/sysconfig/network-scripts/ifcfg-bond0.120 and that's something os-net-config cannot do (yet?).
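For illustration only, here is a minimal Python sketch of the behaviour described above: emit the same OVS port ifcfg, plus the VLAN line when the device name looks like a kernel VLAN. This is not os-net-config code and not the upstream patch. The function name render_ovs_port_ifcfg and the name-matching rule (<parent>.<id>) are assumptions made for the example, and it uses the VLAN=yes spelling from the manual test in the description.

```python
import re

# Assumption: a name of the form <parent>.<id> denotes a kernel 802.1q VLAN device.
VLAN_NAME = re.compile(r"^(?P<parent>[^.]+)\.(?P<vlan_id>\d+)$")


def render_ovs_port_ifcfg(name, bridge):
    """Return ifcfg text for an OVS port; add VLAN=yes for VLAN-style names.

    Hypothetical helper, written only to illustrate the missing line.
    """
    lines = [
        "# Illustrative output, not generated by os-net-config",
        "DEVICE=%s" % name,
        "ONBOOT=yes",
        "HOTPLUG=no",
        "NM_CONTROLLED=no",
        "DEVICETYPE=ovs",
        "TYPE=OVSPort",
        "OVS_BRIDGE=%s" % bridge,
        "BOOTPROTO=none",
    ]
    if VLAN_NAME.match(name):
        # The line whose absence makes ifup report "No such device".
        lines.append("VLAN=yes")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_ovs_port_ifcfg("bond0.120", "br-ctlplane"))
```

With a plain interface name such as eth1 the sketch produces the file unchanged, so only VLAN-style members get the extra line.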
--- Additional comment from Gonéri Le Bouder on 2015-11-20 16:01:53 EST ---

I've just submitted https://review.openstack.org/248246 upstream to fix the issue.
Hello, It's really common to see 802.1q VLANs in use, and this bug makes the situation really painful. Users can spend days trying to understand why they get disconnected in the middle of a bunch of scripts. The problem is very subtle and there is absolutely no error message. Worse, if you don't have console access to the server, there is no way to get the connection back.
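As a rough idea of the notice asked for in the description, the following Python sketch checks a local_interface value before any network configuration is rewritten. It is an assumption-laden example, not code from instack-undercloud: the function name vlan_warning, the message text, and the <parent>.<id> naming rule are all hypothetical.

```python
import re

# Assumption: VLAN devices are named <parent>.<id>, e.g. bond0.120.
VLAN_NAME = re.compile(r"^(?P<parent>[^.]+)\.(?P<vlan_id>\d+)$")


def vlan_warning(local_interface):
    """Return a warning string if the name looks like a VLAN, else None."""
    match = VLAN_NAME.match(local_interface)
    if match is None:
        return None
    vlan_id = int(match.group("vlan_id"))
    # 802.1q VLAN IDs usable for traffic are 1 through 4094.
    if not 1 <= vlan_id <= 4094:
        return None
    return (
        "local_interface=%s looks like VLAN %d on %s; the regenerated "
        "ifcfg file may lose its VLAN line and the link may drop"
        % (local_interface, vlan_id, match.group("parent"))
    )


if __name__ == "__main__":
    for name in ("bond0.120", "eth1"):
        message = vlan_warning(name)
        if message:
            print("WARNING: " + message)
```

Printing the warning before the interface is reconfigured matters here, because once the link drops a remote user can no longer see any output.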
Does this happen for bonding by OVS and for Linux bridging?
Hello Arkady, Yes, if the VLAN is configured on a bond, the issue remains. This is not exactly a blocker for JetStream because you deploy the undercloud in a VM directly plugged into the correct VLAN.
Dan Prince, just assigned this to you. There is an upstream fix merged, can you just make sure it gets backported to stable so we can ship it? Thanks.
I've just pushed a documentation update for this: https://review.openstack.org/295542
Tested code is included in os-net-config-0.2.2-1.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-0604.html