Bug 1320217

Summary: OVS port bindings randomly fail
Product: [Community] RDO Reporter: Alvaro Aleman <alv2412>
Component: openstack-neutron Assignee: lpeer <lpeer>
Status: CLOSED NOTABUG QA Contact: Ofer Blaut <oblaut>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: Liberty CC: chrisw, yeylon
Target Milestone: ---   
Target Release: Kilo   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-07 10:14:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description                 Flags
ifcfg-ovs_bond0             none
ifcfg-enp2s0f1              none
ifcfg-enp2s0f0              none
ifcfg-br-ex                 none
neutron server.log          none
neutron l3-agent.log        none
neutron openvswitch-agent   none
ovsdb-server.log            none
ovs-vswitchd.log            none

Description Alvaro Aleman 2016-03-22 15:05:41 UTC
Created attachment 1139090 [details]
ifcfg-ovs_bond0

Description of problem:

After adding an LACP OVS bond to br-ex as the physical interface, network creation sometimes works fine and sometimes leaves the ports in 'binding_failed' state.

We have a Jenkins job that rebuilds the environment using PXE for OS installation, Packstack for OpenStack installation, and some Ansible code for post-Packstack configuration, including creation of the provider network, router, and private network. With the same commit of the automation code, this works fine on one day and leaves the Neutron ports in binding_failed state on another.

We started seeing this error when we began using an LACP OVS bond as the port to the physical network on br-ex. The bond itself works: it can be used for SSH access to the node.
The Neutron Open vSwitch agent reports only:

Error received from [ovsdb-client monitor Interface name,ofport,external_ids --format=json]: None
Process [ovsdb-client monitor Interface name,ofport,external_ids --format=json] dies due to the error: None
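
To compare good and bad runs, it may help to count how often the agent logged that its ovsdb-client monitor process died. The following is a minimal sketch, not part of the original report: the function name count_monitor_errors is hypothetical, and the default log path is an assumption about a typical RDO install, so pass the real file as the first argument.

```shell
#!/bin/sh
# Count "ovsdb-client monitor ... dies" messages in a Neutron OVS agent log.
# count_monitor_errors is a hypothetical helper name.
# The default log path is an assumption; override it with $1.
count_monitor_errors() {
    log_file="${1:-/var/log/neutron/openvswitch-agent.log}"
    # grep -c prints the number of matching lines. It exits with status 1
    # when nothing matches, so "|| true" keeps the function from failing.
    grep -c 'Process \[ovsdb-client monitor .*] dies due to the error' "$log_file" || true
}
```

Running count_monitor_errors against the agent log of a failed build and of a successful build would show whether the monitor failure correlates with the binding_failed ports or also appears on good days.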

Attached are the interface configuration files and logs. I will gladly provide any further info/config/logs that may help.


Version-Release number of selected component (if applicable):

yum list installed|egrep 'openvswitch|neutron'
openstack-neutron.noarch             1:7.0.1-1.el7            @openstack-liberty
openstack-neutron-common.noarch      1:7.0.1-1.el7            @openstack-liberty
openstack-neutron-ml2.noarch         1:7.0.1-1.el7            @openstack-liberty
openstack-neutron-openvswitch.noarch 1:7.0.1-1.el7            @openstack-liberty
openvswitch.x86_64                   2.4.0-1.el7              @openstack-liberty
python-neutron.noarch                1:7.0.1-1.el7            @openstack-liberty
python-neutronclient.noarch          3.1.0-1.el7              @openstack-liberty
python-openvswitch.noarch            2.4.0-1.el7              @openstack-liberty

Comment 1 Alvaro Aleman 2016-03-22 15:06:42 UTC
Created attachment 1139091 [details]
ifcfg-enp2s0f1

Comment 2 Alvaro Aleman 2016-03-22 15:07:14 UTC
Created attachment 1139092 [details]
ifcfg-enp2s0f0

Comment 3 Alvaro Aleman 2016-03-22 15:07:36 UTC
Created attachment 1139093 [details]
ifcfg-br-ex

Comment 4 Alvaro Aleman 2016-03-22 15:08:09 UTC
Created attachment 1139094 [details]
neutron server.log

Comment 5 Alvaro Aleman 2016-03-22 15:08:32 UTC
Created attachment 1139095 [details]
neutron l3-agent.log

Comment 6 Alvaro Aleman 2016-03-22 15:09:19 UTC
Created attachment 1139096 [details]
neutron openvswitch-agent

Comment 7 Alvaro Aleman 2016-03-22 15:09:36 UTC
Created attachment 1139097 [details]
ovsdb-server.log

Comment 8 Alvaro Aleman 2016-03-22 15:09:58 UTC
Created attachment 1139098 [details]
ovs-vswitchd.log

Comment 9 Alvaro Aleman 2016-04-07 10:13:35 UTC
Creating an LACP bond before the Packstack run and passing that bond to Packstack via --os-neutron-ovs-bridge-interfaces=br-ex:bond0 made the issue disappear.
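
For readers who want to try the same workaround, a pre-created bond could look roughly like the fragment below. This is a sketch under assumptions, not the configuration actually used: it assumes bond0 is a Linux kernel bond defined through RHEL 7 network-scripts, that the member NICs are the enp2s0f0 and enp2s0f1 interfaces named in the attachments, and that miimon=100 is an acceptable link-monitoring interval.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0 (assumed contents)
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
# mode=802.3ad selects LACP; miimon=100 is an assumed polling interval
BONDING_OPTS="mode=802.3ad miimon=100"
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-enp2s0f0 (assumed contents;
# repeat with DEVICE=enp2s0f1 for the second member)
DEVICE=enp2s0f0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

With the bond up before installation, the br-ex:bond0 mapping is then handed to Packstack through the option quoted above, so Packstack attaches an existing interface to br-ex rather than an OVS-level bond being added afterwards.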