Bug 1316768

Summary: During port creation, tagging of OVS interfaces is intermittently delayed
Product: Red Hat OpenStack Reporter: Matt Flusche <mflusche>
Component: openstack-neutronAssignee: Terry Wilson <twilson>
Status: CLOSED ERRATA QA Contact: Alexander Stafeyev <astafeye>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.0 (Kilo)CC: amuller, chrisw, dmaley, mflusche, nlevinki, nyechiel, scorcora, tfreger, twilson, yeylon
Target Milestone: asyncKeywords: ZStream
Target Release: 7.0 (Kilo)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: openstack-neutron-2015.1.2-11.el7ost Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-03-23 14:26:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Matt Flusche 2016-03-11 03:16:02 UTC
Description of problem:
- This issue occurs for only a small portion of the neutron ports.
- When this occurs, as new neutron networks and routers are created there is initially no network connectivity.  
- Looking at the OVS port, the interface tag is missing and it's configuration can be delayed over an hour.

Example:
# ovs-vsctl show | grep -A2 qr-
        Port "qr-1ab96b12-fd"
            Interface "qr-1ab96b12-fd"
                type: internal

- System has a large number of OVS ports (2500+)
- System has a large number of neutron namespaces (1500+ qrouter, qdhcp, qlbaas)
- At the same time this issue occurs the following error is seen openvswitch-agent.log:   TRACE neutron.agent.ovsdb.impl_vsctl Stderr: ovs-vsctl: no row "ha-1a2b3cd4-5e" in table Port

Version-Release number of selected component (if applicable):
openstack-neutron-2015.1.2-9.el7ost.noarch
openstack-neutron-common-2015.1.2-9.el7ost.noarch
openstack-neutron-lbaas-2015.1.2-1.el7ost.noarch
openstack-neutron-ml2-2015.1.2-9.el7ost.noarch
openstack-neutron-openvswitch-2015.1.2-9.el7ost.noarch
openvswitch-2.4.0-1.el7.x86_64
python-neutron-2015.1.2-9.el7ost.noarch
python-neutron-lbaas-2015.1.2-1.el7ost.noarch
python-neutronclient-2.4.0-2.el7ost.noarch
python-openvswitch-2.4.0-1.el7.noarch


How reproducible:
Intermittently on this specific environment


Steps to Reproduce:
1. Unknown
2.
3.

Actual results:
The new neutron networking does not work initially.

Expected results:
Business as usual

Additional info:

To isolate the issue I used a script to process the debug logging for openvswitch-agent.log and show the time when a port is created and when a port 
is tagged.  Below we see that when the issue occurs, tagging is blocked and then all processed at once when the "issue" clears; however, this can be delayed over an hour.


interface      Port Create Time        Port Tag Time
=========      ================        ================
qg-11d2d676-a1 2016-03-10 09:25:08.013 2016-03-10 09:25:15.334
qr-02767e6d-62 2016-03-10 09:25:33.440 2016-03-10 09:25:41.155 <= normal
ha-fa5129f6-a3 2016-03-10 09:32:27.396 2016-03-10 10:41:04.535 <= 1st delay
tap36224cb3-e4 2016-03-10 09:32:32.208 2016-03-10 10:41:08.130
qg-d4f68898-06 2016-03-10 09:32:52.124 2016-03-10 10:41:16.707
qr-0bf27c0c-16 2016-03-10 09:33:16.374 2016-03-10 10:41:10.189
qg-a879651a-77 2016-03-10 09:50:44.804
qg-ebb78d41-38 2016-03-10 09:51:06.242
qg-916985d9-d9 2016-03-10 10:18:34.722 2016-03-10 10:41:00.824
qg-678518a2-0a 2016-03-10 10:22:57.354 2016-03-10 10:41:14.153
tap789355d3-eb 2016-03-10 10:42:10.945 2016-03-10 10:42:22.376 <= back to normal
ha-14342953-ac 2016-03-10 10:42:20.288 2016-03-10 10:42:31.357

The following trace is seen in the debug log.  This same trace has occurred prior to two different instances of this issue that I  have reviewed:


016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl Traceback (most recent call last):
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_vsctl.py", line 63, in run_vsctl
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl     log_fail_as_error=False).rstrip()
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 137, in execute
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl     raise RuntimeError(m)
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl RuntimeError: 
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl Command: ['ovs-vsctl', '--timeout=20', '--oneline', '--format=json', '--', '--columns=name,tag', 'list', 'Port', u'ha-010a21e6-ea', u'ha-01d1c7f3-b4', u'ha-01e31de7-ea', u'ha-024feee7-40', u'ha-02c7cf27-08', u'ha-0302a474-54', u'ha-04c821ae-54', u'ha-059b43c2-bf', u'ha-05dd0829-2f', u'ha-05e305b5-93', u'ha-06ab4b84-a9', u'ha-06f6637a-c8', u'ha-07a537b6-55', u'ha-07c1f872-5f', u'ha-07c9a245-48', u'ha-08917388-9d', u'ha-0960ebf8-34', u'ha-0b25a911-21', u'ha-0b61e098-e4', u'ha-0b765af7-3f', u'ha-0ba0ea60-bf', u'ha-0be8cb50-51', u'ha-0ce60df6-c1', u'ha-0d879012-33', u'ha-0f4cdc01-e1', u'ha-0f5466c7-67', u'ha-0f7731d4-70', u'ha-0f7f472c-30', u'ha-0f8bca68-a5', u'ha-100842b0-3a', u'ha-11be3a80-60', u'ha-12226268-18', u'ha-12cbf017-a9', u'ha-13a371f6-6f', u'ha-13c54e4d-56', u'ha-142e10a9-c8', u'ha-145baf22-61', u'ha-15074d67-cd', u'ha-1591d943-41', u'ha-16eb93f2-0e', u'ha-17fecdd1-7f', u'ha-18c9cba3-63', u'ha-197f6a7d-c5', u'ha-1a437629-ec', u'ha-1ae23a36-9f', u'ha-1ae85bae-6d', u'ha-1b69d132-55', u'ha-1b7f0ec9-d3', u'ha-1be18470-f9', u'ha-1c25acf1-f2', u'ha-1cd46510-64', u'ha-1e68f2be-5e', u'ha-1e8d0601-0c', u'ha-1ea7c555-af', u'ha-1efff1a1-b7', u'ha-1fbc2d53-57', u'ha-202086cc-e3',
..... +many, many more ports....


2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl Exit code: 1
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl Stdin: 
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl Stdout: 
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl Stderr: ovs-vsctl: no row "ha-9a8e1db4-1f" in table Port
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl 
2016-03-10 09:31:38.152 538384 TRACE neutron.agent.ovsdb.impl_vsctl 
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1596, in rpc_loop
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     port_info = self.scan_ports(reg_ports, updated_ports_copy)
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1111, in scan_ports
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     updated_ports.update(self.check_changed_vlans(registered_ports))
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1138, in check_changed_vlans
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     port_tags = self.int_br.get_port_tag_dict()
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py", line 399, in get_port_tag_dict
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     results = cmd.execute(check_error=True)
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_vsctl.py", line 83, in execute
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     txn.add(self)
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/api.py", line 68, in __exit__
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     self.result = self.commit()
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_vsctl.py", line 50, in commit
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     res = self.run_vsctl(args)
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_vsctl.py", line 70, in run_vsctl
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     ctxt.reraise = False
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     six.reraise(self.type_, self.value, self.tb)
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_vsctl.py", line 63, in run_vsctl
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     log_fail_as_error=False).rstrip()
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 137, in execute
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent     raise RuntimeError(m)
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent RuntimeError: 
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Command: ['ovs-vsctl', '--timeout=20', '--oneline', '--format=json', '--', '--columns=name,tag', 'list', 'Port', u'ha-010a21e6-ea', u'ha-01d1c7f3-b4', u'ha-01e31de7-ea', u'ha-024feee7-40', u'ha-02c7cf27-08', u'ha-0302a474-54', u'ha-04c821ae-54', u'ha-059b43c2-bf', u'ha-05dd0829-2f', u'ha-05e305b5-93', u'ha-06ab4b84-a9', u'ha-06f6637a-c8', u'ha-07a537b6-55', u'ha-07c1f872-5f'
....

45fa4ed-1c', u'tapf4cc5fb8-90', u'tapf551ccf0-37', u'tapf59a13e5-69', u'tapf60cc09a-6d', u'tapf6651036-b8', u'tapf7168a41-fa', u'tapf7a30f3a-74', u'tapf7e6c35c-05', u'tapf827b64f-d9', u'tapf860757f-6f', u'tapf8b69ceb-d0', u'tapf8e48426-84', u'tapf92544a4-e4', u'tapf96399ed-69', u'tapf9815703-af', u'tapf9a036df-9a', u'tapf9c97e52-cb', u'tapf9cc4ba8-6a', u'tapfa1d5362-1b', u'tapfac9ac9f-41', u'tapfc93201a-0c', u'tapfd050d93-a0', u'tapfd2979c1-4a', u'tapfd65ccc9-76', u'tapfd75e02c-eb', u'tapfd803799-11', u'tapfd8f674e-b4', u'tapfda5a4e2-ef', u'tapfda7e3c7-57', u'tapfdd218ee-c6', u'tapfe6f86a9-a3', u'tapfef423c4-00', u'tapffb74bb4-77']
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Exit code: 1
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Stdin: 
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Stdout: 
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Stderr: ovs-vsctl: no row "ha-9a8e1db4-1f" in table Port
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent 
2016-03-10 09:31:38.154 538384 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent

Comment 9 nlevinki 2016-03-22 10:23:48 UTC
This is an hotfix
rpm is openstack-neutron-common-2015.1.2-11.el7ost.noarch

automation passed
https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS7/job/rhos-jenkins-rhos-7.0-puddle-rhel-7.2-3networkers-packstack-neutron-ml2-vxlan-rabbitmq-tempest-git-all/34/

Comment 11 errata-xmlrpc 2016-03-23 14:26:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0499.html