Bug 1081159 - L3 agent restart causes network outage
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z4
Target Release: 4.0
Assigned To: Jakub Libosvar
QA Contact: Ofer Blaut
Keywords: ZStream
Depends On:
Blocks:
 
Reported: 2014-03-26 12:37 EDT by Dave Sullivan
Modified: 2018-02-08 05:14 EST

See Also:
Fixed In Version: openstack-neutron-2013.2.3-4.el6ost
Doc Type: Bug Fix
Doc Text:
Cause: qrouter namespaces were destroyed and recreated during an L3 agent start.
Consequence: Ongoing traffic was lost due to missing NAT rules in the destroyed namespace.
Fix: Namespaces in use are preserved during agent start.
Result: Restarting the L3 agent has no influence on ongoing traffic via router namespaces.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-05-29 16:19:34 EDT
Type: Bug
lpeer: needinfo+
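
The fix summarized in the Doc Text field above (preserve namespaces that are in use, instead of destroying and recreating every qrouter namespace at agent start) can be illustrated with a minimal sketch. This is not the actual Neutron patch: the function name `stale_router_namespaces` and its arguments are hypothetical, and it only assumes the `qrouter-<router id>` naming visible in the transcript in Comment 3.

```python
# Illustrative sketch only; not the code shipped in openstack-neutron.
# All names here are hypothetical.

NS_PREFIX = "qrouter-"


def stale_router_namespaces(existing_namespaces, active_router_ids):
    """Return the qrouter namespaces that no active router is using.

    Only these would be candidates for cleanup at agent start.
    Namespaces that belong to an active router are left alone, so
    their NAT rules and the traffic flowing through them survive.
    """
    in_use = {NS_PREFIX + router_id for router_id in active_router_ids}
    return sorted(
        ns
        for ns in existing_namespaces
        if ns.startswith(NS_PREFIX) and ns not in in_use
    )


if __name__ == "__main__":
    # The first namespace is the one from Comment 3; the others are made up.
    existing = [
        "qrouter-69cf3535-2960-4b11-8e3a-da37c3331f01",
        "qrouter-00000000-0000-0000-0000-000000000000",
        "qdhcp-11111111-1111-1111-1111-111111111111",
    ]
    active = ["69cf3535-2960-4b11-8e3a-da37c3331f01"]
    print(stale_router_namespaces(existing, active))
```

Non-router namespaces (for example `qdhcp-...`) are ignored entirely in this sketch, since the L3 agent does not own them.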


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Launchpad | 1175695 | None | None | None | Never
OpenStack gerrit | 84420 | None | None | None | Never
Red Hat Product Errata | RHSA-2014:0516 | normal | SHIPPED_LIVE | Moderate: openstack-neutron security, bug fix, and enhancement update | 2014-05-29 20:15:59 EDT

Description Dave Sullivan 2014-03-26 12:37:18 EDT
Description of problem:

L3 network drops - floating IPs are not accessible

Even after the neutron services are restarted on the management node, floating IPs are not accessible.

Tenants need to restart their instances before connectivity returns.

We need to determine the cause of the initial L3 outage.

This appears to be the upstream bug reported here:

https://bugs.launchpad.net/neutron/+bug/1175695

Version-Release number of selected component (if applicable):

current RHOS 4
Comment 2 Maru Newby 2014-03-28 19:04:17 EDT
The upstream bug looks like the probable cause. The next step is cherry-picking the fix for inclusion in stable/havana and figuring out whether we can rely on the next sync or need to manually backport to RHOS.
Comment 3 Ofer Blaut 2014-03-31 04:42:11 EDT
I have tested on Havana A3 using a distributed system

openstack-neutron-2013.2.2-5.el6ost.noarch

1. I stopped the L3 agent.
2. The qrouter namespace is still up, and traffic to the floating IP works.
3. While the L3 agent is starting, traffic stops for ~25 seconds and then resumes.

[root@puma05 ~]# ip netns | grep qrouter
qrouter-69cf3535-2960-4b11-8e3a-da37c3331f01
[root@puma05 ~]# service neutron-l3-agent stop
Stopping neutron-l3-agent:                                 [  OK  ]
[root@puma05 ~]# ip netns | grep qrouter
qrouter-69cf3535-2960-4b11-8e3a-da37c3331f01
[root@puma05 ~]# openstack-status 
== neutron services ==
neutron-server:                         inactive  (disabled on boot)
neutron-dhcp-agent:                     active
neutron-l3-agent:                       inactive
neutron-metadata-agent:                 active
neutron-lbaas-agent:                    inactive  (disabled on boot)
neutron-openvswitch-agent:              active
== Support services ==
openvswitch:                            active
messagebus:                             active
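
The namespace check in the transcript above (run `ip netns | grep qrouter` before and after stopping the agent, then compare) could be scripted along these lines. This is a rough sketch, not part of the original test: the helper names are hypothetical, and the functions operate on captured `ip netns` output passed in as text rather than running the command themselves.

```python
# Hypothetical helpers for comparing qrouter namespaces across an
# L3 agent stop/start; they parse text captured from `ip netns`.


def qrouter_namespaces(ip_netns_output):
    """Extract qrouter namespace names from `ip netns` output."""
    names = []
    for line in ip_netns_output.splitlines():
        fields = line.split()
        # Use only the first field, in case the tool prints extra
        # text after the namespace name on the same line.
        if fields and fields[0].startswith("qrouter-"):
            names.append(fields[0])
    return names


def namespaces_preserved(before_output, after_output):
    """True if every qrouter namespace seen before is still present after."""
    before = set(qrouter_namespaces(before_output))
    after = set(qrouter_namespaces(after_output))
    return before <= after


if __name__ == "__main__":
    # Namespace name taken from the transcript in this comment.
    listing = "qrouter-69cf3535-2960-4b11-8e3a-da37c3331f01\n"
    print(namespaces_preserved(listing, listing))
```

Comparing sets only shows that a namespace with the same name exists afterwards; it would not detect a namespace that was destroyed and recreated in between, so it complements a traffic test rather than replacing it.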
Comment 10 Ofer Blaut 2014-04-22 06:33:30 EDT
Tested with: service neutron-l3-agent restart

I ran ping -ni 0.01 <floating ip> and no packets were lost.

openstack-neutron-2013.2.3-4.el6ost.noarch
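
The ping check in this comment could be automated by reading the summary line that ping prints when it exits. The sketch below is an assumption-laden illustration, not the procedure actually used for verification: `packets_lost` is a hypothetical helper, and it assumes the Linux iputils summary format ("N packets transmitted, M received, ...").

```python
import re

# Matches the iputils-style summary line, for example:
#   "<sent> packets transmitted, <received> received, <pct>% packet loss"
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packets_lost(ping_output):
    """Return the number of packets lost according to ping's summary line.

    Raises ValueError if no summary line is found, for example when
    ping was killed before it could print its statistics.
    """
    match = _SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent = int(match.group(1))
    received = int(match.group(2))
    return sent - received
```

A result of zero for a ping that spans the agent restart corresponds to the "no packets were lost" outcome reported here.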
Comment 12 errata-xmlrpc 2014-05-29 16:19:34 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0516.html
