Bug 985954 - Floating IPs are assigned one by one and it takes a lot of time.
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 3.0
Hardware: x86_64 Linux
Priority: high   Severity: medium
Target Milestone: rc
Target Release: 5.0 (RHEL 7)
Assigned To: Assaf Muller
QA Contact: yfried
Keywords: ZStream
Duplicates: 1028898
Blocks: 1028898 1051602
Reported: 2013-07-18 11:06 EDT by Jaroslav Henner
Modified: 2016-04-26 13:16 EDT
CC List: 10 users

Fixed In Version: 5.0-Beta/2014-05-15.2
Doc Type: Bug Fix
Doc Text:
Previously, floating IP addresses were allocated by Networking but displayed by Compute. Synchronization was not always timely, resulting in a delay before an allocation appeared in the Dashboard. With this update, the Dashboard receives floating IP allocation data directly from Networking, so allocations appear in the Dashboard view more quickly.
Clones: 1028898 1051602
Last Closed: 2014-07-08 11:33:13 EDT
Type: Bug




External Trackers
Tracker ID Priority Status Summary Last Updated
Launchpad 1194026 None None None Never
Launchpad 1233391 None None None Never
Launchpad 1265032 None None None Never
OpenStack gerrit 64448 None None None Never

Description Jaroslav Henner 2013-07-18 11:06:57 EDT
Description of problem:
Floating IPs are assigned one by one, and it takes a lot of time.


Version-Release number of selected component (if applicable):
openstack-quantum.noarch             2013.1.2-4.el6ost     @puddle              
openstack-quantum-openvswitch.noarch 2013.1.2-4.el6ost     @puddle              
python-quantum.noarch                2013.1.2-4.el6ost     @puddle              



How reproducible:
1/1

Steps to Reproduce:
1. Boot 10 VMs.
2. Assign a floating IP to each of them. This can be done in Horizon in about 2 minutes (a scripted equivalent is sketched below).
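
For reference, a scripted equivalent of these steps might look like the following (a minimal sketch assuming the era-appropriate novaclient v1_1 API; the pool name "public" and flavor "m1.tiny" are illustrative assumptions, not taken from this report):

# Rough scripted equivalent of the steps above (sketch only). Assumes
# credentials in the usual OS_* environment variables and a floating IP
# pool named "public".
import os
import time
from novaclient.v1_1 import client

nova = client.Client(os.environ['OS_USERNAME'], os.environ['OS_PASSWORD'],
                     os.environ['OS_TENANT_NAME'], os.environ['OS_AUTH_URL'])

image = nova.images.list()[0]                 # any bootable image
flavor = nova.flavors.find(name='m1.tiny')    # illustrative flavor

servers = [nova.servers.create('vm-%d' % i, image, flavor)
           for i in range(10)]
for server in servers:
    while nova.servers.get(server.id).status != 'ACTIVE':
        time.sleep(2)                         # wait for the VM to boot
    fip = nova.floating_ips.create(pool='public')  # allocate from the pool
    server.add_floating_ip(fip.ip)            # associate with the instance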

Actual results:
When refreshing the Horizon page, one can see the IPs appearing one by one, at intervals of roughly half a minute.

Expected results:
All 10 floating IPs pingable and shown in Horizon in less than one minute.

Additional info:
This patch may fix the issue: https://review.openstack.org/#/c/36890/
Bob asked me to link: https://bugs.launchpad.net/neutron/+bug/1194026
Comment 2 Rami Vaknin 2013-10-03 01:09:10 EDT
I also encountered this bug in the following scenarios:

1. When associating a floating IP with an instance, it takes a long time (roughly 20 seconds) for the newly associated IP to appear in the instance's IP Address view.

2. When dissociating a floating IP from an instance, the floating IP remains in the instance's IP Address view even though the operation succeeded: the floating IP is no longer pingable and no longer appears in the CLI's "nova list" output. Moving to another vertical tab and returning to the "Manage Compute" -> "Instances" tab doesn't help. The floating IP usually disappears eventually, but not consistently; after a few tries I hit a situation where it remained for minutes, even after logging out of Horizon and back in.
Comment 3 Jaroslav Henner 2013-10-03 17:44:47 EDT
(In reply to Rami Vaknin from comment #2)
> 2. When dissociating a floating IP from an instance, the floating IP
> remains in the instance's IP Address view even though the operation
> succeeded [...]

Point 2 is probably caused by Nova caching. Compare the output of:
nova floating-ip-list
quantum floatingip-list
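
A quick way to script that comparison (a sketch; it just shells out to the two CLIs named above and prints both outputs so they can be diffed by eye):

# Sketch: print Nova's cached view next to Quantum's authoritative view
# of the floating IPs.
import subprocess

for cmd in (['nova', 'floating-ip-list'], ['quantum', 'floatingip-list']):
    print('==== %s ====' % ' '.join(cmd))
    print(subprocess.check_output(cmd).decode())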
Comment 4 lpeer 2013-11-11 02:28:17 EST
*** Bug 1028898 has been marked as a duplicate of this bug. ***
Comment 5 Rami Vaknin 2013-11-17 09:39:32 EST
Failed.

Tested on RHOS 4.0, puddle 2013-11-15.1.

1. I created several instances and floating IPs via the CLI.
2. I logged in to Horizon and started assigning floating IPs to instances at time 00:00.
3. I finished assigning 8 floating IPs at time 01:40 (1 minute and 40 seconds).
4. I saw all 8 floating IPs associated with their instances in Horizon at time 07:04, which means the issue now runs deeper.
Comment 6 Rami Vaknin 2013-11-17 09:41:00 EST
Forgot to add that I refreshed Horizon every few seconds.
Comment 7 lpeer 2013-11-18 07:23:31 EST
I'd like to understand whether this is a Neutron issue or a Horizon one.
Please reproduce it using the Neutron CLI and update the bug if you can reproduce the issue.
Comment 8 Rami Vaknin 2013-11-18 09:33:51 EST
I reproduced it using the CLI; it takes a very long time for the floating IP to appear in the "nova list" output.
Tested on 4.0/2013-11-15.1.
Comment 9 Brent Eagles 2013-12-02 13:56:41 EST
This seems to be at the core of an upstream test failure for basic network functionality. I've submitted a patch upstream that amends the basic network scenario to "work around" slow assignment, which isn't ideal from the point of view of this bug. However, it *does* highlight just how bad things are at the moment.
Comment 10 Brent Eagles 2013-12-17 10:13:07 EST
This is, in a sense, an end-to-end/round-trip performance issue. While it might be mitigated by some small self-contained changes, a rudimentary analysis indicates that it is caused by design and implementation choices. It is not likely something that can be declared *fixed* in an async fix or patch release.

What I would like to do is push this to 5.0 and, in the meantime, analyze the timing of each stage of these interactions in environments and scenarios where the behavior is *terrible*. If we can isolate the segments that are the worst offenders, we can: a) create tests to gauge improvements and avoid performance-critical regressions in the future; b) identify contributing issues, file bugs, and fix them; and c) repeat (a).
Comment 11 Maru Newby 2013-12-17 12:49:28 EST
I'm adding a link to another upstream fix that promises to further speed up floating IP allocation. I haven't tested it, though.
Comment 13 Assaf Muller 2013-12-22 11:10:12 EST
On my all-in-one devstack setup:
When assigning a floating IP, I was able to ping it through the DNAT in under a second; however, it took roughly 23 seconds for the association to show up in "nova list" and in Horizon.

This is a very uninformed guess, but it hints at an issue somewhere in Nova, or between Nova and Neutron, rather than an issue in Neutron itself.

I'll continue to investigate.
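
One way to reproduce that measurement (a sketch: time how long until the address answers ping versus how long until it shows up in "nova list"; the address below is illustrative):

# Sketch: after associating a floating IP, measure (a) seconds until it
# answers ping and (b) seconds until it appears in "nova list" output.
import subprocess
import time

def seconds_until(predicate, timeout=120):
    start = time.time()
    while time.time() - start < timeout:
        if predicate():
            return time.time() - start
        time.sleep(1)
    return None  # gave up

fip = '10.0.0.225'  # the just-associated address (illustrative)
print('pingable after: %s s' % seconds_until(
    lambda: subprocess.call(['ping', '-c', '1', '-W', '1', fip]) == 0))
print('in nova list after: %s s' % seconds_until(
    lambda: fip in subprocess.check_output(['nova', 'list']).decode()))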
Comment 14 Assaf Muller 2013-12-26 10:23:24 EST
This bug shouldn't happen when using Nova networking (unverified, but very likely). With Neutron it does.

So what's happening is this:
Horizon uses Neutron directly to associate a floating IP. Neutron doesn't notify Nova that an instance's networking information has changed, and Horizon doesn't update Nova either. However, Horizon takes the instance's networking information from Nova.

Nova uses a strange mechanism to update its instances' networking information, detailed in _heal_instance_info_cache at nova/compute/manager.py:4388. A periodic task (every 60 seconds by default) is run, and it updates *one* instance's networking information at a time (by making API calls to Neutron). This means that when associating three floating IPs, it can take up to 3 minutes for the last floating IP to be properly displayed in Horizon, and the delay scales linearly with the number of instances.
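
To make the linear scaling concrete, here is a stripped-down sketch of the pattern described above (illustrative only, not the actual Nova implementation; the names are mine):

# Simplified sketch of the heal pattern: every interval, refresh the
# networking info of *one* instance, in round-robin order. With N
# instances, a change on the last instance in line can take up to
# N * HEAL_INTERVAL seconds to become visible.
import itertools
import time

HEAL_INTERVAL = 60  # seconds; the default mentioned above

def heal_instance_info_cache_loop(instances, refresh_from_neutron):
    for instance in itertools.cycle(instances):
        refresh_from_neutron(instance)  # one Neutron round-trip per period
        time.sleep(HEAL_INTERVAL)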

Solutions:
1) Nova's 'heal cache' mechanism could update networking information for *all* instances, not only one. Any instance networking change would still take from 1 to 60 seconds after the operation to be reflected in Horizon, which I don't think is acceptable user experience.
2) Change Horizon so that it takes networking information directly from Neutron. Taking this information from Nova should be avoided in other places as well. Horizon could accomplish this using the same API calls that Nova makes to Neutron in nova/network/neutronv2/api.py:992:_build_network_info_model. Alternatively, we could add a Neutron verb that accepts a port ID and makes the appropriate queries to retrieve all networking information for that port. Horizon could then use that verb for every port of every instance (it would first obtain instance -> ports mappings from Nova). This is simpler and more efficient than how it's currently implemented in Nova. A rough sketch of the direct-query approach follows this list.
3) Perhaps implement both solutions, so that everything else relying on Nova's cached instance networking information benefits as well.
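
A sketch of what option 2 could look like with python-neutronclient (list_ports and list_floatingips are standard neutronclient v2.0 calls; whether Horizon would use exactly these, and the auth values shown, are assumptions):

# Sketch of option 2: ask Neutron directly for an instance's ports and
# their floating IPs instead of reading Nova's cached view.
from neutronclient.v2_0 import client

neutron = client.Client(username='admin', password='secret',
                        tenant_name='admin',
                        auth_url='http://controller:5000/v2.0')

def floating_ips_for_instance(server_id):
    ports = neutron.list_ports(device_id=server_id)['ports']
    addresses = []
    for port in ports:
        fips = neutron.list_floatingips(port_id=port['id'])['floatingips']
        addresses.extend(fip['floating_ip_address'] for fip in fips)
    return addresses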
Comment 15 Assaf Muller 2013-12-30 12:26:33 EST
Filed an upstream bug on Horizon.
Comment 16 Assaf Muller 2013-12-30 12:27:56 EST
Pushed a patch to Horizon to fix the bug.

"nova list" still takes a lot of time to update from Neutron, but that's a separate issue.
Comment 19 lpeer 2014-01-23 07:09:00 EST
After a discussion with Ofer and Assaf, we agreed that this fix does not need to be backported to Havana.
Horizon does not reflect Neutron's state correctly, and the floating IP information can be pulled from Neutron directly.
Comment 20 Assaf Muller 2014-02-17 12:45:27 EST
Upstream patch merged to master; it will be available in I3.
Comment 21 Ofer Blaut 2014-05-27 08:17:34 EDT
Tested on Icehouse with:


python-neutronclient-2.3.4-1.el7ost.noarch
openstack-neutron-2014.1-20.el7ost.noarch
openstack-neutron-openvswitch-2014.1-20.el7ost.noarch
python-neutron-2014.1-20.el7ost.noarch
openstack-neutron-ml2-2014.1-20.el7ost.noarch

python-django-horizon-2014.1-6.el7ost.noarch
Comment 23 errata-xmlrpc 2014-07-08 11:33:13 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0848.html
