Bug 985954 - Floating IPs are assigned one by one and it takes a lot of time.
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 3.0
Hardware: x86_64 Linux
Priority: high   Severity: medium
Target Milestone: rc
Target Release: 5.0 (RHEL 7)
Assigned To: Assaf Muller
QA Contact: yfried
Keywords: ZStream
Duplicates: 1028898
Blocks: 1028898 1051602
Reported: 2013-07-18 11:06 EDT by Jaroslav Henner
Modified: 2016-04-26 13:16 EDT
CC List: 10 users

Fixed In Version: 5.0-Beta/2014-05-15.2
Doc Type: Bug Fix
Doc Text:
Previously, floating IP addresses were allocated by Networking but displayed by Compute. Synchronization was not always timely, resulting in a delay before an allocation appeared in the Dashboard. With this update, the Dashboard receives floating IP allocation data directly from Networking, so allocations appear in the Dashboard view more quickly.
Clones: 1028898 1051602
Last Closed: 2014-07-08 11:33:13 EDT
Type: Bug




External Trackers
Tracker ID Priority Status Summary Last Updated
Launchpad 1194026 None None None Never
Launchpad 1233391 None None None Never
Launchpad 1265032 None None None Never
OpenStack gerrit 64448 None None None Never

Description Jaroslav Henner 2013-07-18 11:06:57 EDT
Description of problem:
Floating IPs are assigned one by one, and it takes a lot of time.


Version-Release number of selected component (if applicable):
openstack-quantum.noarch             2013.1.2-4.el6ost     @puddle              
openstack-quantum-openvswitch.noarch 2013.1.2-4.el6ost     @puddle              
python-quantum.noarch                2013.1.2-4.el6ost     @puddle              



How reproducible:
1/1

Steps to Reproduce:
1. Boot 10 VMs.
2. Assign a floating IP to each of them. This can be done in Horizon in about 2 minutes (a scripted equivalent is sketched below).
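
For reference, a scripted equivalent of these steps might look like the following (a minimal sketch assuming the era-appropriate novaclient v1_1 API; the pool name "public" and flavor "m1.tiny" are illustrative assumptions, not taken from this report):

# Rough scripted equivalent of the steps above (sketch only). Assumes
# credentials in the usual OS_* environment variables and a floating IP
# pool named "public".
import os
import time
from novaclient.v1_1 import client

nova = client.Client(os.environ['OS_USERNAME'], os.environ['OS_PASSWORD'],
                     os.environ['OS_TENANT_NAME'], os.environ['OS_AUTH_URL'])

image = nova.images.list()[0]                 # any bootable image
flavor = nova.flavors.find(name='m1.tiny')    # illustrative flavor

servers = [nova.servers.create('vm-%d' % i, image, flavor)
           for i in range(10)]
for server in servers:
    while nova.servers.get(server.id).status != 'ACTIVE':
        time.sleep(2)                         # wait for the VM to boot
    fip = nova.floating_ips.create(pool='public')  # allocate from the pool
    server.add_floating_ip(fip.ip)            # associate with the instance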

Actual results:
When refreshing the Horizon page, one can see the IPs appearing one by one, at intervals of roughly half a minute.

Expected results:
All 10 floating IPs pingable and shown in Horizon in less than one minute.

Additional info:
This patch may fix the issue: https://review.openstack.org/#/c/36890/
Bob asked me to link: https://bugs.launchpad.net/neutron/+bug/1194026
Comment 2 Rami Vaknin 2013-10-03 01:09:10 EDT
I also encountered this bug in the following scenarios:

1. When associating a floating IP with an instance, it takes a long time (roughly 20 seconds) for the newly associated IP to appear in the instance's IP Address view.

2. When dissociating a floating IP from an instance, the floating IP remains in the instance's IP Address view even though the operation succeeded: the floating IP is no longer pingable and no longer appears in the CLI's "nova list" output. Moving to another vertical tab and returning to the "Manage Compute" -> "Instances" tab doesn't help. The floating IP usually disappears eventually, but not consistently; after a few tries I hit a situation where it remained for minutes, even after logging out of Horizon and back in.
Comment 3 Jaroslav Henner 2013-10-03 17:44:47 EDT
(In reply to Rami Vaknin from comment #2)
> 2. When dissociating a floating IP from an instance, the floating IP
> remains in the instance's IP Address view even though the operation
> succeeded [...]

Point 2 is probably caused by Nova caching. Compare the output of:
nova floating-ip-list
quantum floatingip-list
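
A quick way to script that comparison (a sketch; it just shells out to the two CLIs named above and prints both outputs so they can be diffed by eye):

# Sketch: print Nova's cached view next to Quantum's authoritative view
# of the floating IPs.
import subprocess

for cmd in (['nova', 'floating-ip-list'], ['quantum', 'floatingip-list']):
    print('==== %s ====' % ' '.join(cmd))
    print(subprocess.check_output(cmd).decode())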
Comment 4 lpeer 2013-11-11 02:28:17 EST
*** Bug 1028898 has been marked as a duplicate of this bug. ***
Comment 5 Rami Vaknin 2013-11-17 09:39:32 EST
Failed.

Tested on RHOS 4.0, puddle 2013-11-15.1.

1. I created several instances and floating IPs via the CLI.
2. I logged in to Horizon and started assigning floating IPs to instances at time 00:00.
3. I finished assigning 8 floating IPs at time 01:40 (1 minute and 40 seconds).
4. I saw all 8 floating IPs associated with their instances in Horizon at time 07:04, which means the issue now runs deeper.
Comment 6 Rami Vaknin 2013-11-17 09:41:00 EST
Forgot to add that I refreshed Horizon every few seconds.
Comment 7 lpeer 2013-11-18 07:23:31 EST
I'd like to understand whether this is a Neutron issue or a Horizon one.
Please reproduce it using the Neutron CLI and update the bug if you can reproduce the issue.
Comment 8 Rami Vaknin 2013-11-18 09:33:51 EST
I reproduced it using the CLI; it takes a very long time for the floating IP to appear in the "nova list" output.
Tested on 4.0/2013-11-15.1.
Comment 9 Brent Eagles 2013-12-02 13:56:41 EST
This seems to be at the core of an upstream test failure for basic network functionality. I've submitted a patch upstream that amends the basic network scenario to "work around" slow assignment, which isn't ideal from the point of view of this bug. However, it *does* highlight just how bad things are at the moment.
Comment 10 Brent Eagles 2013-12-17 10:13:07 EST
This is, in a sense, an end-to-end/round-trip performance issue. While it might be mitigated by some small self-contained changes, a rudimentary analysis indicates that it is caused by design and implementation choices. It is not likely something that can be declared *fixed* in an async fix or patch release.

What I would like to do is push this to 5.0 and, in the meantime, analyze the timing of each stage of these interactions in environments and scenarios where the behavior is *terrible*. If we can isolate the segments that are the worst offenders, we can: a) create tests to gauge improvements and avoid performance-critical regressions in the future; b) identify contributing issues, file bugs, and fix them; and c) repeat (a).
Comment 11 Maru Newby 2013-12-17 12:49:28 EST
I'm adding a link to another upstream fix that promises to further speed up floating IP allocation. I haven't tested it, though.
Comment 13 Assaf Muller 2013-12-22 11:10:12 EST
On my all-in-one devstack setup:
When assigning a floating IP, I was able to ping it through the DNAT in under a second; however, it took roughly 23 seconds for the association to show up in "nova list" and in Horizon.

This is a very uninformed guess, but it hints at an issue somewhere in Nova, or between Nova and Neutron, rather than an issue in Neutron itself.

I'll continue to investigate.
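
One way to reproduce that measurement (a sketch: time how long until the address answers ping versus how long until it shows up in "nova list"; the address below is illustrative):

# Sketch: after associating a floating IP, measure (a) seconds until it
# answers ping and (b) seconds until it appears in "nova list" output.
import subprocess
import time

def seconds_until(predicate, timeout=120):
    start = time.time()
    while time.time() - start < timeout:
        if predicate():
            return time.time() - start
        time.sleep(1)
    return None  # gave up

fip = '10.0.0.225'  # the just-associated address (illustrative)
print('pingable after: %s s' % seconds_until(
    lambda: subprocess.call(['ping', '-c', '1', '-W', '1', fip]) == 0))
print('in nova list after: %s s' % seconds_until(
    lambda: fip in subprocess.check_output(['nova', 'list']).decode()))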
Comment 14 Assaf Muller 2013-12-26 10:23:24 EST
This bug shouldn't happen when using Nova networking (unverified, but very likely). With Neutron it does.

So what's happening is this:
Horizon uses Neutron directly to associate a floating IP. Neutron doesn't notify Nova that an instance's networking information has changed, and Horizon doesn't update Nova either. However, Horizon takes the instance's networking information from Nova.

Nova uses a strange mechanism to update its instances' networking information, detailed in _heal_instance_info_cache at nova/compute/manager.py:4388. A periodic task (every 60 seconds by default) is run, and it updates *one* instance's networking information at a time (by making API calls to Neutron). This means that when associating three floating IPs, it can take up to 3 minutes for the last floating IP to be properly displayed in Horizon, and the delay scales linearly with the number of instances.
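
To make the linear scaling concrete, here is a stripped-down sketch of the pattern described above (illustrative only, not the actual Nova implementation; the names are mine):

# Simplified sketch of the heal pattern: every interval, refresh the
# networking info of *one* instance, in round-robin order. With N
# instances, a change on the last instance in line can take up to
# N * HEAL_INTERVAL seconds to become visible.
import itertools
import time

HEAL_INTERVAL = 60  # seconds; the default mentioned above

def heal_instance_info_cache_loop(instances, refresh_from_neutron):
    for instance in itertools.cycle(instances):
        refresh_from_neutron(instance)  # one Neutron round-trip per period
        time.sleep(HEAL_INTERVAL)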

Solutions:
1) Nova's 'heal cache' mechanism could update networking information for *all* instances, not only one. Any instance networking change would still take from 1 to 60 seconds after the operation to be reflected in Horizon, which I don't think is acceptable user experience.
2) Change Horizon so that it takes networking information directly from Neutron. Taking this information from Nova should be avoided in other places as well. Horizon could accomplish this using the same API calls that Nova makes to Neutron in nova/network/neutronv2/api.py:992:_build_network_info_model. Alternatively, we could add a Neutron verb that accepts a port ID and makes the appropriate queries to retrieve all networking information for that port. Horizon could then use that verb for every port of every instance (it would first obtain instance -> ports mappings from Nova). This is simpler and more efficient than how it's currently implemented in Nova. A rough sketch of the direct-query approach follows this list.
3) Perhaps implement both solutions, so that everything else relying on Nova's cached instance networking information benefits as well.
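
A sketch of what option 2 could look like with python-neutronclient (list_ports and list_floatingips are standard neutronclient v2.0 calls; whether Horizon would use exactly these, and the auth values shown, are assumptions):

# Sketch of option 2: ask Neutron directly for an instance's ports and
# their floating IPs instead of reading Nova's cached view.
from neutronclient.v2_0 import client

neutron = client.Client(username='admin', password='secret',
                        tenant_name='admin',
                        auth_url='http://controller:5000/v2.0')

def floating_ips_for_instance(server_id):
    ports = neutron.list_ports(device_id=server_id)['ports']
    addresses = []
    for port in ports:
        fips = neutron.list_floatingips(port_id=port['id'])['floatingips']
        addresses.extend(fip['floating_ip_address'] for fip in fips)
    return addresses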
Comment 15 Assaf Muller 2013-12-30 12:26:33 EST
Filed an upstream bug on Horizon.
Comment 16 Assaf Muller 2013-12-30 12:27:56 EST
Pushed a patch to Horizon to fix the bug.

"nova list" still takes a lot of time to update from Neutron, but that's a separate issue.
Comment 19 lpeer 2014-01-23 07:09:00 EST
After a discussion with Ofer and Assaf, we agreed that this fix does not need to be backported to Havana.
Horizon does not reflect Neutron's state correctly, and the floating IP information can be pulled from Neutron directly.
Comment 20 Assaf Muller 2014-02-17 12:45:27 EST
Upstream patch merged to master; it will be available in I3.
Comment 21 Ofer Blaut 2014-05-27 08:17:34 EDT
Tested on Icehouse with:


python-neutronclient-2.3.4-1.el7ost.noarch
openstack-neutron-2014.1-20.el7ost.noarch
openstack-neutron-openvswitch-2014.1-20.el7ost.noarch
python-neutron-2014.1-20.el7ost.noarch
openstack-neutron-ml2-2014.1-20.el7ost.noarch

python-django-horizon-2014.1-6.el7ost.noarch
Comment 23 errata-xmlrpc 2014-07-08 11:33:13 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0848.html
