Bug 1378530 - Booting VM with a Floating IP and pinging it via that takes a long time with errors in L3-Agent logs when using DVR
Keywords:
Status: CLOSED DUPLICATE of bug 1363661
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 10.0 (Newton)
Assignee: Terry Wilson
QA Contact: Toni Freger
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-09-22 16:31 UTC by Sai Sindhur Malleni
Modified: 2020-05-14 15:19 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-13 18:12:32 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Launchpad 1625333 0 None None None 2016-09-22 16:32:04 UTC

Description Sai Sindhur Malleni 2016-09-22 16:31:03 UTC
Description of problem:
A Rally test that launches a VM, attaches a floating IP, and pings the VM via the floating IP was run 40 times with legacy routers and with DVR routers for comparison. The time taken to create the network and subnet, launch the VM, attach the floating IP, etc. is similar in the legacy and DVR cases. However, the time for the VM to become pingable via the floating IP (after it has been booted and given a floating IP) is much longer in some iterations with DVR. With legacy routers, the VM is ping-ready in less than a second, not counting the time to boot or to attach the floating IP. With DVR, the VM is sometimes ping-ready in less than 1 second, but in other cases it takes around 250 seconds. Digging into the L3-agent logs on the computes, we see this for the instances that took the longest to become pingable via the floating IP:
https://paste.fedoraproject.org/431117/74312098/

Version-Release number of selected component (if applicable):


How reproducible:
Happens intermittently. For example, when we create 40 FIPs, it happens with about 10 of them.

Steps to Reproduce:
1. Create DVR router, attach subnet
2. Launch VM on subnet
3. Attach FIP and ping

Rally-Plugin we used: https://github.com/openstack/browbeat/tree/master/rally/rally-plugins/netcreate-boot-ping
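
For anyone reproducing this without Rally, the measurement of interest (time from FIP association until the first successful ping) can be approximated with a small poller. This is only a minimal sketch, not the Rally plugin's code: the helper names `ping_once` and `seconds_until_pingable`, the 300-second deadline, and the 0.5-second polling interval are all assumptions for illustration, and the `ping -c 1 -W <seconds>` invocation assumes Linux iputils.

```python
import subprocess
import time


def ping_once(ip, timeout_s=1):
    """Send one ICMP echo request (Linux iputils ping); True if a reply arrived."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def seconds_until_pingable(ip, probe=ping_once, deadline_s=300.0,
                           interval_s=0.5, clock=time.monotonic,
                           sleep=time.sleep):
    """Poll probe(ip) until it succeeds.

    Returns the elapsed seconds at the first success, or None if the
    (hypothetical) deadline passes without a reply.
    """
    start = clock()
    while True:
        if probe(ip):
            return clock() - start
        if clock() - start >= deadline_s:
            return None
        sleep(interval_s)
```

Calling `seconds_until_pingable("<floating-ip>")` right after the FIP is associated gives a per-VM number that can be compared between legacy and DVR runs. The `probe`, `clock`, and `sleep` parameters exist only so the loop can be exercised without real network traffic.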

Actual results:
In some cases it took ~200 seconds for the VM to become pingable via the FIP. Correlating the slow FIPs with the L3-agent logs on the computes, we see:

2016-09-19 18:58:52.675 23696 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'fip-790354c7-f286-4fd1-a4a1-ec9749c61fbf', 'arping', '-A', '-I', 'fg-6b5906d0-d9', '-c', '3', '-w', '4.5', '10.16.30.99'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:99
2016-09-19 18:58:52.696 23696 ERROR neutron.agent.linux.utils [-] Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address

2016-09-19 18:58:52.697 23696 ERROR neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 10.16.30.99 on fg-6b5906d0-d9 in namespace fip-790354c7-f286-4fd1-a4a1-ec9749c61fbf
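
To correlate slow FIPs with these errors across many compute nodes, the failed gratuitous ARP lines can be pulled out of the L3-agent log with a short script. This is a rough sketch that assumes the log format matches the excerpt above exactly; the function name `find_garp_failures` and the regular expression are illustrative and not part of neutron.

```python
import re

# Matches the ip_lib ERROR line shown in the excerpt above. The field
# layout (timestamp, pid, level, logger, request context) is assumed
# from that one sample.
GARP_FAILURE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ ERROR "
    r"neutron\.agent\.linux\.ip_lib \[[^\]]*\] "
    r"Failed sending gratuitous ARP to (?P<ip>\S+) "
    r"on (?P<device>\S+) in namespace (?P<namespace>\S+)"
)


def find_garp_failures(lines):
    """Return one dict (timestamp, ip, device, namespace) per failed GARP line."""
    failures = []
    for line in lines:
        match = GARP_FAILURE.match(line.strip())
        if match:
            failures.append(match.groupdict())
    return failures
```

Passing an open log file to `find_garp_failures` yields the floating IPs whose gratuitous ARP failed, which can then be compared against the FIPs that were slow to become pingable in the Rally results.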

Rally-Plugin results:
http://8.43.86.1:8088/smalleni/20160919-172902-browbeat-netcreate-boot-ping-10-iteration-0.html#/BrowbeatPlugin.create_network_nova_boot_ping/details
The green spikes show iterations where it took a long time for the VM to become pingable.

Expected results:
The VM should be pingable within a reasonably small amount of time after FIP association. We see values of less than 1s for legacy routers.

Additional info:

Comment 2 Sai Sindhur Malleni 2016-10-05 20:01:18 UTC
Terry,
I still have the environment with me and can reproduce this. I might not have the environment for very long. Please let me know if you want to look. Happy to help if it makes things easier.

Comment 3 Nir Yechiel 2016-10-13 18:12:32 UTC

*** This bug has been marked as a duplicate of bug 1363661 ***

