Bug 1335277 - dhcp_release isn't run on the originating compute when live migrating a VM from computeA to computeB
Summary: dhcp_release isn't run on the originating compute when live migrating a VM from...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.0 (RHEL 7)
Assignee: Artom Lifshitz
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks: 1371636 1371664 1371673 1371699
Reported: 2016-05-11 18:27 UTC by David Hill
Modified: 2019-12-16 05:46 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1371636 1371664 1371673 1371699
Environment:
Last Closed: 2017-02-03 16:38:12 UTC
Target Upstream Version:
Embargoed:


Links
System            ID       Private  Priority  Status  Summary  Last Updated
Launchpad         1585601  0        None      None    None     2016-05-25 12:04:15 UTC
OpenStack gerrit  325361   0        None      None    None     2016-06-03 15:24:19 UTC

Description David Hill 2016-05-11 18:27:14 UTC
Description of problem:
dhcp_release isn't run on the originating compute when live migrating a VM from computeA to computeB. If dhcp_lease_time is set to a high value, the lease effectively never expires, so dnsmasq on the originating compute retains the DHCP lease and fails to re-allocate the same IP on that compute once the original VM is destroyed.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1) launch a VM with a known IP on node 1
2) live-migrate the VM away to node 2
3) delete the VM while it's still on node 2
4) launch a new VM on node 1 using the same IP

Node 1 refuses to give the IP to the new VM, thinking that the IP is owned by the old VM.

It's a live-migration bug. After a VM is live-migrated to node 2, the lease cache on node 1 is not cleared.
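
For reference, the release that appears to be missing on node 1 can be sketched as the invocation below. dnsmasq's dhcp_release helper takes an interface, an IP address and a MAC address, in that order. The interface name, IP and MAC used here are hypothetical placeholders, not values from this environment, and the snippet only builds the command line rather than running it.

```python
def build_dhcp_release_cmd(interface, ip_address, mac_address):
    """Build the argv for dnsmasq's dhcp_release helper.

    Usage: dhcp_release <interface> <address> <MAC address>
    """
    return ["dhcp_release", interface, ip_address, mac_address]


# Hypothetical values for illustration only.
cmd = build_dhcp_release_cmd("br100", "10.0.0.5", "fa:16:3e:00:00:01")
print(" ".join(cmd))
```

The command would have to be run on the originating compute (node 1), since that is where the stale lease is held.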

This is usually not an issue with the default 2-min dhcp lease time. But in this environment, dhcp_lease_time is set to 604800s (7 days), so a stale lease can block the IP for up to a week.
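
To make the effect of the long lease time concrete, here is a minimal sketch that works out how long a lease would still be held. It assumes lease records in dnsmasq's usual text format (expiry epoch, MAC, IP, hostname, client ID, one record per line). The sample record, timestamps and function names are hypothetical, and where the records come from on a given node is left open.

```python
import time


def parse_leases(text):
    """Parse dnsmasq-style lease records into a list of dicts."""
    leases = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and non-lease lines such as a DHCPv6 "duid" entry.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        leases.append({"expiry": int(fields[0]), "mac": fields[1], "ip": fields[2]})
    return leases


def seconds_remaining(leases, ip, now=None):
    """Return seconds until the lease for `ip` expires, or None if no lease exists."""
    now = time.time() if now is None else now
    for lease in leases:
        if lease["ip"] == ip:
            return max(0, lease["expiry"] - int(now))
    return None


# Hypothetical record: lease granted at t=1000 with dhcp_lease_time=604800.
sample = "605800 fa:16:3e:00:00:01 10.0.0.5 vm-old *\n"
leases = parse_leases(sample)

# One day after the lease was granted, six days are still left on it.
print(seconds_remaining(leases, "10.0.0.5", now=1000 + 86400))
```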

Actual results:
IP is still in the lease cache of dnsmasq on node 1

Expected results:
IP should be free

Additional info:

Comment 1 Artom Lifshitz 2016-05-24 23:10:23 UTC
Hello,

First of all, just to make sure, can we explicitly confirm that nova-network is in use here and not Neutron?

I've reproduced what I think is the same behaviour in Nova upstream master. 

1. Boot an instance
2. Live-migrate it
3. Delete it
4. Boot another instance with the same IP

This fails with "Fixed IP address is already in use on instance"

As a control, I tried:

1. Boot an instance
2. Delete it
3. Boot another instance with the same IP

This succeeds.

However, I'm not sure it has anything to do with the DHCP lease not being released. Rather, it seems that live-migrating an instance somehow causes its fixed IPs to remain associated with the instance in the database even after the instance itself has been deleted. To confirm this, would it be possible to attach sosreports to this BZ?
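
The condition I suspect can be sketched as the check below. It is a toy in-memory model, not Nova code: the record layout (address, instance_uuid, uuid, deleted) and the function name are assumptions made for illustration, and the real database schema may differ.

```python
def stale_fixed_ips(fixed_ips, instances):
    """Return addresses still associated with instances marked as deleted."""
    deleted_uuids = {inst["uuid"] for inst in instances if inst["deleted"]}
    return [ip["address"] for ip in fixed_ips if ip["instance_uuid"] in deleted_uuids]


# Hypothetical records: one deleted instance whose fixed IP was never disassociated.
instances = [
    {"uuid": "aaaa", "deleted": True},
    {"uuid": "bbbb", "deleted": False},
]
fixed_ips = [
    {"address": "10.0.0.5", "instance_uuid": "aaaa"},
    {"address": "10.0.0.6", "instance_uuid": "bbbb"},
    {"address": "10.0.0.7", "instance_uuid": None},
]

print(stale_fixed_ips(fixed_ips, instances))
```

Any address reported by such a check would explain the "already in use" failure independently of the dnsmasq lease state.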

If we confirm that what I've observed in Nova master is indeed the same behaviour you're seeing in RHOS 5, I'll need to submit an upstream bugfix and then do a downstream-only backport to RHOS 5, as Icehouse is no longer supported upstream.

Cheers!

Comment 2 David Hill 2016-05-24 23:13:21 UTC
Hello sir,

   I can confirm that openstack-nova-network is in use, and that killing dnsmasq and restarting nova-network resolves the issue.

Thank you very much,

David Hill

