1335277 – dhcp_release isn't run on the originating compute when live migrating a VM from computeA to computeB

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-nova
Sub Component:
Version:	5.0 (RHEL 7)
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	5.0 (RHEL 7)
Assignee:	Artom Lifshitz
QA Contact:	Prasanth Anbalagan
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1371636 1371664 1371673 1371699

Reported:	2016-05-11 18:27 UTC by David Hill
Modified:	2019-12-16 05:46 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1371636 1371664 1371673 1371699
Environment:
Last Closed:	2017-02-03 16:38:12 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Launchpad	1585601	0	None	None	None	2016-05-25 12:04:15 UTC
OpenStack gerrit	325361	0	None	None	None	2016-06-03 15:24:19 UTC

Description David Hill 2016-05-11 18:27:14 UTC

Description of problem:
dhcp_release isn't run on the originating compute when live migrating a VM from computeA to computeB. If dhcp_lease_time is set to a high value, the lease effectively never expires, and dnsmasq on the originating compute retains the DHCP lease. As a result, the same IP cannot be re-allocated on the originating compute once the original VM is destroyed.
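For reference, a minimal sketch of the release that would need to happen on the originating compute. It only builds the command line for the dhcp_release helper shipped with dnsmasq (usage: dhcp_release <interface> <address> <hw-address>). The function name and the interface, IP and MAC values are hypothetical, chosen for illustration; this is not how nova-network itself issues the call.

```python
def build_dhcp_release_cmd(interface, ip_address, mac_address):
    """Return the argv for dnsmasq's dhcp_release helper.

    Usage of the helper: dhcp_release <interface> <address> <hw-address>
    """
    return ["dhcp_release", interface, ip_address, mac_address]


# Hypothetical values, for illustration only.
cmd = build_dhcp_release_cmd("br100", "10.0.0.5", "fa:16:3e:12:34:56")
print(" ".join(cmd))
```

The resulting list could be handed to subprocess.run() as root on the compute whose dnsmasq still holds the lease.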

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1) launch a VM with a known IP on node 1
2) live-migrate the VM away to node 2
3) delete the VM while it's still on node 2
4) launch a new VM on node 1 using the same IP

Node 1 refuses to give the IP to the new VM, thinking that the IP is owned by the old VM.

It's a live-migration bug. After a VM is live-migrated to node 2, the lease cache on node 1 is not cleared.

This is usually not an issue with the default 2-min DHCP lease time. But in this environment, dhcp_lease_time is set to 604800s (7 days).
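To illustrate why the long lease time matters, here is a rough sketch that computes how long a stale lease would keep blocking an IP. It assumes the lease data is available as text in dnsmasq's lease-file format (expiry epoch, MAC, IP, hostname, client-id); how to obtain that text in a given deployment is outside this sketch. The sample line, timestamps and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    expiry: int      # absolute expiry time, seconds since the epoch
    mac: str
    ip: str
    hostname: str


def parse_leases(text):
    """Parse lines of the form '<expiry> <mac> <ip> <hostname> <client-id>'."""
    leases = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        leases.append(Lease(int(fields[0]), fields[1], fields[2], fields[3]))
    return leases


def seconds_remaining(leases, ip, now):
    """Return seconds until the lease on `ip` expires, or None if no lease."""
    for lease in leases:
        if lease.ip == ip:
            return max(0, lease.expiry - now)
    return None


DHCP_LEASE_TIME = 604800            # 7 days, as configured in this environment
granted_at = 1_462_990_000          # hypothetical epoch timestamp
sample = f"{granted_at + DHCP_LEASE_TIME} fa:16:3e:12:34:56 10.0.0.5 host-10-0-0-5 *\n"

leases = parse_leases(sample)
one_hour_later = granted_at + 3600  # e.g. after migrate + delete
remaining = seconds_remaining(leases, "10.0.0.5", one_hour_later)
print(f"{remaining} s ({remaining / 86400:.2f} days) until the IP frees up on its own")
```

With a 2-minute lease the same calculation would return 0 long before anyone tried to reuse the address, which is consistent with the problem going unnoticed under the default setting.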

Actual results:
The IP is still in the lease cache of dnsmasq on node 1

Expected results:
The IP should be free

Additional info:

Comment 1 Artom Lifshitz 2016-05-24 23:10:23 UTC

Hello,

First of all, just to make sure, can we explicitly confirm that nova-network is in use here and not Neutron?

I've reproduced what I think is the same behaviour in Nova upstream master. 

1. Boot an instance
2. Live-migrate it
3. Delete it
4. Boot another instance with the same IP

This fails with "Fixed IP address is already in use on instance"

As a control, I tried:

1. Boot an instance
2. Delete it
3. Boot another instance with the same IP

This succeeds.

However, I'm not sure it has anything to do with the DHCP lease not being released. Rather, it seems that live-migrating an instance somehow causes its fixed IPs to remain associated with the instance in the database even after the instance itself has been deleted. To confirm this, would it be possible to attach sosreports to this BZ?
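The database hypothesis above could be checked along these lines. This is only a sketch over plain dictionaries: the field names address and instance_uuid are assumptions about what the fixed IP records look like, and the sample rows and UUID strings are made up for illustration.

```python
def stale_fixed_ips(fixed_ips, deleted_instance_uuids):
    """Return addresses still associated with instances that were deleted.

    `fixed_ips` is an iterable of dicts with (assumed) keys
    'address' and 'instance_uuid'.
    """
    deleted = set(deleted_instance_uuids)
    return [
        row["address"]
        for row in fixed_ips
        if row["instance_uuid"] in deleted
    ]


# Hypothetical rows: one IP still tied to a deleted instance, one free IP.
rows = [
    {"address": "10.0.0.5", "instance_uuid": "uuid-of-deleted-vm"},
    {"address": "10.0.0.6", "instance_uuid": None},
]
print(stale_fixed_ips(rows, ["uuid-of-deleted-vm"]))
```

A non-empty result after the migrate-then-delete sequence, and an empty one after the plain boot-then-delete control, would point at the database association rather than the dnsmasq lease.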

If we confirm that I've indeed observed the same behaviour in Nova master as you're seeing in RHOS 5, I'll need to submit an upstream bugfix and then do a downstream-only backport to RHOS 5, as Icehouse is no longer supported upstream.

Cheers!

Comment 2 David Hill 2016-05-24 23:13:21 UTC

Hello sir,

   I can confirm it is openstack-nova-network that is being used and that killing dnsmasq and restarting nova-network solves this issue.

Thank you very much,

David Hill