Description of problem:
Launching nova instances on a neutron subnet with DHCP enabled results in the instances coming up without an IP address assigned to their single interface. A subsequent ifup <interface> fixes the problem.

Version-Release number of selected component (if applicable):
RHEL OpenStack Havana, namely:
openstack-neutron-2013.2-5
openstack-nova-compute-2013.2-4

How reproducible:
Almost always

Steps to Reproduce:
1. Packstack installation of one physical controller and two physical nova hosts, with neutron networking
2. Upload the cirros 0.3.0 x86_64 image into glance
3. Launch an instance with the m1.tiny flavor and the cirros image
4. Log in to the new instance via console; ifconfig eth0 shows no IP assigned
5. ifup eth0 to get an IP, after which pinging the gateway works

Actual results:
ifconfig eth0 shows no IP address assigned to the interface

Expected results:
Upon booting, the nova instance should acquire an IP address for its internal interface from the DHCP agent

Additional info:
This issue happens when the new VM is the first VM on the virtual network. Neutron's dhcp-agent does not start a dnsmasq daemon for a network that has no DHCP clients. When the first VM appears, it can sometimes boot faster than the dnsmasq daemon is started, so nobody answers its DHCP requests. This looks like a race condition to me.
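The suspected race can be sketched outside of OpenStack. In this hypothetical stand-in, a file plays the role of a listening dnsmasq: the agent only spawns it after the first port appears, so the guest's first DHCP attempt fires into the void, while a later retry (the manual ifup eth0) finds the server up and succeeds.

```shell
#!/bin/sh
# Toy model of the race, NOT real neutron behavior: a stamp file stands in
# for a running dnsmasq instance on the tenant network.
STAMP=$(mktemp -u)

# dhcp-agent spawns dnsmasq only after the first port shows up (delayed here)
( sleep 1; touch "$STAMP" ) &

try_lease() {
    # stands in for the guest sending a DHCP DISCOVER and waiting briefly
    if [ -f "$STAMP" ]; then
        echo "lease acquired"
    else
        echo "no lease"
    fi
}

try_lease      # guest boots immediately and loses the race -> "no lease"
sleep 2
try_lease      # manual ifup eth0 retries later -> "lease acquired"
rm -f "$STAMP"
```

The fix direction implied by this model is to either start dnsmasq eagerly when the network gains its first port, or rely on the guest's DHCP client retrying long enough to outlast the agent's startup delay.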
Richard, if I remember correctly, there are issues with the cirros 0.3.0 image re: DHCP on RHEL. Please try the 0.3.1 image ( http://download.cirros-cloud.net/0.3.1/cirros-0.3.1-x86_64-disk.img ) and report back. If that doesn't fix the issue, please describe your setup (vlan, gre, etc.) and post your packstack answer file. Thanks.
Created attachment 831838 [details]
Packstack answer file

This is the original answer file used to create this OpenStack cloud.
I downloaded the cirros 0.3.1 image from the link provided and retried the process above, with essentially the same results: create a new private network, launch 20 m1.tiny instances, log in on the instance console; about 50% of the nodes had no IP assigned to their eth0 interface. I deleted the instances and tried again with only 10 instances, and still a large portion of the nodes got no assigned IPs. neutron-dhcp-agent reports DHCPOFFERs to all the instances. Attaching our packstack answer file, which describes a GRE setup with one controller and three nova hosts.
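Since the dhcp-agent claims it sent DHCPOFFERs, it may help to confirm on the controller that dnsmasq is actually up and that the offers reach the wire before blaming the guest image. A rough diagnostic checklist, assuming the default qdhcp-<network-id> namespace naming used by neutron-dhcp-agent (the <network-id> below is a placeholder for the tenant network's UUID):

```shell
# list the DHCP namespaces neutron-dhcp-agent has created
ip netns list | grep qdhcp

# check that a dnsmasq process exists for the tenant network (placeholder UUID)
ps aux | grep "dnsmasq.*<network-id>"

# watch DHCP traffic inside the namespace while a guest boots
ip netns exec qdhcp-<network-id> tcpdump -ln -i any port 67 or port 68
```

If the namespace or the dnsmasq process is missing while an instance is booting, that would point at the agent-side race rather than at the cirros image.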
I'm wondering if the problem you are reporting with the 0.3.1 image is related to another bug (https://bugzilla.redhat.com/show_bug.cgi?id=1034822), also concerning dhcp agent reliability. Can you replicate the reported problem when booting a single VM? What is the load on the controller node, and more importantly, what is the CPU usage of the neutron service, when you are attempting to boot multiple VMs?
The correct link to the other bug is https://bugzilla.redhat.com/show_bug.cgi?id=1023818
Using cirros 0.3.1 and a packstack-based install of 2013.2.1-4 with GRE and two compute hosts, every time I boot lots of instances, everything ends up getting a DHCP address. I just haven't been able to reproduce this. Is it possible that there is an interaction with VLANs in this setup as well, similar to what was just reported in https://bugzilla.redhat.com/show_bug.cgi?id=1064109? I don't see anything in the answer file provided that would point me to that, but maybe some manual configuration was done later? Or can you try to reproduce with 2013.2.1-4 and cirros 0.3.1? Thanks.
The original problem occurred at a client site which I do not have access to. The tenant network was attached manually to a VLAN-tagged bonded interface, although we were using GRE across it. At one point during that engagement the problem disappeared, and I was unable to reproduce it. The circumstances were not unlike bug 1064109; I cannot confirm that it was the same issue, but the client has not reported the problem since.
Going ahead and closing this as WORKSFORME. There is at least a decent chance that this is related to https://bugzilla.redhat.com/show_bug.cgi?id=1064109, so we can continue to track it there, since that bug has a lot of information. Thanks!