Bug 962605 - Instance fails to receive DHCP offer
Status: CLOSED CURRENTRELEASE
Product: RDO
Classification: Community
Component: openstack-nova
Assigned To: RHOS Maint
Ami Jeain
Reported: 2013-05-13 22:25 EDT by Matthew Farrellee
Modified: 2015-06-04 17:51 EDT
Fixed In Version: openstack-nova-2013.2-4
Doc Type: Bug Fix
Last Closed: 2013-12-24 06:30:56 EST
Type: Bug


Description Matthew Farrellee 2013-05-13 22:25:03 EDT
Description of problem:

Almost exactly the same issue described in

https://github.com/mseknibilel/OpenStack-Folsom-Install-guide/issues/14

with a workaround of

iptables -A POSTROUTING -t mangle -p udp --dport 68 -j CHECKSUM --checksum-fill


Version-Release number of selected component (if applicable):

openstack-nova-network-2013.1-3.el6.noarch


How reproducible:

100%


Steps to Reproduce:
1. yum install -y http://rdo.fedorapeople.org/openstack/openstack-grizzly/rdo-release-grizzly-3.noarch.rpm
2. yum install -y openstack-packstack
3. packstack --allinone
4. nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
5. wget http://savanna-files.mirantis.com/savanna-0.1-hdp-img.tar.gz
6. md5sum savanna-0.1-hdp-img.tar.gz -> 4998e403f559e85be1a5955186a5638e
7. tar xzf savanna-0.1-hdp-img.tar.gz
8. glance image-create --name=hdp.image --disk-format=qcow2 --container-format=bare < ./savanna-0.1-hdp-img.img
+------------------+--------------------------------------+
| Property         | Value                                |
+------------------+--------------------------------------+
| checksum         | efce0641f8426e60da94a3d49971dbe0     |
| container_format | bare                                 |
| created_at       | 2013-05-11T13:36:29                  |
| deleted          | False                                |
| deleted_at       | None                                 |
| disk_format      | qcow2                                |
| id               | bbc7d8dc-3320-40f5-8ed8-4ae8271e272b |
| is_public        | False                                |
| min_disk         | 0                                    |
| min_ram          | 0                                    |
| name             | hdp.image                            |
| owner            | 09e1b1abf47a4f0dbeb917bd6ecc32f9     |
| protected        | False                                |
| size             | 1641480192                           |
| status           | active                               |
| updated_at       | 2013-05-11T13:36:41                  |
+------------------+--------------------------------------+
9. Start an instance w/ the hdp.image
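
Step 6 compares the digest by eye. A minimal sketch of the same check as a shell helper, assuming GNU md5sum is available; verify_md5 is a hypothetical name used only for illustration:

```shell
#!/bin/sh
# Sketch: compare a file's MD5 digest with an expected value.
# Returns 0 on match, 1 on mismatch. verify_md5 is a hypothetical helper name.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}
```

With the digest from step 6, it could be used as:

  verify_md5 savanna-0.1-hdp-img.tar.gz 4998e403f559e85be1a5955186a5638e && tar xzf savanna-0.1-hdp-img.tar.gz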


Actual results:

. nova list reports instance w/ novanetwork=<priv ip>
. ssh to instance (root/swordfish) fails
. connecting to console via Horizon shows no ip on eth0
. ifup from console shows offers sent, then a checksum error
. /var/log/messages on hypervisor host shows dnsmasq offer requests and offers sent
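
One way to look at the checksum symptom from the hypervisor side is to capture DHCP traffic with tcpdump -vv, which annotates packets whose UDP checksum does not verify with "bad udp cksum". The helper below is only a sketch that counts such lines; the function name is hypothetical, and the bridge name in the usage line is an assumption that depends on the local nova-network setup:

```shell
#!/bin/sh
# Sketch: count tcpdump lines that flag a bad UDP checksum.
# Reads `tcpdump -vv` text output on stdin and prints the number of
# matching lines. count_bad_udp_cksum is a hypothetical helper name.
count_bad_udp_cksum() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the helper's exit status at 0 in that case.
    grep -c 'bad udp cksum' || true
}
```

Run as root on the hypervisor host while the instance requests a lease (br100 is an assumed bridge name):

  tcpdump -l -n -vv -i br100 -c 20 'udp and (port 67 or port 68)' | count_bad_udp_cksum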


Expected results:

Instance eth0 up w/ working private IP


Additional info:

Running 'iptables -A POSTROUTING -t mangle -p udp --dport 68 -j CHECKSUM --checksum-fill' on hypervisor host before launching instance resolves the issue.
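
Because -A appends a new copy of the rule on every run, a re-runnable variant may be handy. The following is a minimal sketch under that assumption: it lists the mangle POSTROUTING chain with -S and appends the rule only when no matching line is found. The IPTABLES override variable and the function name are illustrative and not part of the original workaround:

```shell
#!/bin/sh
# Sketch: add the DHCP CHECKSUM rule only if it is not already present.
# IPTABLES can be overridden (e.g. IPTABLES=echo) for a dry run; that knob
# is an assumption of this sketch, not part of the original report.
IPTABLES="${IPTABLES:-iptables}"

ensure_checksum_rule() {
    # -S prints the chain's rules in iptables-save format, one per line.
    if $IPTABLES -t mangle -S POSTROUTING 2>/dev/null \
        | grep -q -- '--dport 68 -j CHECKSUM --checksum-fill'; then
        echo "rule already present"
    else
        $IPTABLES -t mangle -A POSTROUTING -p udp --dport 68 \
            -j CHECKSUM --checksum-fill
    fi
}
```

Calling ensure_checksum_rule as root on the hypervisor host before launching the instance should have the same effect as the one-line command above.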

Note: it is unclear why this is not an issue with the http://mattdm.fedorapeople.org/cloud-images/Fedora18-Cloud-x86_64-latest.qcow2 image.
Comment 1 Kashyap Chamarthy 2013-11-04 05:39:02 EST
A related old bug (Folsom time-frame) where I had to investigate why an instance booted from a nova volume wasn't getting DHCP leases

  https://bugzilla.redhat.com/show_bug.cgi?id=929194

This was resolved with the same iptables workaround mentioned in the description. Perhaps this rule should simply be documented?

(The above bug was tested with a cirros image, which showed the DHCP leases problem in its serial console log.)
Comment 2 Kashyap Chamarthy 2013-12-16 10:36:45 EST
Matthew, since you're using RDO, can you try the latest Havana bits and see whether you can still reproduce this problem?

This should fetch the latest:
  $ yum install -y http://rdo.fedorapeople.org/rdo-release.rpm

Or you could fetch it independently from here:

  http://repos.fedorapeople.org/repos/openstack/openstack-havana/
Comment 3 Matthew Farrellee 2013-12-17 09:27:59 EST
I have not seen this issue with RDO Havana Beta bits. I've not built a cluster w/ the current RDO Havana bits.
Comment 4 Kashyap Chamarthy 2013-12-24 06:30:56 EST
(In reply to Matthew Farrellee from comment #3)
> I have not seen this issue with RDO Havana Beta bits. I've not built a
> cluster w/ the current RDO Havana bits.

I see that you're using Nova network. You might also want to try Neutron networking (as Nova networking will be deprecated in favor of Neutron).

Also, just to note: to debug these kinds of DHCP offer problems in Neutron, you can create a separate dnsmasq.log:

  $ cat /etc/neutron/dnsmasq.conf 
  log-facility = /var/log/neutron/dnsmasq.log
  log-dhcp

Also ensure the config file path is set in dhcp_agent.ini:

  $ grep dnsmasq_config_file dhcp_agent.ini 
  dnsmasq_config_file = /etc/neutron/dnsmasq.conf

Restart the Neutron services so that the above takes effect:

  $ openstack-service restart neutron
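
The dnsmasq.conf part of the steps above can be sketched as a small helper that adds the two logging options only when they are missing, so it is safe to run more than once. The function name is hypothetical, and the path in the usage line is the one shown in this comment:

```shell
#!/bin/sh
# Sketch: append the two dnsmasq logging options from this comment to a
# config file, skipping any line that is already there.
# enable_dnsmasq_logging is a hypothetical helper name.
enable_dnsmasq_logging() {
    conf="$1"
    # Create the file if it does not exist yet.
    touch "$conf" || return 1
    for line in 'log-facility = /var/log/neutron/dnsmasq.log' 'log-dhcp'; do
        # -x: match the whole line, -F: fixed string, -q: no output.
        grep -qxF -- "$line" "$conf" || echo "$line" >> "$conf"
    done
}
```

For example, as root:

  enable_dnsmasq_logging /etc/neutron/dnsmasq.conf

The dnsmasq_config_file setting in dhcp_agent.ini and the service restart still need to be handled as described above.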


As of now, I'm closing this as FIXED, CURRENTRELEASE. Also, I don't see the above issue with the latest RDO Havana bits: openstack-nova-2013.2-4

Matthew, feel free to reopen if you hit this issue again with latest Havana RDO bits.
