1104082 – Instance doesn't get DHCP offer when using nova network with VLAN manager

Summary: Instance doesn't get DHCP offer when using nova network with VLAN manager

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-nova
Sub Component:
Version:	4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	async
Target Release:	4.0
Assignee:	Russell Bryant
QA Contact:	Ofer Blaut
Docs Contact:
URL:
Whiteboard:
Depends On:	911005
Blocks:

Reported:	2014-06-03 09:03 UTC by Pavel Sedlák
Modified:	2022-07-09 06:52 UTC
CC List:	7 users
Fixed In Version:	openstack-nova-2013.2.3-9.el6ost
Doc Type:	Bug Fix
Doc Text:
Clone Of:	911005
Environment:
Last Closed:	2014-08-21 00:40:22 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
OpenStack gerrit	96732	None	None	None	Never
Red Hat Issue Tracker	OSP-16458	None	None	None	2022-07-09 06:52:37 UTC
Red Hat Product Errata	RHSA-2014:1084	normal	SHIPPED_LIVE	Moderate: openstack-nova security, bug fix, and enhancement update	2014-08-21 04:34:32 UTC

Comment 2 Pavel Sedlák 2014-06-03 09:26:30 UTC

When the compute service runs on the same node as dnsmasq, the DHCP offer does not reach an instance booted on that node.
Reproduced with VLANManager, though FlatDHCP is probably affected as well.

The issue is that the packets do not get their checksum filled in (as they do not leave via a network interface). This does not happen with software emulation, because that handles the checksum itself, so it only happens on KVM with nested virtualization or on real hardware.
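
One way to confirm the missing checksum is to let tcpdump verify it. The following is only a sketch: it assumes tcpdump is run in verbose mode (for example with -vv), where it marks UDP packets that fail verification with "bad udp cksum". The function name count_bad_udp_cksum is made up for illustration.

```shell
#!/bin/sh
# Count packets that tcpdump reports with a bad UDP checksum.
# Reads tcpdump text output on stdin and prints the number of matching lines.
# Assumption: tcpdump was run in verbose mode, so failed checksum
# verification shows up as the marker "bad udp cksum".
count_bad_udp_cksum() {
    # grep -c prints 0 and returns non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c 'bad udp cksum' || true
}
```

It could be used like: tcpdump -nn -vv -c 20 -i vnet1 port bootps or port bootpc | count_bad_udp_cksum (vnet1 being the instance interface, as in the steps further down). A non-zero count for DHCP replies would be consistent with the behaviour described here.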

> openstack-nova-api.noarch 2013.2.3-7.el6ost
> openstack-nova-compute.noarch 2013.2.3-7.el6ost
> openstack-nova-network.noarch 2013.2.3-7.el6ost
> python-nova.noarch 2013.2.3-7.el6ost
> python-novaclient.noarch 1:2.15.0-4.el6ost

It should be possible to reproduce it with a single-node deployment,
probably with a packstack --allinone style setup (+ nova-network).

- boot some image (e.g. cirros)
- check the instance log to confirm that it did not get DHCP replies and is retrying
- - if this does not reproduce the problem, use a VLANManager setup instead (it does not need to be really routed to the network, as this is a local all-in-one issue anyway)
- have tcpdump listening on the interface for that instance (check its libvirt XML file)
- - wrap that tcpdump in a bash loop or similar, as the interface will disappear for a short moment
- - like: for x in $(seq 5); do tcpdump -lenx -nn -i vnet1 -s 1500 port bootps or port bootpc; sleep 0.2; done
- hard reboot the instance
- see that the DHCP request goes out and the DHCP offer comes in
- but the DHCP client in the VM is still waiting for the reply
(optionally, you can also run tcpdump inside the instance, etc.)
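
The capture loop from the steps above could be wrapped in a small function, roughly like the sketch below. The function name capture_dhcp and the TCPDUMP override variable are made up for illustration (the override only exists so the loop can be dry-run with a stub); the tcpdump options are the ones from the one-liner above.

```shell
#!/bin/sh
# Re-run the DHCP capture several times on one interface.
# The interface disappears briefly during a hard reboot, which makes
# tcpdump exit, so the capture is simply restarted after a short pause.
#   $1 = interface name (e.g. vnet1)
#   $2 = number of attempts (default 5)
# TCPDUMP is a hypothetical override for the binary to run.
capture_dhcp() {
    ifc=$1
    tries=${2:-5}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # A failing capture (interface gone) must not abort the loop.
        "${TCPDUMP:-tcpdump}" -lenx -nn -i "$ifc" -s 1500 port bootps or port bootpc || true
        sleep 0.2
        i=$((i + 1))
    done
}
```

Running capture_dhcp vnet1 as root before the hard reboot should then behave like the original loop.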

The workaround is to deploy an iptables rule that enforces filling of the checksum, like:
> iptables -t mangle -A POSTROUTING -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill
The fix already contained in the icehouse branch does this per corresponding interface.
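
A per-interface variant of the workaround rule might look like the sketch below. This is not taken from the nova change itself: scoping the rule with an "-o" output-interface match is an assumption, and the function name checksum_fill_rule and the bridge name br100 are only illustrative. The function prints the rule instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Print (not run) an iptables rule that fills in the UDP checksum of
# DHCP replies (destination port 68) leaving through one interface.
#   $1 = outgoing interface, e.g. the bridge the instances are attached to
# Assumption: "-o <interface>" is used to limit the rule to that interface.
checksum_fill_rule() {
    printf 'iptables -t mangle -A POSTROUTING -o %s -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill\n' "$1"
}
```

For example, checksum_fill_rule br100 prints the rule for a bridge named br100; the printed line can then be run as root once it looks right.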

As this was already fixed in the icehouse branch, a backport is proposed at https://review.openstack.org/#/c/96732/ .

Comment 7 errata-xmlrpc 2014-08-21 00:40:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1084.html
