
Bug 1104082

Summary: Instance doesn't get DHCP offer when using nova network with VLAN manager
Product: Red Hat OpenStack
Reporter: Pavel Sedlák <psedlak>
Component: openstack-nova
Assignee: Russell Bryant <rbryant>
Status: CLOSED ERRATA
QA Contact: Ofer Blaut <oblaut>
Severity: high
Priority: medium
Version: 4.0
CC: ajeain, ndipanov, oblaut, sgordon, tdunnon, vpopovic, yeylon
Target Milestone: async
Keywords: OtherQA, ZStream
Target Release: 4.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-nova-2013.2.3-9.el6ost
Doc Type: Bug Fix
Clone Of: 911005
Last Closed: 2014-08-21 00:40:22 UTC
Type: Bug
Bug Depends On: 911005

Comment 2 Pavel Sedlák 2014-06-03 09:26:30 UTC
When the compute service runs on the same node as dnsmasq, the DHCP offer does not reach instances booted on that node.
Reproduced with VlanManager, though FlatDHCP could (and probably should) be affected as well.

The issue is that the packets never get their checksum filled in, because they do not leave through a network interface. This does not happen with software emulation, which handles the checksum itself, so it only happens on KVM with nested virtualization or on real hardware.

> openstack-nova-api.noarch 2013.2.3-7.el6ost
> openstack-nova-compute.noarch 2013.2.3-7.el6ost
> openstack-nova-network.noarch 2013.2.3-7.el6ost
> python-nova.noarch 2013.2.3-7.el6ost
> python-novaclient.noarch 1:2.15.0-4.el6ost

It should be possible to reproduce this with a single-node deployment,
probably with a packstack --allinone style setup (+novanetwork).

- boot some image (e.g. cirros)
- check in the instance log that it did not get DHCP replies and is retrying
- - if this does not reproduce the issue, use a VlanManager setup instead (it does not need to be really routed to the network; this is a local all-in-one issue anyway)
- have tcpdump listening on the interface of that server instance (check its libvirt XML file)
- - wrap that tcpdump in a bash loop or similar, as the interface will disappear for a short moment
- - like: for x in $(seq 5); do tcpdump -lenx -nn -i vnet1 -s 1500 port bootps or port bootpc; sleep 0.2; done
- hard reboot the instance
- see that the DHCP request goes out and the DHCP offer comes in
- but the DHCP client in the VM is still waiting for the reply
(optionally, you can also run tcpdump inside the instance)
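
The capture step above can be wrapped in a small helper so that it survives the interface disappearing during the hard reboot. This is only a sketch: the function name capture_dhcp, the DRY_RUN switch, and the default retry count and delay are illustrative assumptions, while the tcpdump arguments are the ones from the steps above. vnet1 is just the example interface name; the real one comes from the instance's libvirt XML.

```shell
#!/bin/sh
# Sketch: restart tcpdump a few times, since the tap interface briefly
# disappears while the instance is hard rebooted.
# capture_dhcp and DRY_RUN are hypothetical names used for illustration.

capture_dhcp() {
    iface=$1            # instance interface, e.g. vnet1 (see libvirt XML)
    retries=${2:-5}     # how many times to (re)start tcpdump
    delay=${3:-0.2}     # pause between restarts, in seconds
    i=0
    while [ "$i" -lt "$retries" ]; do
        if [ -n "${DRY_RUN:-}" ]; then
            # Only print the command, so the loop can be checked without root.
            echo "tcpdump -lenx -nn -i $iface -s 1500 port bootps or port bootpc"
        else
            # tcpdump exits when the interface goes away; loop to reattach.
            tcpdump -lenx -nn -i "$iface" -s 1500 port bootps or port bootpc
            sleep "$delay"
        fi
        i=$((i + 1))
    done
}

# Example (as root, before hard rebooting the instance):
#   capture_dhcp vnet1
```

Setting DRY_RUN=1 before calling the function prints the commands instead of running them, which is handy for checking the interface name first.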

The workaround is to deploy an iptables rule that forces the checksum to be filled in, like:
> iptables -tmangle -A POSTROUTING -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill
The fix already contained in the icehouse branch does this per corresponding interface.
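
For comparison, a per-interface variant of the workaround might look roughly like the sketch below. It is not taken from the icehouse patch: restricting the rule with -o to the bridge dnsmasq answers on is an assumption about what "per corresponding interface" means, br100 is a hypothetical bridge name, and the helper names and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Sketch: same CHECKSUM rule as the global workaround, but limited to
# packets leaving through one bridge.
# Assumption: "-o <bridge>" approximates the per-interface behaviour;
# check the actual nova change before relying on it.

checksum_fill_rule() {
    # $1: outgoing bridge interface, e.g. br100 (hypothetical name)
    echo "-o $1 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill"
}

add_checksum_fill() {
    rule=$(checksum_fill_rule "$1")
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of touching the firewall.
        echo "iptables -t mangle -A POSTROUTING $rule"
    else
        # $rule is intentionally unquoted so it splits into arguments.
        iptables -t mangle -A POSTROUTING $rule
    fi
}

# Example (as root):
#   add_checksum_fill br100
```

Note that -A appends unconditionally, so calling the helper twice adds the rule twice.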

As this was already fixed in the icehouse branch, a backport has been proposed at https://review.openstack.org/#/c/96732/ .

Comment 7 errata-xmlrpc 2014-08-21 00:40:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1084.html