Bug 1575026

Summary: Can't PXE/iPXE boot with dnsmasq and DHCPv6
Product: Red Hat Enterprise Linux 7
Reporter: Derek Higgins <derekh>
Component: dnsmasq
Assignee: Petr Menšík <pemensik>
Status: CLOSED WONTFIX
QA Contact: qe-baseos-daemons
Severity: medium
Priority: medium
Version: 7.5
CC: bfournie, dsinglet, hjensas, lzap, marjones, mhulan, thozza
Target Milestone: rc
Keywords: FutureFeature, Patch, Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Clone Of:
: 1779187 1804382 1810172
Last Closed: 2020-04-21 12:51:07 UTC
Type: Bug
Bug Blocks: 1459187, 1779187, 1780662, 1782947, 1804382, 1810172    
Attachments:
RPM SPEC file with the 3 patches attached to this bug. (flags: none)
Patch30 (flags: none)
Patch31 (flags: none)
Patch32 (flags: none)

Description Derek Higgins 2018-05-04 14:56:07 UTC
Description of problem:
When network booting over DHCPv6 and chainloading from PXE to iPXE, dnsmasq refuses to hand out a static IP address once the DHCPv6 client ID (clid) or IAID changes.

This prevents a static IP from being used for PXE or iPXE booting. During the boot process, PXE sends a DHCPREQUEST and gets an IP address. After chainloading to iPXE, the clid and IAID change (they are generated with a different algorithm) and dnsmasq responds with "no addresses available". The same thing happens again when the OS takes over and yet another algorithm is used.

As a test, removing the check in check_address (rfc3315.c) gets rid of the problem.
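
The behaviour described above can be modelled roughly as follows. This is a simplified Python sketch, not dnsmasq's code: the `Lease` type, the `leases` table and this Python `check_address` are hypothetical stand-ins, and the assumption that the check compares both the client ID and the IAID of an existing lease is taken from the description in this report. The client ID and IAID values are the ones that appear in the log under "Actual results".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    clid: bytes  # DHCPv6 client identifier
    iaid: int    # identity association identifier


def check_address(leases, addr, clid, iaid):
    """Return True if addr may be handed to the client (clid, iaid).

    Hypothetical model: an address that already has a lease is only
    handed out again to the exact same clid/IAID pair.
    """
    lease = leases.get(addr)
    if lease is None:
        return True  # no lease yet, the address is free
    return lease.clid == clid and lease.iaid == iaid


clid = bytes.fromhex("00047bcaf3ce168f634aad20fe7e58a603e4")
leases = {}

# First REQUEST: the address is free, so it is granted and recorded.
first_ok = check_address(leases, "fd00:1101::102", clid, 3980328959)
leases["fd00:1101::102"] = Lease(clid, 3980328959)

# Later REQUEST from the same machine, but with a different IAID.
second_ok = check_address(leases, "fd00:1101::102", clid, 4018141183)

print(first_ok, second_ok)
```

In this model the second request is refused even though it comes from the same machine and the address is statically reserved for it, which corresponds to the "no addresses available" reply.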

Version-Release number of selected component (if applicable):
dnsmasq-2.76-5.el7.x86_64

How reproducible:
Every time

Steps to Reproduce:
I'm encountering this problem in OpenStack, but have isolated it so that it can be reproduced with the following configs:

==== addn_hosts
fd00:1101::0101 host101
fd00:1101::0102 host102

==== boot.ipxe
#!ipxe
  
set base http://[fd00:1101::0002]
kernel ${base}/vmlinuz initrd=centos_initrd_ipv6.img rdinit=/usr/sbin/init
initrd ${base}/centos_initrd_ipv6.img
boot

==== dnsmasq.conf
dhcp-match=ipxe,175
dhcp-option-force=26,1450

dhcp-userclass=set:ipxe6,iPXE
dhcp-vendorclass=set:ipxe6,HTTP
dhcp-option=tag:!ipxe6,option6:bootfile-url,tftp://[fd00:1101::0002]/ipxe.efi
dhcp-option=tag:ipxe6,option6:bootfile-url,http://[fd00:1101::0002]/ipv6/boot.ipxe

==== host
fa:16:3e:fe:87:61,host101,[fd00:1101::0101]
fa:16:3e:ed:ad:d4,host102,[fd00:1101::0102]

==== run-dnsmasq
#!/bin/bash -x
sudo rm /var/lib/dnsmasq/dnsmasq.leases
sudo dnsmasq \
 -d --log-queries --log-dhcp \
 --no-hosts --no-resolv --strict-order --except-interface=lo --bind-interfaces --interface=eth1 \
 --conf-file=/var/www/html/ipv6/dnsmasq.conf \
 --addn-hosts=/var/www/html/ipv6/addn_hosts \
 --dhcp-hostsfile=/var/www/html/ipv6/host --dhcp-range=set:tag0,fd00:1101::,static,64,86400s

==== /etc/radvd.conf 

interface eth1
{
        AdvManagedFlag on;
        AdvSendAdvert on;
        AdvOtherConfigFlag on;
        MinRtrAdvInterval 3;
        MaxRtrAdvInterval 60;
};
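
As a sanity check on the configuration above, the following Python sketch parses the two entries from the `host` file and confirms that each address lies inside the `fd00:1101::/64` prefix given to `--dhcp-range`. It is only an illustration: `parse_hosts` is a hypothetical helper that handles just the three-field `mac,name,[address]` form used here, not the full dhcp-host syntax.

```python
import ipaddress

# Values copied from the reproducer configuration above.
DHCP_RANGE = ipaddress.IPv6Network("fd00:1101::/64")
HOSTS_FILE = """\
fa:16:3e:fe:87:61,host101,[fd00:1101::0101]
fa:16:3e:ed:ad:d4,host102,[fd00:1101::0102]
"""


def parse_hosts(text):
    """Yield (mac, name, address) for each 'mac,name,[ipv6]' line."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mac, name, addr = line.split(",")
        yield mac, name, ipaddress.IPv6Address(addr.strip("[]"))


hosts = list(parse_hosts(HOSTS_FILE))
for mac, name, addr in hosts:
    # Prints the compressed address form and whether it is in the range.
    print(mac, name, addr, addr in DHCP_RANGE)
```

The compressed form printed for host102, `fd00:1101::102`, is the same address that dnsmasq reports in its log output.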



More info can be found in this thread from when I last looked at the problem:
http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2017q1/011267.html
I'm currently looking at the solution proposed in that thread (making dhcp-host conditional), but in the meantime I am filing this Bugzilla as others may hit the problem.

Actual results:
$ ./run-dnsmasq  |& grep -e client-id -e IAID -e iaaddr -e http -e "no add" -e DHCP
dnsmasq: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
dnsmasq-dhcp: DHCPv6, static leases only on fd00:1101::, lease time 1d
dnsmasq-dhcp: DHCP, sockets bound exclusively to interface eth1
dnsmasq-dhcp: 11337181 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15209603 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15209603 DHCPSOLICIT(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 
dnsmasq-dhcp: 15209603 DHCPADVERTISE(eth1) fd00:1101::102 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 host102
dnsmasq-dhcp: 15209603 sent size: 18 option:  1 client-id  00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e...
dnsmasq-dhcp: 15209603 sent size: 40 option:  3 ia-na  IAID=4018141183 T1=43200 T2=75600
dnsmasq-dhcp: 15209603 nest size: 24 option:  5 iaaddr  fd00:1101::102 PL=86400 VL=86400
dnsmasq-dhcp: 15275139 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15275139 DHCPSOLICIT(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 
dnsmasq-dhcp: 15275139 DHCPADVERTISE(eth1) fd00:1101::102 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 host102
dnsmasq-dhcp: 15275139 sent size: 18 option:  1 client-id  00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e...
dnsmasq-dhcp: 15275139 sent size: 40 option:  3 ia-na  IAID=3980328959 T1=43200 T2=75600
dnsmasq-dhcp: 15275139 nest size: 24 option:  5 iaaddr  fd00:1101::102 PL=86400 VL=86400
dnsmasq-dhcp: 15340675 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15340675 DHCPREQUEST(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 
dnsmasq-dhcp: 15340675 DHCPREPLY(eth1) fd00:1101::102 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 host102
dnsmasq-dhcp: 15340675 sent size: 18 option:  1 client-id  00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e...
dnsmasq-dhcp: 15340675 sent size: 40 option:  3 ia-na  IAID=3980328959 T1=43200 T2=75600
dnsmasq-dhcp: 15340675 nest size: 24 option:  5 iaaddr  fd00:1101::102 PL=86400 VL=86400
dnsmasq-dhcp: 15406211 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15406211 DHCPREQUEST(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 
dnsmasq-dhcp: 15406211 DHCPREPLY(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 no addresses available
dnsmasq-dhcp: 15406211 sent size: 18 option:  1 client-id  00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e...
dnsmasq-dhcp: 15406211 sent size: 32 option:  3 ia-na  IAID=4018141183 T1=4294967295 T2=4294967295
dnsmasq-dhcp: 15406211 sent size: 24 option: 13 status  2 no addresses available
dnsmasq-dhcp: 15406211 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15406211 DHCPREQUEST(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 
dnsmasq-dhcp: 15406211 DHCPREPLY(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 no addresses available
dnsmasq-dhcp: 15406211 sent size: 18 option:  1 client-id  00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e...
dnsmasq-dhcp: 15406211 sent size: 32 option:  3 ia-na  IAID=4018141183 T1=4294967295 T2=4294967295
dnsmasq-dhcp: 15406211 sent size: 24 option: 13 status  2 no addresses available
dnsmasq-dhcp: 15406211 available DHCPv6 subnet: fd00:1101::/64
dnsmasq-dhcp: 15406211 DHCPREQUEST(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 
dnsmasq-dhcp: 15406211 DHCPREPLY(eth1) 00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e:58:a6:03:e4 no addresses available
dnsmasq-dhcp: 15406211 sent size: 18 option:  1 client-id  00:04:7b:ca:f3:ce:16:8f:63:4a:ad:20:fe:7e...
dnsmasq-dhcp: 15406211 sent size: 32 option:  3 ia-na  IAID=4018141183 T1=4294967295 T2=4294967295
dnsmasq-dhcp: 15406211 sent size: 24 option: 13 status  2 no addresses available


Expected results:
A static IP to be handed out in place of "no addresses available"
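
To make the IAID change easier to spot in the output above, a small Python sketch like the following can group the log lines by the number that follows `dnsmasq-dhcp:` (assumed here to identify one request/reply exchange) and record the IAID and whether the reply carried "no addresses available". The regular expressions and the `summarize` helper are illustrative assumptions; the sample lines are copied from the log.

```python
import re

# A few lines copied from the log above.
LOG = """\
dnsmasq-dhcp: 15209603 sent size: 40 option:  3 ia-na  IAID=4018141183 T1=43200 T2=75600
dnsmasq-dhcp: 15275139 sent size: 40 option:  3 ia-na  IAID=3980328959 T1=43200 T2=75600
dnsmasq-dhcp: 15340675 sent size: 40 option:  3 ia-na  IAID=3980328959 T1=43200 T2=75600
dnsmasq-dhcp: 15406211 sent size: 32 option:  3 ia-na  IAID=4018141183 T1=4294967295 T2=4294967295
dnsmasq-dhcp: 15406211 sent size: 24 option: 13 status  2 no addresses available
"""

IAID_RE = re.compile(r"^dnsmasq-dhcp: (\d+) .*\bIAID=(\d+)")
REFUSED_RE = re.compile(r"^dnsmasq-dhcp: (\d+) .*status\s+2 no addresses available")


def summarize(log):
    """Map each exchange number to its IAID and whether it was refused."""
    summary = {}
    for line in log.splitlines():
        iaid = IAID_RE.match(line)
        refused = REFUSED_RE.match(line)
        if iaid:
            entry = summary.setdefault(iaid.group(1), {"iaid": None, "refused": False})
            entry["iaid"] = int(iaid.group(2))
        if refused:
            entry = summary.setdefault(refused.group(1), {"iaid": None, "refused": False})
            entry["refused"] = True
    return summary


summary = summarize(LOG)
for number, info in summary.items():
    print(number, info["iaid"], "refused" if info["refused"] else "ok")
```

On these sample lines, the exchange that succeeds uses IAID 3980328959, while the refused one uses IAID 4018141183 with the same client ID.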

Comment 4 Bob Fournier 2019-12-03 14:05:55 UTC
Note I've also seen this issue just with PXE.  The flow is like this:

UEFI PXE client sends a Solicit with DNS option request
Dnsmasq responds with IPv6 address and DNS option response
UEFI PXE client sends a 2nd solicit with BootUrl and BootFileParameters option requests and a different IAID
Dnsmasq responds with "no addresses available"

Comment 10 Bob Fournier 2019-12-18 15:30:14 UTC
Tomas - unfortunately the fact that the patch is upstream doesn't mean it will get approved and merged.  That patch just hit its 1 year anniversary and the submitter again requested inclusion (see [1]), but the upstream maintainer does not seem to be inclined to include it.

Is there any possibility of including this patch just downstream?  Do we have any precedent for this in dnsmasq?  Thanks.


[1] http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2019q4/013623.html

Comment 11 Tomáš Hozza 2019-12-18 16:11:23 UTC
(In reply to Bob Fournier from comment #10)
> Tomas - unfortunately the fact that the patch is upstream doesn't mean it
> will get approved and merged.  That patch just hit its 1 year anniversary
> and the submitter again requested inclusion (see [1]), but the upstream
> maintainer does not seem to be inclined to include it.
> 
> Is there any possibility of including this patch just downstream?  Do we
> have any precedent for this in dnsmasq?  Thanks.
> 
> 
> [1]
> http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2019q4/013623.html

We recently included one critical fix as a downstream patch, but it is definitely not something that we want to continue doing. dnsmasq's code is hard to maintain, and any downstream change makes it even harder to maintain and to keep the functionality working in new upstream versions. Working with upstream to include a different version of the fix, or possibly replacing dnsmasq with something different, is the preferred way to go. My understanding is that OpenStack runs not only on RHEL. How are other distributions solving this issue?

Comment 12 Harald Jensås 2020-02-19 14:06:32 UTC
Created attachment 1664048
RPM SPEC file with the 3 patches attached to this bug.

Comment 13 Harald Jensås 2020-02-19 14:07:20 UTC
Created attachment 1664049
Patch30

Comment 14 Harald Jensås 2020-02-19 14:08:03 UTC
Created attachment 1664050
Patch31

Comment 15 Harald Jensås 2020-02-19 14:08:59 UTC
Created attachment 1664051
Patch32

Comment 25 Dana Singleterry 2020-03-27 12:59:53 UTC
Ack to lzap's comment 24. Not needed for RHEL 7, but needed for RHEL 8 instead. NeedInfo flag reset.