Description of problem:

Recently, I have been working with the dnsmasq developer on fixing a couple of dhcp6-related problems. In the process, we discovered a problem with the way libvirt runs dnsmasq as a dhcp4 server. This issue became urgent for libvirt when dnsmasq in F17 was updated from 2.59 to 2.63.

First, to hit this problem, you must be running multiple dnsmasq instances, each one supporting a different IPv4 network. Although --bind-interfaces is specified, it appears to have no effect unless --interface is also specified. The obvious fix is to add --interface <dev> to the command line. The relevant information can be seen in this message:

http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2012q4/006418.html

Note: this issue has no effect on dnsmasq's dhcp6 service, since it does handle/filter all packets.

I am pasting the heart of the problem here:

----------------------------------------------
The problem arises when you have more than one instance of dnsmasq doing DHCP. Each instance is listening on *:67. Now, a packet arrives for port 67 on a particular interface. How is the kernel supposed to know which instance of dnsmasq to send it to? It can't, and sometimes gets it wrong. This is normally masked because DHCP clients fall back to broadcast, and broadcasts get sent to _all_ the listeners (check the bug report I referenced), but there are situations where this fails.

For DNS, with --bind-interfaces, there isn't a problem, because when dnsmasq is configured with --interface or --listen-address then port 53 is bound to a particular address, not the wildcard address. DHCP always binds the wildcard address (there are some strange packets in a DHCP exchange that get missed otherwise.) As we've seen, --interface or --listen-address is an access control mechanism in the DHCP code: receive all packets and filter.
The change in 2.61 is that when dnsmasq is configured with exactly one --interface, it sets an obscure Linux-only socket option, SO_BINDTODEVICE, on the DHCP socket (which is bound to *:67). That has the effect of getting the right packets to the right dnsmasq instance. It only works for exactly one --interface (otherwise, dnsmasq would have to start handling multiple DHCP sockets - a big change.) The SO_BINDTODEVICE stuff only works with --interface, not --listen-address, hence the desirability of moving libvirt from --listen-address to --interface.

This stuff is all horrible, a legacy of the LSD-inspired Berkeley sockets API. dnsmasq was originally intended to be run as one daemon on a machine, handling multiple interfaces. Adapting to the one-dnsmasq-per-interface paradigm has been a long hard road.
---------------------------------------------------
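To make the SO_BINDTODEVICE step described above more concrete, here is a minimal Python sketch of a UDP socket that is bound to the wildcard address and then restricted to a single device. This is not dnsmasq's code (dnsmasq is written in C); the helper names are hypothetical, and any interface name passed in is the caller's assumption. The option is Linux-only and setting it normally requires elevated privileges.

```python
import socket

IFNAMSIZ = 16  # Linux limit on interface name length, including the trailing NUL


def device_option_value(ifname: str) -> bytes:
    """Encode an interface name in the conventional NUL-terminated form
    passed to SO_BINDTODEVICE. (Hypothetical helper for this sketch.)"""
    raw = ifname.encode("ascii") + b"\0"
    if len(raw) < 2 or len(raw) > IFNAMSIZ:
        raise ValueError(f"invalid interface name: {ifname!r}")
    return raw


def open_device_bound_udp(ifname: str, port: int = 67) -> socket.socket:
    """Open a UDP socket on *:port that only receives packets arriving on
    `ifname`. Linux-only; usually needs root or CAP_NET_RAW."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Several per-interface servers share the same wildcard port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Restrict the socket to one device so the kernel can tell the
        # otherwise identical *:port listeners apart.
        sock.setsockopt(
            socket.SOL_SOCKET,
            socket.SO_BINDTODEVICE,
            device_option_value(ifname),
        )
        sock.bind(("", port))
    except Exception:
        sock.close()
        raise
    return sock
```

With one such socket per instance, each listener still binds *:67 but only sees traffic from its own device, which is the behaviour the quoted message attributes to SO_BINDTODEVICE.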
Created attachment 629650
add --interface to dnsmasq command line

I have not yet had a chance to put this through the git process, but this patch should correct the problem. Basically, as described by the dnsmasq developer:

"The problem is that, without SO_BINDTODEVICE, there is no guarantee that the kernel will route DHCP (v4 or v6) packets to the correct instance of dnsmasq, when there is more than one."

The --interface parameter is added to the command line and nothing is removed.
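As a rough illustration of what "add --interface and remove nothing" means for the command line, here is a small Python sketch. It is not the attached patch (libvirt is written in C) and it shows only the options discussed in this report, not the full set libvirt passes. The function name, the address 192.168.122.1 and the bridge name virbr0 are assumptions chosen for the example.

```python
from typing import List, Optional


def build_dnsmasq_args(listen_address: str,
                       interface: Optional[str] = None) -> List[str]:
    """Return an abridged dnsmasq argv. (Hypothetical helper.)

    Without `interface` this mirrors the existing style: --bind-interfaces
    plus --listen-address. With `interface`, --interface is appended and
    the existing options are left untouched.
    """
    args = [
        "dnsmasq",
        "--bind-interfaces",
        f"--listen-address={listen_address}",
    ]
    if interface is not None:
        args.append(f"--interface={interface}")
    return args


if __name__ == "__main__":
    print(" ".join(build_dnsmasq_args("192.168.122.1")))
    print(" ".join(build_dnsmasq_args("192.168.122.1", "virbr0")))
```

The second call yields the same options as the first with --interface=virbr0 appended, matching the intent stated above.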
patch submitted to upstream git
(In reply to comment #2)
> patch submitted to upstream git

https://www.redhat.com/archives/libvir-list/2012-October/msg01042.html
This is being closed since it turns out not to be a problem for libvirt. Here is the final statement from Simon Kelley (dnsmasq developer) about the problem:

http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2012q4/006445.html

-----------------------------------------------------------------------
OK, so this is vaguely embarrassing. Having checked the actual code, rather than the changelog, I see that dnsmasq >=2.61 _already_ does the right thing. Setting --bind-interfaces* and a single --listen-address will cause the code to set SO_BINDTODEVICE on the DHCP socket(s).

So, there is not a problem with the existing libvirt command line.

Gene, apologies for sending you on a wild-goose chase with this.

* or bind-dynamic on 2.63 and later.

Cheers,

Simon.
--------------------------------------------------------------------------
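The condition in that final statement can be written down as a small predicate. The Python sketch below encodes only what the message says (version >= 2.61, --bind-interfaces or, on 2.63 and later, bind-dynamic, plus a single --listen-address); it is not derived from the dnsmasq source, and the function and parameter names are hypothetical. A False result means the statement does not cover that configuration, not that dnsmasq necessarily leaves the socket unbound.

```python
from typing import Sequence, Tuple


def covered_by_final_statement(version: Tuple[int, int],
                               bind_interfaces: bool,
                               bind_dynamic: bool,
                               listen_addresses: Sequence[str]) -> bool:
    """True when the quoted statement says SO_BINDTODEVICE gets set on the
    DHCP socket(s). Models the mailing-list text only."""
    if version < (2, 61):
        return False
    # bind-dynamic only counts on 2.63 and later, per the footnote.
    binding = bind_interfaces or (bind_dynamic and version >= (2, 63))
    return bool(binding and len(listen_addresses) == 1)
```

For the setup described in this report (dnsmasq 2.63, --bind-interfaces and one --listen-address per instance) the predicate is true, which is why the existing libvirt command line is considered fine.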