Bug 884957
| Summary: | guest can not get NAT IP from dnsmasq-2.48-10 | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Huang Wenlong <whuang> | ||||||||||||||
| Component: | dnsmasq | Assignee: | Tomáš Hozza <thozza> | ||||||||||||||
| Status: | CLOSED ERRATA | QA Contact: | qe-baseos-daemons | ||||||||||||||
| Severity: | high | Docs Contact: | |||||||||||||||
| Priority: | high | ||||||||||||||||
| Version: | 6.4 | CC: | acathrow, amahdal, azelinka, bugproxy, cwei, dallan, dyasny, dyuan, eblake, gnichols, jbrier, jdenemar, jscotka, kdudka, kraxel, laine, mzhan, ovasik, poelstra, rwu | ||||||||||||||
| Target Milestone: | rc | Keywords: | Patch | ||||||||||||||
| Target Release: | --- | ||||||||||||||||
| Hardware: | x86_64 | ||||||||||||||||
| OS: | Linux | ||||||||||||||||
| Whiteboard: | |||||||||||||||||
| Fixed In Version: | dnsmasq-2.48-13.el6 | Doc Type: | Bug Fix | ||||||||||||||
| Doc Text: |
This bug was caused by the backported patch for Bug #882251.
In the end I used a different approach, so this bug does not
need to be documented. dnsmasq-2.48-10.el6.x86_64 was
never distributed to customers.
|
Story Points: | --- | ||||||||||||||
| Clone Of: | Environment: | ||||||||||||||||
| Last Closed: | 2013-02-21 10:44:59 UTC | Type: | Bug | ||||||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||||||
| Documentation: | --- | CRM: | |||||||||||||||
| Verified Versions: | Category: | --- | |||||||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||
| Embargoed: | |||||||||||||||||
| Bug Depends On: | |||||||||||||||||
| Bug Blocks: | 804141, 888457 | ||||||||||||||||
| Attachments: |
|
||||||||||||||||
|
Description
Huang Wenlong
2012-12-07 07:26:30 UTC
Could you attach the XML definitions for both the domain and the network used to trigger this bug?

Hi Jiri,
I used the simplest XML in this case. I will attach the guest and network XML.
Wenlong

Created attachment 660659 [details]
domain xml
Created attachment 660660 [details]
default network xml
The dnsmasq process quits when the domain is started via libvirt, but libvirt does not notice.
# ps -ef |grep dns
root 17366 8976 0 18:04 pts/0 00:00:00 grep dns
[root@intel-w3520-12-2 rpms]# virsh net-list
Name State Autostart Persistent
--------------------------------------------------
default active yes yes
Restarting libvirtd starts dnsmasq successfully:
[root@intel-w3520-12-2 rpms]# /etc/init.d/libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
[root@intel-w3520-12-2 rpms]# virsh net-list
Name State Autostart Persistent
--------------------------------------------------
default active yes yes
[root@intel-w3520-12-2 rpms]# ps -ef |grep dns
nobody 17499 1 0 18:05 ? 00:00:00 /usr/sbin/dnsmasq --strict-order --local=// --domain-needed --pid-file=/var/run/libvirt/network/default.pid --conf-file= --bind-dynamic --interface virbr0 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override --dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile --addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
root 17555 8976 0 18:05 pts/0 00:00:00 grep dns
Then start a guest, and the dnsmasq process quits.
# tail /var/log/messages
Dec 10 18:07:06 intel-w3520-12-2 dnsmasq[17499]: failed to bind listening socket for ::1: Address already in use
Dec 10 18:07:06 intel-w3520-12-2 dnsmasq[17499]: FAILED to start up

I think this is a bug in the --bind-dynamic implementation of dnsmasq-2.48-10.el6. The only difference between the dnsmasq command lines generated by libvirt-0.10.2-10.el6 and libvirt-0.10.2-11.el6 is that the former uses --bind-interfaces while the latter uses --bind-dynamic when used with dnsmasq-2.48-10.el6.

Created attachment 661611 [details]
strace of the failing dnsmasq
This strace collected by Peter Krempa shows that dnsmasq is attempting to bind to "::1" twice - succeeding the first time, but failing the 2nd time.
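The failure mode described by the strace can be reproduced in miniature. The following is only a sketch, not dnsmasq's code: it binds two sockets to the same address and port and reports the error from the second bind. The function name `second_bind_errno` is made up for illustration, and the use of UDP sockets and an ephemeral port (instead of port 53, which needs root) is an assumption, since the bug does not say which socket type the strace showed.

```python
import errno
import socket


def second_bind_errno(host="::1"):
    """Bind two UDP sockets to the same address and port.

    Returns the errno raised by the second bind, or 0 if it
    unexpectedly succeeds. Hypothetical helper for illustration only.
    """
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    first = socket.socket(family, socket.SOCK_DGRAM)
    second = socket.socket(family, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the kernel for a free port, so no root is needed.
        first.bind((host, 0))
        port = first.getsockname()[1]
        try:
            # Same address, same port, no SO_REUSEADDR: this should fail.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()
```

On a Linux host, comparing the return value with `errno.EADDRINUSE` corresponds to the "Address already in use" message that dnsmasq logged.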
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.

*** Bug 886641 has been marked as a duplicate of this bug. ***

dnsmasq --version already comes with a line that looks like:
Compile time options IPv6 GNU-getopt DBus no-I18N DHCP TFTP
Adding a new "option" to this line that appears only when the CVE is fixed will work for libvirt.

Unfortunately, the mere act of 'yum reinstall dnsmasq -y' for dnsmasq-2.48-11.el6.x86_64 runs a 'killall dnsmasq', which fries the dnsmasq instances being run by libvirtd. This is unacceptable behavior, as it kills network connectivity of guests that libvirt is managing. I'm moving this back to ASSIGNED to make sure we get that fixed (although it might be worth spawning into another BZ to have this one just track the CVE fix).

In the same vein, even though 'chkconfig --list dnsmasq' on my system shows:
dnsmasq 0:off 1:off 2:off 3:off 4:off 5:off 6:off
the act of upgrading dnsmasq started a global /usr/sbin/dnsmasq process with no command line arguments. A global dnsmasq should only be started if the service is enabled, and not merely because a newer dnsmasq was installed.

See bug 850944 for the issues mentioned in comments 37 and 38.

*** Bug 887928 has been marked as a duplicate of this bug. ***
*** Bug 886682 has been marked as a duplicate of this bug. ***
*** Bug 892448 has been marked as a duplicate of this bug. ***

Created attachment 674948 [details]
sosreport and var log messages
Created attachment 674949 [details]
RHEL 6.4 guest xml
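An earlier comment in this bug suggests that a consumer such as libvirt could recognise a fixed dnsmasq from the "Compile time options" line of `dnsmasq --version`. A minimal sketch of that kind of check is shown here. It is not libvirt's implementation; the helper names are invented for illustration, and the marker token in the usage note is hypothetical because the bug does not name the actual string.

```python
def compile_time_options(version_output):
    """Return the tokens on the 'Compile time options' line, or [] if absent."""
    marker = "Compile time options"
    for line in version_output.splitlines():
        line = line.strip()
        if line.startswith(marker):
            # Tolerate an optional colon after the marker text.
            rest = line[len(marker):].lstrip(":")
            return rest.split()
    return []


def has_option(version_output, token):
    """True if `token` appears as a whole word among the compile-time options."""
    return token in compile_time_options(version_output)


# Sample line quoted in this bug report.
SAMPLE = "Compile time options IPv6 GNU-getopt DBus no-I18N DHCP TFTP"
```

With the sample line, `has_option(SAMPLE, "DHCP")` is true, while a made-up marker such as `"HYPOTHETICAL-FIX-MARKER"` is reported as absent, which is the signal a caller would use to fall back to the older behaviour.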
------- Comment From onmahaja.com 2013-01-09 03:33 EDT-------
I also observed this. I can sometimes see these lines in /var/log/messages:
Jan 8 09:51:23 localhost dnsmasq[2615]: failed to bind listening socket for ::1: Address already in use
Jan 8 09:51:23 localhost dnsmasq[2615]: FAILED to start up
But this also happens with other addresses assigned to virbr0. For example:
Jan 7 12:58:18 oc2826874472 dnsmasq[3098]: failed to bind listening socket for 192.168.254.1: Address already in use
Jan 7 12:58:18 oc2826874472 dnsmasq[3098]: FAILED to start up
Jan 8 15:26:29 oc2826874472 dnsmasq[3193]: failed to bind listening socket for 192.168.122.1 : Address already in use
Jan 8 15:26:29 oc2826874472 dnsmasq[3193]: FAILED to start up
Consequently, libvirt fails to start the dnsmasq daemon, so guest DHCP queries go unanswered. There is clearly something wrong with the dnsmasq '--interface' option, which binds to the specified interface (in this case virbr0).
The dnsmasq daemon fails to start in src/network.c:create_bound_listeners().
Investigating the reasons ...
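The bind failures quoted in the comment above follow a fixed pattern, so affected hosts can be checked by scanning the syslog text for it. The sketch below is an illustration only: the regular expression is derived from the log lines in this report, and the function name `bind_failures` is an assumption, not part of any existing tool.

```python
import re

# Pattern derived from the log lines quoted in this bug report.
BIND_FAILURE = re.compile(
    r"dnsmasq\[(?P<pid>\d+)\]: failed to bind listening socket for "
    r"(?P<address>\S+)\s*: Address already in use"
)


def bind_failures(log_text):
    """Return (pid, address) pairs for each dnsmasq bind failure in log_text."""
    return [
        (int(match.group("pid")), match.group("address"))
        for match in BIND_FAILURE.finditer(log_text)
    ]
```

Feeding it the contents of /var/log/messages gives one pair per failed bind. The "FAILED to start up" lines are skipped because they carry no address.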
------- Comment From onmahaja.com 2013-01-09 03:58 EDT-------
As mentioned in comment #23, libvirt fails to start the dnsmasq daemon:
2013-01-07 01:58:34.930+0000: 15544: error : virCommandWait:2345 : internal error Child process (/usr/sbin/dnsmasq --strict-order --local=// --domain-needed --pid-file=/var/run/libvirt/network/default.pid --conf-file= --bind-dynamic --interface virbr0 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override --dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile --addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts) unexpected exit status 2: dnsmasq: failed to bind listening socket for ::1: Address already in use

Created attachment 675724 [details]
proposed patch
------- Comment on attachment From onmahaja.com 2013-01-09 16:44 EDT-------
Note the following:
--interface= : disables all interfaces except loopback
--interface=virbr0 : disables all interfaces except virbr0 and loopback
Hence,
# netstat -napt | grep :53
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 10417/dnsmasq
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN 10417/dnsmasq
Excerpt from /var/log/messages:
Jan 9 11:02:34 oc2826874472 dnsmasq[5924]: failed to bind listening socket for 192.168.122.1: Address already in use
Jan 9 11:02:34 oc2826874472 dnsmasq[5924]: FAILED to start up
The attached patch makes libvirt invoke dnsmasq with "--except-interface lo", i.e., with the loopback interface disabled.
After restarting libvirt with this patch, libvirt invokes dnsmasq as:
/usr/sbin/dnsmasq --strict-order --local=// --domain-needed --pid-file=/var/run/libvirt/network/default.pid --conf-file= --bind-dynamic --except-interface lo --interface virbr0 --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/default.leases --dhcp-lease-max=253 --dhcp-no-override --dhcp-hostsfile=/var/lib/libvirt/dnsmasq/default.hostsfile --addn-hosts=/var/lib/libvirt/dnsmasq/default.addnhosts
and
# netstat -napt | grep :53
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN 308/dnsmasq
and guests get DHCP-leased IPs.
Patch attached; please share your comments.
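To make the proposal in the comment above concrete, here is a rough sketch of how the interface-binding portion of the dnsmasq command line changes when "--except-interface lo" is added. It is not the attached patch and not libvirt's code. The function `dnsmasq_bind_args` and its parameters are hypothetical; only the dnsmasq flags themselves come from the command lines quoted in this bug.

```python
def dnsmasq_bind_args(bridge, bind_flag="--bind-dynamic", skip_loopback=False):
    """Build the interface-binding part of a dnsmasq command line.

    Hypothetical helper: `bridge` is the libvirt bridge (e.g. virbr0),
    `bind_flag` is one of the two binding modes mentioned in this bug,
    and `skip_loopback` adds the "--except-interface lo" pair.
    """
    if bind_flag not in ("--bind-dynamic", "--bind-interfaces"):
        raise ValueError("unsupported bind flag: " + bind_flag)
    args = [bind_flag]
    if skip_loopback:
        # Same position as in the patched command line quoted above.
        args += ["--except-interface", "lo"]
    args += ["--interface", bridge]
    return args
```

Calling `dnsmasq_bind_args("virbr0", skip_loopback=True)` yields the same flag order as the patched command line, and passing `"--bind-interfaces"` gives the form that the older libvirt build generated.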
I'm guessing you were redirected here from Bug 892448 (filed by IBM). Note that this BZ is in VERIFIED state, which means that it has already been fixed. You just need to update both dnsmasq and libvirt to at least the following versions:
libvirt-0.10.2-13.el6.x86_64
dnsmasq-2.48-12.el6.x86_64
This removes "--bind-dynamic" from dnsmasq (which libvirt automatically detects) and modifies libvirt to still allow networks using public addresses as long as dnsmasq was built to use SO_BINDTODEVICE (which is now indicated in dnsmasq's --version output). You will then not need any other patch.

------- Comment From nabharay.com 2013-01-18 14:00 EDT-------
Hi,
I upgraded to the latest RHEL 6.4 snap3 kernel and now the guests are able to get DHCP IPs.
[root@phx3 ~]# uname -a
Linux phx3.in.ibm.com 2.6.32-353.el6.x86_64 #1 SMP Mon Jan 7 15:35:17 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@phx3 ~]# rpm -qa|grep dnsmasq
dnsmasq-2.48-13.el6.x86_64
[root@phx3 ~]# rpm -qa|grep libvirt
libvirt-0.10.2-15.el6.x86_64
This issue can be closed now.
Thanks,
Nabhajit Ray

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2013-0277.html
|