Bug 850606
| Summary: | libvirtd should check for properly running dnsmasq on networks presumed "active" at startup (and start one if necessary) | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Scott Baker <scott> |
| Component: | libvirt | Assignee: | Laine Stump <laine> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 18 | CC: | berrange, clalancette, crobinso, itamar, jforbes, jyang, laine, libvirt-maint, veillard, virt-maint |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-10-01 21:37:36 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Scott Baker
2012-08-21 23:31:03 UTC
I'm using "network" instead of "NetworkManager" if that's somehow related. Running "virsh net-destroy default; virsh net-start default" causes dnsmasq to restart. Working with laine on IRC, we found that if dnsmasq crashes or is killed, and virbr0 is still present, restarting libvirtd will *NOT* restart dnsmasq. dnsmasq is only started by libvirtd if it thinks the network is not already up, which it determines by seeing if the virbr0 device is present. Part of libvirtd turning up the networks should probably confirm that dnsmasq is running, and if not, start it.

I agree we should be checking for dnsmasq and restarting it if needed (and probably giving it a SIGHUP even if it's there, just for good measure).

An aside: In our discussion on IRC, you figured out that you had run "/etc/init.d/dnsmasq restart" and that had killed dnsmasq. But that script doesn't exist on F16, because it has switched to using systemd (and when I run "service dnsmasq restart" or "systemctl restart dnsmasq.service", it fails and doesn't kill all of libvirtd's dnsmasq instances). Was that original behavior only seen on CentOS, and you just verified the result on F16 by manually killing the dnsmasq processes? Or is there some other weird circumstance that causes dnsmasq processes to be killed?

It was originally seen on CentOS. I tried it on F16, but by forcibly killing dnsmasq first, wanting to see if it would restart.

Okay, so there isn't a separate "dnsmasq is mysteriously dying" bug on F16. That's good to know :-) I've changed the summary of this BZ to more accurately reflect what's needed from libvirt. Thanks for the report and extra investigation!

Upstream libvirt has been enhanced to restart radvd/dnsmasq when needed when libvirtd is restarted. It will also send a SIGHUP to all dnsmasq and radvd processes when libvirtd is restarted. The following two commits are required for this new behavior.

I'm not sure how easily they will backport to the libvirt that's in F16 (which this BZ is filed against) or F17, but they will be in 0.10.2, which means they will automatically be in F18. If the backport isn't trivial, we may want to consider marking this as CLOSED/NEXTRELEASE or CLOSED/UPSTREAM instead.

commit 4cf974b67427e33e3ce38df4787cddd6e2822d67
Author: Laine Stump <laine>
Date: Sun Sep 16 21:22:27 2012 -0400

network: restart radvd/dnsmasq if needed when libvirtd is restarted

A user on IRC had accidentally killed all of his libvirt-started dnsmasq instances (due to a buggy dnsmasq service script in Fedora 16), and had hoped that libvirtd would notice this on restart and reload all the dnsmasq daemons (as it does with iptables rules). Unfortunately this was not the case - as long as the network object had a pid registered for dnsmasq and/or radvd, it assumed that the processes were running.

This patch takes advantage of the new utility functions in bridge_driver.c to do a "refresh" of all radvd and dnsmasq processes started by libvirt each time libvirtd is restarted - this function attempts to do a SIGHUP of each existing process, and if that fails, it restarts the process, rebuilding all the associated config files and commandline parameters in the process.

This normally has no effect, but will be useful in solving the occasional "odd situation" without needing to take the drastic step of destroying/re-starting the network.

commit 1ce4922e720e125421b3f8061d0eb6fdd152c41a
Author: Laine Stump <laine>
Date: Mon Aug 20 00:59:46 2012 -0400

network: reorganize dnsmasq and radvd config file / startup

This patch splits the starting of dnsmasq and radvd into multiple files, and adds new networkRefreshXX() and networkRestartXX() functions for each. These new functions are currently commented out because they won't be used until the next commit, and the compile options require all static functions to be used.

networkRefreshXX() - rewrites any file-based config for dnsmasq/radvd, and sends SIGHUP to the process to make it reread its config. If the program isn't already running, it's just started.

networkRestartXX() - kills the given program, waits for it to exit (see the comments in the function networkKillDaemon()), then calls networkStartXX().

This commit is here mostly as a checkpoint to verify no change in functional behavior after refactoring networkStartXX() functions to fit in with these new functions.

Amazingly these patches apply cleanly to F16 maint. However, given the size of the changes, the (hopefully) rare occurrence of the issue, and the fact that there's a workaround (destroy, start), I don't plan on backporting these to the maintenance branches. Moving to F18.

Aaaaand libvirt 0.10.2 is already in F18, so just closing as CURRENTRELEASE.
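
For anyone reading this later, the "refresh" behavior described in the commit messages (try a SIGHUP on the recorded pid, and start the daemon if that fails) can be sketched roughly as below. This is only an illustration in Python, not the libvirt code, which is C in bridge_driver.c. The names refresh_daemon, start, and kill are made up for this sketch, and the rewriting of config files before the SIGHUP is left out.

```python
import os
import signal
from typing import Callable, Optional


def refresh_daemon(
    pid: Optional[int],
    start: Callable[[], int],
    kill: Callable[[int, int], None] = os.kill,
) -> int:
    """Return the pid of a running daemon, starting one if needed.

    pid   -- the pid recorded for the daemon, or None if none is recorded
    start -- hypothetical callback that launches the daemon and returns its pid
    kill  -- signal sender, injectable for testing (defaults to os.kill)
    """
    if pid is not None:
        try:
            # A live daemon rereads its config on SIGHUP; nothing else to do.
            kill(pid, signal.SIGHUP)
            return pid
        except ProcessLookupError:
            # The recorded pid no longer exists: fall through and start anew.
            pass
    return start()
```

Note that a recorded pid alone proves nothing here; the signal is what confirms the process is still there. A PermissionError from os.kill is deliberately not caught in this sketch, since it would mean the pid now belongs to some other user's process and silently starting a second daemon may not be the right response.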