Bug 813853
Summary: | libvirt network fails rarely - maybe dnsmasq problem | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Steven Dake <sdake> |
Component: | libvirt | Assignee: | Libvirt Maintainers <libvirt-maint> |
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Priority: | unspecified |
Version: | 16 | CC: | berrange, calfonso, clalancette, crobinso, dougsland, itamar, jforbes, jyang, laine, libvirt-maint, veillard, virt-maint |
Hardware: | Unspecified | OS: | Unspecified |
Doc Type: | Bug Fix | Type: | Bug |
Last Closed: | 2012-06-07 21:06:18 UTC | | |
Description
Steven Dake
2012-04-18 15:19:22 UTC
s/l/tracing dnsmasq when it gets into that situation may help us understand where the problem comes from. When you hit the issue, don't kill the process immediately; instead run strace -o /tmp/dnsmasq.log -p `pidof dnsmasq` and try to boot a VM. After it has failed, kill the process and attach the log. Thanks!
Daniel

DV,
Thanks. We will give that a go. We will also try booting a different image rather than an oz-based one when it locks up, just to verify it isn't some weird oz output wedging libvirt (if it is, we could provide the output, which may be helpful). Afterwards we will kill -HUP to see if that restarts the network. This problem doesn't happen all that often, unfortunately.
Regards
-steve

Created attachment 585828 [details]
This is the strace while booting the guest vm.
Created attachment 585829 [details]
This is a screenshot with virt-viewer showing the guest config and the host network interfaces
I had originally created an openstack nova network using virbr0 as the bridge. After removing that network and creating a new nova network using a different, arbitrary name, demonetbr0, the network on the guest comes up without any problems.

chris, yeah, that virbr0 name was likely clashing with libvirt's default network. Steven, is killing dnsmasq manually a requirement? Or does virsh net-destroy on its own work? Any chance something could be mucking with firewall rules on the host? This can wipe out the rules that libvirt needs for NAT.

net-destroy gets the job done, if I recall. We are using openstack on the system, and it makes all kinds of iptables changes.

Long-known issue which won't be fixed until we have firewalld by default, which libvirt and all other iptables users talk to. Which is like the F18 time frame. So this is WONTFIX for F16.

Cole, it is unclear how a conclusion can be made that changing the firewall will break dnsmasq without clear evidence.

libvirt adds iptables rules to (among other things) allow incoming DHCP from the virt guests to the host. If somebody else messes with the iptables rules and happens to add another rule above this particular rule, dhcp requests from the guest will no longer make it to the dnsmasq running on the host. This is just one example of many problems that can occur due to the fact that there is no central controlling authority for iptables rules, and no concept of priority so that the ordering of the rules can remain consistent regardless of the ordering of their insertion. To verify if this is the source of the problem, during a time when the system is "wedged", just run "iptables -S" and see if there is a REJECT or DROP rule that would match the dhcp packets and that occurs above the rule to allow them. Also, when the networking is in its wedged state, try restarting libvirtd to see if that un-wedges it: restarting libvirtd will reload libvirt's iptables rules and re-enable ip_forward without making any other changes to the network plumbing.
Steven, sorry, I wasn't trying to be rash; it's just that 95% of all networking issues filed against libvirt over the years have been some incarnation of this root issue. If you find evidence to the contrary, as Laine requested in Comment #10, please reopen this bug and we can go from there. But until then, keeping this open isn't helpful IMO.

BTW, just a couple of days ago I made a change to the system firewall with the firewall applet, hit "Apply", and found that guests could no longer acquire a DHCP lease. When I looked at the iptables output, I found that, as we've discussed above, the rule to allow dhcp packets on the INPUT chain had been removed (along with most/everything else added by libvirt). Restarting libvirtd was enough to reload libvirt's iptables rules and get dnsmasq working properly again. So, this isn't conclusive, but I did experience the exact same symptoms and the cause was just as Cole surmised.
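
The check described in the comments above (look at "iptables -S" for a REJECT or DROP rule sitting above the rule that allows guest DHCP, or for that allow rule being gone entirely) can be sketched in Python. This is only an illustration under stated assumptions, not anything libvirt ships: the function names, the sample rule text, and the assumption that the allow rule looks like `-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT` are all hypothetical, and the matcher deliberately skips any rule using options it does not understand.

```python
import shlex

BLOCKING = {"REJECT", "DROP"}
# Options this sketch understands; rules using anything else are skipped.
KNOWN = {"-i", "-p", "-m", "--dport", "-j", "--reject-with"}


def parse_rule(line):
    """Parse one '-A INPUT ...' line into {option: value}; None for other lines."""
    tokens = shlex.split(line)
    if tokens[:2] != ["-A", "INPUT"]:
        return None
    rest = tokens[2:]
    if len(rest) % 2:  # not a plain option/value sequence (e.g. uses '!')
        return {"unparsed": line}
    return dict(zip(rest[0::2], rest[1::2]))


def matches_dhcp(rule, bridge):
    """Heuristic: could this rule match a guest DHCP request (udp/67) on bridge?"""
    if set(rule) - KNOWN:
        return False  # unknown match options: do not guess
    return (rule.get("-i", bridge) == bridge
            and rule.get("-p", "udp") == "udp"
            and rule.get("-m", "udp") == "udp"
            and rule.get("--dport", "67") == "67")


def diagnose_dhcp(rules_text, bridge="virbr0"):
    """Return (status, blocking_rule) for the INPUT chain in 'iptables -S' text.

    status is 'ok' (allow rule reached first), 'shadowed' (a REJECT/DROP
    rule precedes the allow rule) or 'missing' (no allow rule found).
    """
    blocker = None
    for line in rules_text.splitlines():
        rule = parse_rule(line)
        if rule is None or not matches_dhcp(rule, bridge):
            continue
        target = rule.get("-j")
        if target == "ACCEPT":
            return ("ok", None) if blocker is None else ("shadowed", blocker)
        if target in BLOCKING and blocker is None:
            blocker = line.strip()
    return ("missing", blocker)


# Hypothetical sample outputs, written by hand for illustration only.
HEALTHY = """\
-P INPUT ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
"""

SHADOWED = """\
-P INPUT ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
"""

WIPED = """\
-P INPUT ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
"""
```

On a wedged host, the same function could be given real output captured as root, for example via `subprocess.run(["iptables", "-S"], capture_output=True, text=True).stdout`. A "shadowed" or "missing" result would be consistent with the firewall explanation given in this bug, and in either case restarting libvirtd is the remedy reported above.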