During the OpenStack test day, where everyone was testing F17 in a guest, a few people had networking issues due to nested libvirt bringing up the default virtual network when it should not have autostarted. This was supposed to have been fixed previously:

https://bugzilla.redhat.com/show_bug.cgi?id=235961
http://libvirt.org/git/?p=libvirt.git;a=commit;h=a83fe2c23efad190a1e00e448f607fe032650fd6

My guess is that the switch to systemd means we are more likely to race against NetworkManager. Basically, if NM hasn't brought up the VM host network before libvirt starts, there won't be any routing info in the host route table, so libvirt won't know _not_ to start the default network. That's my guess, anyway. Maybe there is a simple unit file fix, but I don't know it offhand.
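The heuristic being raced here can be sketched in shell (illustrative only; libvirt does the equivalent check in C against the kernel routing table, and the exact matching logic below is an assumption):

```shell
# Sketch of the check: does the default network's subnet already appear
# in the host routing table? If NetworkManager hasn't configured any
# interface yet, the table is empty, so no clash is ever detected and
# the default network gets autostarted anyway.
clashes_with_host() {
    subnet=$1 routes=$2
    # crude prefix match on the network address (e.g. "192.168.122.0")
    printf '%s\n' "$routes" | grep -q "^${subnet%/*}"
}

routes_now=$(ip route show 2>/dev/null)
if clashes_with_host 192.168.122.0/24 "$routes_now"; then
    echo "192.168.122.0/24 already routed on the host; skip the default network"
fi
```

If the race is lost, `routes_now` is empty, the clash check fails, and virbr0 comes up inside the guest on the same subnet as the host's virbr0.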
> Maybe there is a simple unit file fix

Maybe this could help:

--- a/daemon/libvirtd.service.in
+++ b/daemon/libvirtd.service.in
@@ -9,6 +9,7 @@ After=syslog.target
 After=udev.target
 After=avahi.target
 After=dbus.target
+After=network.target
 Before=libvirt-guests.service
 
 [Service]
*** Bug 807879 has been marked as a duplicate of this bug. ***
Transferring Beta blocker nomination. Hits "The installed system must be able to download and install updates with yum and the default graphical package manager in all release-blocking desktops" combined with "The release must be able to host virtual guest instances of the same release, using Fedora's current preferred virtualization technology", basically.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
+1 blocker.
libvirt-0.9.10-2.fc17.1 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/libvirt-0.9.10-2.fc17.1
So the fix for this more or less works, *except* that we have https://bugzilla.redhat.com/show_bug.cgi?id=806466, so no network connection is actually being brought up by default. Until ONBOOT=no is fixed, libvirt will continue to bring up virbr0 at boot time, because no adapter gets brought up, so there's no default route.
Actually, the fix doesn't seem bulletproof. I built a live image with it, booted it twice, and got a virbr0 both times. I think there are some subtleties to what network.service *means*, exactly; After=network.service may not really be sufficient. We might want to wait until DNS resolution is up, or something.
+1 blocker

That makes for +3 explicit blocker votes, +4 if you include Adam's implicit vote. Moving to accepted.
> I think there are some subtleties to what network.service *means*, exactly.
> After=network.service may not really be sufficient. We might want to wait until
> DNS resolution is up, or something.

Yes, I doubt 'After=network.target' would be sufficient, since IIUC that just means that networking has been *started*; it doesn't mean any of the interfaces are actually online yet. Particularly if DHCP is involved, there may be quite a long delay.
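For reference, the distinction between "networking started" and "interfaces actually online" is exactly what systemd later standardized as network-online.target (not available in F17's systemd, so treat this as a forward-looking sketch); it only does anything if a waiter such as NetworkManager-wait-online.service is enabled. A drop-in would look like:

```ini
# /etc/systemd/system/libvirtd.service.d/wait-online.conf
# Sketch only: orders libvirtd after interfaces are configured, provided
# NetworkManager-wait-online.service (or an equivalent waiter) is enabled.
[Unit]
Wants=network-online.target
After=network-online.target
```

The trade-off discussed below applies: this delays libvirtd on hosts with slow DHCP and blocks for the full timeout on hosts that never come online.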
I don't think there is any race-free way to ensure strict ordering of network startup completion vs. libvirt virtual network startup. Indeed, I think it would be a bad idea if we tried to do that, because if I boot my laptop without any ethernet cable, and wifi turned off / no wifi hotspots around, I certainly still expect my libvirt virtual networks to be started.

Perhaps an alternative idea is that, instead of trying to solve the general network clash problem (which is NP-complete), we just target the specific case of running a VM inside a VM. If we detect that we have been installed in a virtual environment, then change the default network to use '192.168.123.0/24' instead of '192.168.122.0/24'.

On the other hand, this won't work for people building appliance images, since at the time the appliance is being built, we won't necessarily be in a VM.
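The "different subnet when nested" idea can be sketched with systemd-detect-virt, which prints the hypervisor type (or "none" on bare metal); the helper function and the choice of fallback subnet are illustrative, not anything libvirt implements:

```shell
# Pick the default network subnet based on whether we are ourselves a guest.
# Assumption: any detected virtualization means the host side may already be
# using 192.168.122.0/24, so move to the next range.
pick_default_subnet() {
    case "$1" in
        ""|none) echo 192.168.122.0/24 ;;   # bare metal: keep the usual range
        *)       echo 192.168.123.0/24 ;;   # nested: avoid clashing with the host
    esac
}

virt=$(systemd-detect-virt 2>/dev/null)
pick_default_subnet "$virt"
```

As noted above, this breaks down for appliance builds, where the build environment's virtualization status says nothing about where the image will eventually run.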
A further option is to refactor the libvirt RPMs again.

Currently we have:

* libvirt - contains libvirtd, default config files & the rest
* libvirt-client - contains libvirt.so & virsh
* libvirt-devel - headers
* libvirt-python - python binding

Boxes pulls in libvirt, because it wants libvirtd. Since it uses qemu:///session, Boxes does not actually need/want any of the default configs.

We could refactor things thus:

* libvirt - empty package with no files, which just requires libvirt-configs & libvirt-daemon
* libvirt-configs - default config files
* libvirt-daemon - contains libvirtd & the rest
* libvirt-client - contains libvirt.so & virsh
* libvirt-devel - headers
* libvirt-python - python binding

Boxes would then dep on libvirt-daemon only, thus avoiding pulling the network config into the livecd spins.
(In reply to comment #12)
> A further option is to refactor the libvirt RPMs again.
>
> Currently we have:
>
> * libvirt - contains libvirtd, default config files & the rest
> * libvirt-client - contains libvirt.so & virsh
> * libvirt-devel - headers
> * libvirt-python - python binding
>
> Boxes pulls in libvirt, because it wants libvirtd. Since it uses
> qemu:///session, Boxes does not actually need/want any of the default configs.
>
> We could refactor things thus:
>
> * libvirt - empty package with no files, which just requires libvirt-configs &
> libvirt-daemon
> * libvirt-configs - default config files
> * libvirt-daemon - contains libvirtd & the rest
> * libvirt-client - contains libvirt.so & virsh
> * libvirt-devel - headers
> * libvirt-python - python binding
>
> Boxes would then dep on libvirt-daemon only, thus avoiding pulling the network
> config into the livecd spins.

I think that's an excellent idea.
Package libvirt-0.9.10-2.fc17.1:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.

Update it with:

# su -c 'yum update --enablerepo=updates-testing libvirt-0.9.10-2.fc17.1'

as soon as you are able to. Please go to the following url:

https://admin.fedoraproject.org/updates/FEDORA-2012-4870/libvirt-0.9.10-2.fc17.1

then log in and leave karma (feedback).
Okay, so we should really try to get a usable fix for this in and a new RC done tomorrow (Friday). The fallback position is simply to nuke gnome-boxes from comps; that does a neat end-run around the problem. But it'd be ideal to fix it properly in libvirt, of course. CCing some systemd folks.

systemd folks: we'd like to be able to reliably bring the libvirtd service up after the system network connection is actually up, to the point where routes are in place (as it's trying to check the route table to know whether to bring up virbr0 or not). Is this possible?

libvirt folks: is there any other way to go about fixing this? systemd does have the fairly neat capability of knowing when it's running in a VM. You can use the ConditionVirtualization=!kvm parameter in a systemd unit to make it fire only if the system it's running on is *not* a KVM guest. Just as a seed of an idea, you could split out the activation of virbr0 as a separate service with this condition, so it wouldn't happen when running in a KVM guest. I don't know if this is sophisticated enough, though - I guess there may be times when you actually want a virbr0 in a KVM guest. Just planting the idea; there may be other approaches. I suppose you could have a service which uses a different subnet for virbr0 if booted in a KVM guest?

Either way, we need to figure out an approach by tomorrow, I think. So worst case, if you (developers) don't think it's likely we'll work out a good approach here soon, please let us know so we can just take gnome-boxes out of comps for the Beta.
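To make the split-out-service idea concrete, such a unit might look like the sketch below. The unit name libvirt-default-network.service is hypothetical, not something libvirt ships; only the ConditionVirtualization mechanism itself is real systemd syntax:

```ini
# /etc/systemd/system/libvirt-default-network.service (hypothetical)
# Sketch: activate the default NAT network only when NOT running as a
# KVM guest, so nested F17 guests never bring up a clashing virbr0.
[Unit]
Description=Activate libvirt default virtual network (virbr0)
ConditionVirtualization=!kvm
Requires=libvirtd.service
After=libvirtd.service

[Install]
WantedBy=multi-user.target
```

As noted, the condition is a blunt instrument: it also suppresses virbr0 for people who legitimately want NAT networking inside a KVM guest.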
Whoops, forgot to take into account Daniel's two ideas above, sorry. Those both sound like promising approaches (and one of my ideas is similar to the first).
I am working on changing the RPM layout in libvirt & will fix the Boxes dep
New RPM proposal upstream https://www.redhat.com/archives/libvir-list/2012-March/msg01332.html
Not sure if it's helpful, but another possible tool might be 'nm-online'? That would let you see whether NetworkManager is running but just hasn't gone online yet. It would time out in the case where NM was not set to bring up the network on boot, though, so it's likely not a total solution.
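A sketch of how nm-online could gate libvirtd startup (the wrapper function and the NET_CHECK override are illustrative; nm-online's -q and -t flags are real). Per the caveat above, it deliberately falls through and starts the service anyway on timeout, so an offline laptop still gets its virtual networks:

```shell
# Run a command only after NetworkManager reports connectivity, or after
# the timeout expires. NET_CHECK exists purely so the logic can be
# exercised without a running NetworkManager.
start_after_network() {
    net_check=${NET_CHECK:-"nm-online -q -t 30"}
    if $net_check; then
        "$@"                                        # network is up: proceed
    else
        echo "network not up in time; starting anyway" >&2
        "$@"                                        # offline hosts still proceed
    fi
}

# usage (assumption: run as root):
#   start_after_network systemctl start libvirtd.service
```

The unsolved part is the one noted in this comment: on ONBOOT=no systems, every boot eats the full 30-second timeout.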
Ultimately it would be desirable if libvirt were able to actually use NetworkManager APIs for creating its NAT based network, at which point NM itself can trivially detect the IP range clashes. That ability is some way off in the future though, unfortunately
Updated RPM proposal after testing & further feedback https://www.redhat.com/archives/libvir-list/2012-March/msg01352.html
Discussed at the 2012-03-30 blocker review meeting. The RPM fix is still under review, and kalev says gnome-boxes was only added to comps to see what the effect would be on the nightly image size anyway. We've been testing without gnome-boxes in the default package set all along. So let's just drop it out of comps again, and fix up libvirt for final.

I've now dropped gnome-boxes from comps again, so we can consider this 'addressed' for blocker purposes, and drop it as a Beta blocker after RC3 is spun, assuming the change takes effect as intended.
libvirt-glib-0.0.7-1.fc17, gnome-boxes-3.4.0.1-2.fc17, and libvirt-0.9.10-3.fc17 have been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/FEDORA-2012-4939/gnome-boxes-3.4.0.1-2.fc17,libvirt-0.9.10-3.fc17,libvirt-glib-0.0.7-1.fc17
Package libvirt-glib-0.0.7-1.fc17, gnome-boxes-3.4.0.1-2.fc17, libvirt-0.9.10-3.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.

Update it with:

# su -c 'yum update --enablerepo=updates-testing libvirt-glib-0.0.7-1.fc17 gnome-boxes-3.4.0.1-2.fc17 libvirt-0.9.10-3.fc17'

as soon as you are able to. Please go to the following url:

https://admin.fedoraproject.org/updates/FEDORA-2012-4939/gnome-boxes-3.4.0.1-2.fc17,libvirt-0.9.10-3.fc17,libvirt-glib-0.0.7-1.fc17

then log in and leave karma (feedback).
Discussed at the 2012-04-04 go/no-go meeting. Agreed this is no longer a blocker, as gnome-boxes was dropped out of comps again in time for the Beta RC3 compose, so the issue does not arise in a default live boot/install or default desktop DVD install.
I've just updated to libvirt-0.9.11-1.fc17.x86_64 in my F17 desktop, restarted libvirtd (of course), and now, no guest has networking.
(In reply to comment #26)
> I've just updated to libvirt-0.9.11-1.fc17.x86_64 in my F17 desktop, restarted
> libvirtd (of course), and now, no guest has networking.

That sounds like a separate problem from the one detailed in this bug (which only struck guests that themselves had the libvirt package installed).

Can you expand on the symptoms?

1) I assume you were previously running 0.9.10?

2) What is the output of the following:

# brctl show
# iptables -S; iptables -t nat -S
# virsh net-list --all
# virsh net-dumpxml $netname (for each network used by a guest)
# virsh dumpxml $guestname

3) Have the guests been restarted since the libvirt upgrade? (It shouldn't be necessary; I'm just curious if that solves the problem.)

Find me on irc if you want to troubleshoot this in real time.
(In reply to comment #27)
> That sounds like a separate problem from the one detailed in this bug (which
> only struck guests that themselves had the libvirt package installed)

Hi Laine,

Thanks for the follow-up.

> Can you expand on the symptoms?

Sure (remember, this is F17). No AVCs, guest starts, but fails to start its network. A manual "dhclient eth0" in the guest fails to find an offer.

> 1) I assume you were previously running 0.9.10?

Yep. 0.9.10-3

> 2) what is the output of the following:
>
> # brctl show

bridge name     bridge id               STP enabled     interfaces
virbr0          8000.5254008b9d18       yes             virbr0-nic
                                                        vnet0
                                                        vnet1

> # iptables -S; iptables -t nat -S

-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N FORWARD_ZONES
-N FORWARD_direct
-N FWDO_ZONE_external
-N FWDO_ZONE_external_allow
-N FWDO_ZONE_external_deny
-N INPUT_ZONES
-N INPUT_direct
-N IN_ZONE_dmz
-N IN_ZONE_dmz_allow
-N IN_ZONE_dmz_deny
-N IN_ZONE_external
-N IN_ZONE_external_allow
-N IN_ZONE_external_deny
-N IN_ZONE_home
-N IN_ZONE_home_allow
-N IN_ZONE_home_deny
-N IN_ZONE_internal
-N IN_ZONE_internal_allow
-N IN_ZONE_internal_deny
-N IN_ZONE_public
-N IN_ZONE_public_allow
-N IN_ZONE_public_deny
-N IN_ZONE_work
-N IN_ZONE_work_allow
-N IN_ZONE_work_deny
-N OUTPUT_direct
-A INPUT -m conntrack --ctstate INVALID -j REJECT --reject-with icmp-host-prohibited
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -j INPUT_direct
-A INPUT -j INPUT_ZONES
-A INPUT -p icmp -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -m conntrack --ctstate INVALID -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i lo -j ACCEPT
-A FORWARD -j FORWARD_direct
-A FORWARD -j FORWARD_ZONES
-A FORWARD -p icmp -j ACCEPT
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
-A OUTPUT -j OUTPUT_direct
-A FWDO_ZONE_external -j FWDO_ZONE_external_deny
-A FWDO_ZONE_external -j FWDO_ZONE_external_allow
-A FWDO_ZONE_external_allow -j ACCEPT
-A IN_ZONE_dmz -j IN_ZONE_dmz_deny
-A IN_ZONE_dmz -j IN_ZONE_dmz_allow
-A IN_ZONE_dmz_allow -p tcp -m tcp --dport 22 -j ACCEPT
-A IN_ZONE_external -j IN_ZONE_external_deny
-A IN_ZONE_external -j IN_ZONE_external_allow
-A IN_ZONE_external_allow -p tcp -m tcp --dport 22 -j ACCEPT
-A IN_ZONE_home -j IN_ZONE_home_deny
-A IN_ZONE_home -j IN_ZONE_home_allow
-A IN_ZONE_home_allow -p tcp -m tcp --dport 22 -j ACCEPT
-A IN_ZONE_home_allow -p udp -m udp --dport 631 -j ACCEPT
-A IN_ZONE_home_allow -d 224.0.0.251/32 -p udp -m udp --dport 5353 -j ACCEPT
-A IN_ZONE_home_allow -p udp -m udp --dport 137 -j ACCEPT
-A IN_ZONE_home_allow -p udp -m udp --dport 138 -j ACCEPT
-A IN_ZONE_internal -j IN_ZONE_internal_deny
-A IN_ZONE_internal -j IN_ZONE_internal_allow
-A IN_ZONE_internal_allow -p tcp -m tcp --dport 22 -j ACCEPT
-A IN_ZONE_internal_allow -p udp -m udp --dport 631 -j ACCEPT
-A IN_ZONE_internal_allow -d 224.0.0.251/32 -p udp -m udp --dport 5353 -j ACCEPT
-A IN_ZONE_internal_allow -p udp -m udp --dport 137 -j ACCEPT
-A IN_ZONE_internal_allow -p udp -m udp --dport 138 -j ACCEPT
-A IN_ZONE_public -j IN_ZONE_public_deny
-A IN_ZONE_public -j IN_ZONE_public_allow
-A IN_ZONE_public_allow -p tcp -m tcp --dport 22 -j ACCEPT
-A IN_ZONE_work -j IN_ZONE_work_deny
-A IN_ZONE_work -j IN_ZONE_work_allow
-A IN_ZONE_work_allow -p tcp -m tcp --dport 22 -j ACCEPT
-A IN_ZONE_work_allow -p udp -m udp --dport 631 -j ACCEPT
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N OUTPUT_direct
-N POSTROUTING_ZONES
-N POSTROUTING_direct
-N POST_ZONE_external
-N POST_ZONE_external_allow
-N POST_ZONE_external_deny
-N PREROUTING_ZONES
-N PREROUTING_direct
-A PREROUTING -j PREROUTING_direct
-A PREROUTING -j PREROUTING_ZONES
-A OUTPUT -j OUTPUT_direct
-A POSTROUTING -j POSTROUTING_direct
-A POSTROUTING -j POSTROUTING_ZONES
-A POST_ZONE_external -j POST_ZONE_external_deny
-A POST_ZONE_external -j POST_ZONE_external_allow
-A POST_ZONE_external_allow -j MASQUERADE

> # virsh net-list --all

Name                 State      Autostart
-----------------------------------------
default              active     yes

> # virsh net-dumpxml $netname (for each network used by a guest)

<network>
  <name>default</name>
  <uuid>b6503b9d-a544-4ac0-b455-ef42e64e93d3</uuid>
  <forward mode='nat'/>
  <bridge name='virbr0' stp='on' delay='0' />
  <mac address='52:54:00:8B:9D:18'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254' />
    </dhcp>
  </ip>
</network>

> # virsh dumpxml $guestname

<domain type='kvm' id='7'>
  <name>ri</name>
  <uuid>beca45eb-f935-74ba-0b7e-4762bbee2c55</uuid>
  <memory unit='KiB'>14540800</memory>
  <currentMemory unit='KiB'>14540800</currentMemory>
  <vcpu>2</vcpu>
  <os>
    <type arch='x86_64' machine='pc-0.15'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/h/vm/ri.img'/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='block' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:48:cc:aa'/>
      <source network='default'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/41'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/41'>
      <source path='/dev/pts/41'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5900' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c65,c559</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c65,c559</imagelabel>
  </seclabel>
</domain>

> 3) have the guests been restarted since the libvirt upgrade? (it shouldn't be
> necessary, I'm just curious if that solves the problem).

Yes, numerous times.
Okay, I see a combination of two problems in Jim's report, both unrelated to the problem addressed in this bug:

1) When firewalld is started, it indiscriminately clears out all iptables rules, including those that libvirtd has set up for its virtual networks (the presence of "IN_ZONE_*" etc. in the iptables output shows that firewalld is running). firewalld is so far not compatible with libvirt for this reason (although a firewalld developer is working on it).

2) Normally the way to get libvirtd to reload its iptables rules (which would temporarily remedy the situation) is to restart libvirtd. However, in my tests on F17 just now, "systemctl restart libvirtd.service" ends up killing all qemu-kvm processes AND all dnsmasq processes that had been started by libvirtd; the result is that the guests that had been running are unceremoniously whacked, and dhcp will not work when the guests are restarted, so they won't get an IP address.

I recall seeing mention of problem 2 wrt cgroups "somewhere" recently, but can't find the reference now. As I said above, though, this is all unrelated to the problem given in the summary of this bug.
Bug 805942 reports the problem of sub-processes being killed when libvirtd is restarted.
(In reply to comment #29)
> Okay, I see a combination of two problems in Jim's report, both unrelated to
> the problem addressed in this bug:
>
> 1) when firewalld is started, it indiscriminately clears out all iptables
...
>
> 2) Normally the way to get libvirtd to reload its iptables rules (which would
> temporarily remedy the situation) is to restart libvirtd. However, in my tests
> on F17 just now, "systemctl restart libvirtd.service" ends up killing all
> qemu-kvm processes AND all dnsmasq processes that had been started by libvirtd;
> the result is the guests that had been running are unceremoniously whacked, and
> dhcp will not work when the guests are restarted, so they won't get an IP
> address.
...

Nice analysis. I confirm that disabling firewalld and rebooting this F17 host makes it so it can now start all of my guests as usual. Thanks, Laine!
I'm having the issue with firewalld too. Is there a bug report for it yet?
libvirt-glib-0.0.7-1.fc17, gnome-boxes-3.4.0.1-2.fc17, and libvirt-0.9.11-1.fc17 have been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.
libvirt-0.9.11.3-1.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/libvirt-0.9.11.3-1.fc17
libvirt-0.9.11.3-1.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.