811967 – libvirt in a VM often brings up 'default' network when it shouldn't, kills vm networking

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	libvirt
Sub Component:
Version:	21
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Laine Stump
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Duplicates (2):	1134544 1136765
Depends On:	802475
Blocks:	F21AlphaFreezeException F21BetaBlocker

Reported:	2012-04-12 12:17 UTC by Cole Robinson
Modified:	2016-04-26 15:21 UTC
CC List:	38 users
Fixed In Version:	libvirt-1.2.8-4.fc21
Clone Of:	802475
Environment:
Last Closed:	2014-09-23 04:20:47 UTC
Type:	---
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1146284	0	unspecified	CLOSED	libvirt in a VM often brings up 'default' network when it shouldn't, kills vm networking (upstream)	2021-02-22 00:41:40 UTC

Internal Links: 1146232 1146284

Description Cole Robinson 2012-04-12 12:17:47 UTC

+++ This bug was initially created as a clone of Bug #802475 +++

During the OpenStack test day, where everyone was testing F17 in a guest, a few people had networking issues due to nested libvirt bringing up the default virtual network when it should not have autostarted. This was supposed to have been fixed previously:

https://bugzilla.redhat.com/show_bug.cgi?id=235961
http://libvirt.org/git/?p=libvirt.git;a=commit;h=a83fe2c23efad190a1e00e448f607fe032650fd6

We changed packaging so boxes doesn't need to pull in the default network which limits the opportunity for this to hit. But the root cause remains: if the default network is installed and libvirt starts before NetworkManager connects, guest networking will be hosed.

Comment 1 Cole Robinson 2012-04-18 15:20:57 UTC

A couple possible solutions:

1) Use virt-what at rpm install time: if we are running in a VM, change the default network to 192.168.123.1. Doesn't help existing users. Problems with doubly nested virt.

2) Have libvirt listen to networkmanager changes, and if we see a route pop up that conflicts with a running virtual network, shut down the virtual network. Pretty scary to just stop the network, not positive it's possible (haven't looked at the NM api).

Comment 2 Fedora End Of Life 2013-07-04 06:12:51 UTC

This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 3 Colin Walters 2013-08-16 16:06:17 UTC

Another random idea:

Check in the libvirt %post if we're running in a VM, if so do not enable the libvirtd service by default.

Comment 4 Adam Williamson 2013-08-16 16:58:24 UTC

colin: systemd has a conditional for that, in fact: ConditionVirtualization= . It might be cleaner to use that. Just this in the .service file would achieve what you wanted (it'll stop the service starting up if we're running in a KVM):

ConditionVirtualization=!kvm

Comment 5 Daniel Berrangé 2013-08-16 17:12:35 UTC

We don't want different libvirtd startup policies depending on whether or not libvirtd is inside a guest. We have users who have nested KVM or use LXC inside guests.

Comment 6 Colin Walters 2013-08-16 17:15:26 UTC

(In reply to Adam Williamson from comment #4)
> colin: systemd has a conditional for that, in fact: ConditionVirtualization=
> . It might be cleaner to use that. Just this in the .service file would
> achieve what you wanted (it'll stop the service starting up if we're running
> in a KVM):
> 
> ConditionVirtualization=!kvm

The problem is that it is valid to use libvirt inside a VM; for many reasons:

1) libvirt can manage containers, containers can run in a VM
2) You may want to test libvirt, and having to override the systemd unit file to explicitly disable the ConditionVirtualization would be annoying
3) Nested virt is coming

So I think we need a solution that makes the whole thing more dynamic - *only* when you start Boxes do you get the nested libvirt, for example.

(This is all doubly ironic because Boxes uses qemu:///session which doesn't even make use of the NAT network libvirt starts for qemu:///system)

Comment 7 Colin Walters 2013-08-16 17:28:42 UTC

So...another option.  Right now the libvirt.service file has a comment saying it doesn't do socket activation because libvirt wants to spawn any guests configured to autostart.

But what if libvirt wrote out unit files for these guests, with e.g.:
ExecStart=virsh -c qemu:///system start foo
Then they'd socket activate libvirtd.

And in the normal case of not having any guests configured to autostart, libvirtd wouldn't start either.

Comment 8 Daniel Berrangé 2013-08-16 17:49:45 UTC

Yep, we'd like to see if we can integrate with systemd for autostart of guests, storage, networks, & the like, instead of doing it inside libvirtd. No work has been done on this yet though.

Comment 9 Adam Williamson 2014-06-05 00:48:04 UTC

fwiw, I'm seeing this (I think it's this) in Rawhide lives ATM. Just built one to test something else, and on some boots of that live image in a VM, I have a running dnsmasq process belonging to libvirtd.service in the guest and can't get out to the internet. (On other boots of the same live image, no dnsmasq process, and internet access works).

Comment 10 satellitgo 2014-06-05 17:49:38 UTC

Saw this in a bare-metal install of F21 workstation-live (Rawhide) x86_64 20140604 from DVD.
Only bridged networking is available. I set up wired and wireless in the running live DVD first, and it did not transfer the settings to the install. Unable to add a connection in Settings/Network in GNOME: no eth0 or wireless in the (+) add drop-down.

Comment 11 Adam Williamson 2014-06-05 19:17:52 UTC

satellit: I don't think that is the same problem at all, this particular bug is specific to VMs, I believe. Can you report it separately?

Comment 12 satellitgo 2014-06-06 00:03:54 UTC

(In reply to Adam Williamson from comment #11)
> satellit: I don't think that is the same problem at all, this particular bug
> is specific to VMs, I believe. Can you report it separately?

https://bugzilla.redhat.com/show_bug.cgi?id=1105358

Comment 13 Adam Williamson 2014-07-04 23:27:45 UTC

Just seen this again in a fresh Rawhide install, preventing network access from the guest. Proposing as an F21 Beta blocker, per the virt criteria - https://fedoraproject.org/wiki/Fedora_21_Beta_Release_Criteria#Virtualization_requirements . OK, it *works*, but no network access is a pretty bad bug.

Comment 14 Adam Williamson 2014-07-21 23:49:58 UTC

so, yeah, s/occasionally/often/ . I did a whole bunch of F21/F22 live image boots to test https://bugzilla.redhat.com/show_bug.cgi?id=1121301 on Friday, and hit this bug more often than not. Any chance anyone might feel like fixing it?

For anyone playing along at home, to get networking working, do:

ip link set virbr0 down
brctl delbr virbr0

Comment 15 Cole Robinson 2014-07-27 21:34:21 UTC

(In reply to Adam Williamson (Red Hat) from comment #14)
> so, yeah, s/occasionally/often/ . I did a whole bunch of F21/F22 live image
> boots to test https://bugzilla.redhat.com/show_bug.cgi?id=1121301 on Friday,
> and hit this bug more often than not. Any chance anyone might feel like
> fixing it?
> 
> For anyone playing along at home, to get networking working, do:
> 
> ip link set virbr0 down
> brctl delbr virbr0

The libvirty way is:

sudo virsh net-destroy default
sudo virsh net-autostart --disable default

Will fix things to not mess up on reboot as well. But not sure if those commands will make network connectivity magically return or if NetworkManager will need a restart.
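
To check from inside the guest whether this bug is what you're hitting before running the commands above, something like the following might help. It's only a sketch: `duplicate_route_dests` is a made-up helper name (not part of libvirt or iproute2), and it assumes the usual `ip route show` line layout of destination, then `dev`, then the device name. The sample routes are illustrative, modelled on a guest whose NIC and virbr0 both claim 192.168.122.0/24.

```shell
#!/bin/sh
# Hypothetical helper: print every route destination that is claimed by
# more than one device. $1 is the text produced by "ip route show".
duplicate_route_dests() {
    printf '%s\n' "$1" | awk '
        # Only look at directly connected routes: "<dest> dev <device> ..."
        $1 != "default" && $2 == "dev" {
            if (($1 in dev) && dev[$1] != $3) dup[$1] = 1
            dev[$1] = $3
        }
        END { for (d in dup) print d }'
}

# Illustrative sample, not captured from a real system.
sample='default via 192.168.122.1 dev ens2  proto static  metric 1024
192.168.122.0/24 dev ens2  proto kernel  scope link  src 192.168.122.62
192.168.122.0/24 dev virbr0  proto kernel  scope link  src 192.168.122.1'

duplicate_route_dests "$sample"    # prints 192.168.122.0/24

# On a live guest you would pass the real table instead:
#   duplicate_route_dests "$(ip route show)"
```

If the subnet printed is the one virbr0 sits on, the guest's own NIC and the nested default network are fighting over the same range.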

I just sent a mail to libvir-list asking for ideas and/or volunteers:

http://www.redhat.com/archives/libvir-list/2014-July/msg01379.html

Comment 16 Adam Williamson 2014-08-27 18:22:00 UTC

*** Bug 1134544 has been marked as a duplicate of this bug. ***

Comment 17 Laine Stump 2014-08-27 21:16:41 UTC

This is the same problem as RHEL7 Bug 956891

Comment 18 Elad Alfassa 2014-08-28 14:34:04 UTC

Note that I don't recommend this workaround for people who have pre-existing machines in GNOME Boxes - it fixes networking for new machines but makes old machines (that are configured for bridged networking and not usermode networking) fail to boot :(

Comment 19 Adam Williamson 2014-08-29 06:31:58 UTC

you're meant to run those commands in the *guest*, not the host. running them on the host would kill networking for all guests that use the 'default' NAT network, of course.

Comment 20 Cole Robinson 2014-09-09 13:13:53 UTC

*** Bug 1136765 has been marked as a duplicate of this bug. ***

Comment 21 Jens Petersen 2014-09-10 06:50:35 UTC

(I think it wouldn't hurt to have a Fedora bug too to track this
since this is affecting F21 Alpha guests frequently.)

Anyway what is the simplest workaround for F21 Alpha?
"yum remove libvirt-client; reboot"?
(I mean assuming one doesn't want guest virt.)

Should this issue be an F21 Alpha blocker or Common_Bug at least?
When one hits this for the first time it is not very obvious
what is going on...

Comment 22 Adam Williamson 2014-09-10 06:56:20 UTC

jens: virt stuff is beta (not alpha) blocking by policy (all the criteria are at beta). I support alpha FE in theory, but it seems like it's not a simple bug to fix or it'd've been fixed by now, so I suspect any fix that comes will be too large for a freeze exception at this point.

Comment 23 Jens Petersen 2014-09-10 07:41:32 UTC

Okay fair enough, thanks.

I tried "yum remove libvirt-client; reboot" and that seemed to get
the network working again on my F21 Alpha guest (virbr0 is gone).

Comment 24 Cole Robinson 2014-09-10 13:22:46 UTC

Sorry, thought this was a Fedora bug, changing product.

Comment 25 Laine Stump 2014-09-10 13:56:39 UTC

I was thinking about this again last night. In the past we've tried the following things:

1) add code to libvirt to prevent starting a virtual network if it creates a route that conflicts with an existing route (same subnet + netmask).

2) split the default network config off into a separate sub package (libvirt-daemon-config-network) and tell people setting up "spins" that when they want libvirt installed, but don't want the mess caused by the default network, they shouldn't install that package.

Both of these items were useful, but didn't solve the problem - (1) is ineffective because the guest's network interface is often configured to use dhcp, and libvirt is often started prior to dhclient acquiring the conflicting IP address (and there is no reasonable way to know beforehand if that is going to happen). (2) doesn't work because many people running nested virt actually *do* want the default virtual network to exist, so it seems that nobody has actually removed libvirt-daemon-config-network from their list of packages.

People have also proposed a) not installing the default network config if running in a virtual guest, and b) changing the address of the default network if running in a virtual guest. (a) doesn't seem useful, since as I said many people really do want a default network in those cases, and (b) could have exactly the opposite of the desired effect if someone had already modified their L1 host's default network config (in anticipation of this problem) to, for example, 192.168.123.0/24, and our "fix" also happened to use 192.168.123.0/24 for the guest's default network.

So how about this:

We modify the rpm script that creates the default network so that it checks for conflicting routes at install time, and if there is a conflict, it searches through the 192.168.x.0/24 networks (starting at 122, for historical reasons) until it finds one that has no currently conflicting route. I'm guessing that during install, the guest will already have its network interfaces configured and up, so we wouldn't run into the same runtime race we have now, and I'm also assuming that at least 99% of people installing an OS will keep the network config they used during installation.

Does this sound like an acceptable solution? If so, I'll look into implementing that - we already modify the default network's <uuid> element in the "%post daemon-config-network" section of the rpm, so it wouldn't be a huge leap to modify the network address as well (although it could be a bit complicated to do with a bash script :-P)
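
A rough sketch of the kind of search I have in mind, to make the idea concrete. This is not the code from the spec file: `find_free_subnet` is a hypothetical name, the upper bound of 254 is my assumption, and the sample route table is made up for illustration. It only looks at the destination column of "ip route show", so a route for a wider or narrower prefix covering the same addresses would not be noticed.

```shell
#!/bin/sh
# Hypothetical helper, not the scriptlet from libvirt.spec.
# $1: output of "ip route show"
# $2: first value of x to try in 192.168.x.0/24 (e.g. 122)
# Prints the first x in [$2, 254] with no route to 192.168.x.0/24;
# prints nothing and returns 1 if every candidate is taken.
find_free_subnet() {
    routes=$1
    sub=$2
    nl='
'
    # Keep only the destination column, padded with a newline on both
    # sides so that the first and last entries are delimited too.
    dests="${nl}$(printf '%s\n' "$routes" | cut -d' ' -f1)${nl}"
    while [ "$sub" -le 254 ]; do
        case $dests in
            *"${nl}192.168.${sub}.0/24${nl}"*) sub=$((sub + 1)) ;;
            *) printf '%s\n' "$sub"; return 0 ;;
        esac
    done
    return 1
}

# Illustrative sample: a guest whose NIC already sits on 192.168.122.0/24.
sample='default via 192.168.122.1 dev ens2  proto static  metric 1024
192.168.122.0/24 dev ens2  proto kernel  scope link  src 192.168.122.62'

find_free_subnet "$sample" 122    # prints 123
```

The install script would then substitute the chosen value into the default network's address and dhcp range before writing default.xml.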

Comment 26 Cole Robinson 2014-09-10 14:01:22 UTC

Sounds crazy, but it's basically what we are forced to do :) So if it works, I'm all for it.

Comment 27 Daniel Berrangé 2014-09-10 14:05:11 UTC

We've thought of doing that before, but one of the limitations of that is that it does not help pre-built cloud images, since the place you are building those is not the same as the place you are deploying them.

Comment 28 Elad Alfassa 2014-09-10 14:08:55 UTC

(In reply to Laine Stump from comment #25)
> so it seems that nobody has actually removed libvirt-daemon-config-network from their list of packages.


libvirt-daemon-config-network is a gnome-boxes dependency so it will be installed by default (at least on Workstation)  - it is required for bridged networking which works much better than usermode networking. People running Workstation on real hardware should be able to enjoy fast networking in their VM without having to install extra packages.

Comment 29 Laine Stump 2014-09-10 14:39:03 UTC

(In reply to Daniel Berrange from comment #27)
> We've thought of doing that before, but one of the limitations of that is
> that it does not help pre-built cloud images, since the place you are
> building those is not the same as the place you are deploying them.

Ugh. Right. I would hope that someone creating a pre-built cloud image would take the time to do more tweaking to the image though, perhaps manually forcing the address to something different.

And in the meantime, although what I'm suggesting doesn't fix the problem for pre-built cloud images, I don't think it would *hurt* anything in that case, would it? So in the end it wouldn't fix the problem for everyone, but could fix it for many of them.

(In reply to Elad Alfassa from comment #28)
> libvirt-daemon-config-network is a gnome-boxes dependency so it will be
> installed by default (at least on Workstation)  - it is required for bridged
> networking which works much better than usermode networking. People running
> Workstation on real hardware should be able to enjoy fast networking in
> their VM without having to install extra packages.

So you're saying that gnome-boxes is installed by default on Fedora Workstation, correct?

This is a good example of why the idea of "don't install the default network config package" wasn't a generally useful solution.

Comment 30 Elad Alfassa 2014-09-10 14:40:26 UTC

(In reply to Laine Stump from comment #29)
> So you're saying that gnome-boxes is installed by default on Fedora
> Workstation, correct?

Indeed.

Comment 31 Daniel Berrangé 2014-09-10 14:56:33 UTC

(In reply to Laine Stump from comment #29)
> (In reply to Elad Alfassa from comment #28)
> > libvirt-daemon-config-network is a gnome-boxes dependency so it will be
> > installed by default (at least on Workstation)  - it is required for bridged
> > networking which works much better than usermode networking. People running
> > Workstation on real hardware should be able to enjoy fast networking in
> > their VM without having to install extra packages.
> 
> So you're saying that gnome-boxes is installed by default on Fedora
> Workstation, correct?
> 
> This is a good example of why the idea of "don't install the default network
> config package" wasn't a generally useful solution.

It is certainly useful from the POV of OpenStack or oVirt, both of which have deployment approaches that avoid pulling in the default network.  Fedora Workstation isn't the only usage scenario we need to care about.

Comment 32 Laine Stump 2014-09-10 15:08:30 UTC

Definitely. It wasn't my intent to imply that it isn't useful for anybody (although I guess my statement can be parsed that way), just that it doesn't solve the problem for *everybody* :-)

Comment 33 Adam Williamson 2014-09-10 16:05:11 UTC

Sounds workable, but then we would need to have libvirt up (with the default network running) on the box running live composes, I believe, in order to fix the live image case.

Comment 34 Adam Williamson 2014-09-10 16:12:50 UTC

er, oh. and I don't think it'll cover non-live ('traditional') installs either, as libvirt is not going to be running in the anaconda environment. option b) above actually seems viable to me, if you pick a sufficiently non-likely alternative network. People are fairly predictable; I suspect those who thought ahead would be very likely to a) increment by 1 (123), b) double (244), c) bump one digit (132 or 222)...something pattern-y like that. So if you choose something completely odd - and literally odd, humans like even numbers - like 197, I suspect it'd work fine for a very large number of cases.

It does presumably depend on systemd's virt detection and thus may not help some obscure virt environments, might have bugs where the virt detection doesn't fire, but I think it'd fly for rather a lot of the cases.

Also, on 1):

"and libvirt is often started prior to dhclient acquiring the conflicting IP address (and there is no reasonable way to know beforehand if that is going to happen)"

this seems like something that should be vulnerable to attack? what's the problem with using service ordering here, so libvirt comes up after network is up? yes, it could result in a delay to libvirt init when network config is messily broken, but that doesn't seem like the end of the world.

Comment 35 Laine Stump 2014-09-10 17:55:40 UTC

I see the problem with the live image composes (which is similar to the problem Daniel points out with pre-built cloud images), but in the case of non-live installs, libvirtd does *not* need to be running during the time that anaconda is installing packages. The only assumption is that the system on which the installation is taking place has ifup'ed its network interfaces so that the "%post libvirt-daemon-config-network" section of the specfile will see the conflict when it runs a short script looking for "192.168.122.0/24" in the output of "ip route show". I've coded this up and it does work.

BTW, my option (b) above (change the address of the default network to some fixed alternate when installing in a virtual machine) would not work for the case of a live image compose unless the live image compose was itself taking place in a virtual machine (but then the address would be "wrong" for the cases where someone booted the live image on real hardware).

Bah. There simply is no silver bullet for this problem :-/

I just posted an RFC patch upstream that does what I'm suggesting, so feel free to rip it apart there as well:

https://www.redhat.com/archives/libvir-list/2014-September/msg00581.html

Comment 36 Eric Blake 2014-09-10 18:21:34 UTC

Is the live image composed in a vm or on bare metal?  Would it work to try something as simple as running virt-what as part of the rpm install script, and if bare metal use 192.168.122.0 vs. known virtual use 192.168.123.0?

Comment 37 Adam Williamson 2014-09-10 19:27:17 UTC

yeah, I realized after posting that I was wrong about traditional installs, they would work with your patch.

The live image compose environment is actually mock - a mock chroot in a koji run. This rather strongly limits the amount of messing about we can do with the live compose environment.

I was reading b) as a *run-time* thing, not an *install-time* thing, BTW - I was kinda imagining a setup where the default network's on-disk configuration doesn't specify a subnet but says something like 'default', and then at each boot the environment is checked and that's turned into 122 or 197 (or whatever) depending on the test.

Comment 38 Adam Williamson 2014-09-10 19:35:19 UTC

here's another trial balloon - what if libvirt networks were started on demand, not at boot? don't bring it up until a VM that uses it is started?

that would avoid the problem entirely for people who don't actually run any nested VMs, and for people who do, it would be much more likely that the main network stack would be up at that point, and you could check for conflicts?

Comment 39 Adam Williamson 2014-09-10 19:37:19 UTC

Discussed at 2014-09-10 freeze exception review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2014-09-10/f21-blocker-review.2014-09-10-16.07.log.txt . We agreed to punt on this one, as whether we'd accept a fix through freeze depends heavily on what the fix looks like; we'd like to reserve the ability to do the full review process after seeing any proposed fix that arrives during the freeze.

Comment 40 Jens Petersen 2014-09-11 02:55:45 UTC

(In reply to Adam Williamson (Red Hat) from comment #34)
> what's the problem with using service ordering here, so libvirt comes up
> after network is up?

It sounds like a good idea to me too.

Comment 41 Daniel Berrangé 2014-09-11 08:51:27 UTC

(In reply to Adam Williamson (Red Hat) from comment #38)
> here's another trial balloon - what if libvirt networks were started on
> demand, not at boot? don't bring it up until a VM that uses it is started?

That doesn't work with all of the libvirt hypervisors, specifically with the XenD based virt driver guests can be started directly by talking to XenD and so libvirt has no way to hook in to start the network.

Comment 42 Daniel Berrangé 2014-09-11 08:58:04 UTC

(In reply to Daniel Berrange from comment #41)
> (In reply to Adam Williamson (Red Hat) from comment #38)
> > here's another trial balloon - what if libvirt networks were started on
> > demand, not at boot? don't bring it up until a VM that uses it is started?
> 
> That doesn't work with all of the libvirt hypervisors, specifically with the
> XenD based virt driver guests can be started directly by talking to XenD and
> so libvirt has no way to hook in to start the network.

Oh and it would also be a major incompatible semantic change to the libvirt APIs for (auto-)starting networks, so that pretty much rules it out.

Comment 43 Laine Stump 2014-09-14 21:18:58 UTC

Another use case where starting networks only on demand would break existing use of libvirt networks:

Although it is making wild assumptions, gnome-boxes has a mode that uses the qemu bridge helper to connect a network interface of a session-mode libvirt to the bridge named virbr0, e.g.:

    <interface type='bridge'>
      <source bridge='virbr0'/>
      ...

It does this because that's the only way a qemu instance started by unprivileged libvirt can have any kind of network connection other than qemu usermode networking. So a process with no ability to autostart the network is assuming that the network is already started.

Comment 44 Laine Stump 2014-09-14 21:52:28 UTC

Since a couple of people on libvir-list agreed that it would cause more good than harm,

V1: https://www.redhat.com/archives/libvir-list/2014-September/msg00581.html
V2: https://www.redhat.com/archives/libvir-list/2014-September/msg00797.html

 I just pushed the patch I discussed above to libvirt upstream:

http://libvirt.org/git/?p=libvirt.git;a=commit;h=5f71959667e4902d738a849e7c9391e794fccf22

Because Kashyap provided personal confirmation that some people are already pre-setting their default networks to 192.168.124.0/24 in the L1 (or is it L0?) host, I start scanning for an open subnet at 192.168.124.0/24.

Should we add that patch to the fedora rawhide (and F21) libvirt builds to see how it fares in the field?

This still leaves the problem of the Fedora Live images (and other similar pre-built images that are constructed in a different environment than that in which they eventually run). Since the live image compose is done in a minimal "mock" environment that can't have the libvirt default network turned on (or I guess even have a dummy static route added), we need to do something else either at runtime or install time. So far the thing that sounds the least damaging to other scenarios is to make libvirtd start depend on networking being completely *up* (not just started). It seems like someone may have had a reason why this shouldn't/couldn't be done though, can someone remind me of that?

Comment 45 Kashyap Chamarthy 2014-09-15 11:12:32 UTC

(In reply to Laine Stump from comment #44)
> Since a couple of people on libvir-list agreed that it would cause more good
> than harm,
> 
> V1: https://www.redhat.com/archives/libvir-list/2014-September/msg00581.html
> V2: https://www.redhat.com/archives/libvir-list/2014-September/msg00797.html
> 
>  I just pushed the patch I discussed above to libvirt upstream:
> 
> http://libvirt.org/git/?p=libvirt.git;a=commit;h=5f71959667e4902d738a849e7c9391e794fccf22
> 
> Because Kashyap provided personal confirmation that some people are already
> pre-setting their default networks to 192.168.124.0/24 in the L1 (or is it
> L0?) host, 

It's on L0 (the physical host) - so that I could avoid the conflict on L1 (the guest hypervisor).  And if L0 is already running w/ the default libvirt network, then I change the 'default' network in L1.

> I start scanning for an open subnet at 192.168.124.0/24.
> 
> Should we add that patch to the fedora rawhide (and F21) libvirt builds to
> see how it fares in the field?

I just pulled the latest libvirt git master and built local RPMs and doing a little test with KVM-based nested virtualization. Result upcoming. . .

Comment 46 Kashyap Chamarthy 2014-09-15 12:25:43 UTC

I just pulled the latest libvirt master and built RPMs[*] for F20 and
did the below test with nested virtualization:

0. git commit info:

    $ git log | head -1
    commit d00c6fd25854bfd4822f6ce3d769a8ca132ec31b

1. L0 (running Fedora-21) has a Fedora-20 guest (called: 'guest-hyp')
   booted with default libvirt network:

   Route info for libvirt bridge(s) on L0:

    $ ip route show | grep virbr
    192.168.122.0/24 dev virbr0  proto kernel  scope link  src 192.168.122.1


3. Install the latest libvirt RPMs (built from git which has the commit
   mentioned in comment #44) in L1 ('guest-hyp'); start libvirtd and
   enumerate IP address and route info:

    $ ip a | grep ens2
    2: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.122.62/24 scope global dynamic ens2

    $ ip route show
    default via 192.168.122.1 dev ens2  proto static  metric 1024 
    192.168.122.0/24 dev ens2  proto kernel  scope link  src 192.168.122.62 

3.1 Examine the default network info in L1 (guest-hyp):

    $ virsh net-dumpxml --inactive default
    <network>
      <name>default</name>
      <uuid>c5344f60-7568-4458-860f-e1808ddb2c85</uuid>
      <forward mode='nat'/>
      <bridge name='virbr0' stp='on' delay='0'/>
      <ip address='192.168.122.1' netmask='255.255.255.0'>
        <dhcp>
          <range start='192.168.122.2' end='192.168.122.254'/>
        </dhcp>
      </ip>
    </network>


Shouldn't it have replaced the 122 with 124 in the above range? Or
I missed something here?

That's the Koji scratch build I made with an SRPM built from current git
(only change I made to the SPEC - comment out wireshark packages), if
someone wants to test:

    [*] http://koji.fedoraproject.org/koji/taskinfo?taskID=7578232

Comment 47 Kashyap Chamarthy 2014-09-15 12:52:16 UTC

(In reply to Kashyap Chamarthy from comment #46)

> That's the Koji scratch build I made with an SRPM built from current git
> (only change I made to the SPEC - comment out wireshark packages), if
> someone wants to test:
> 
>     [*] http://koji.fedoraproject.org/koji/taskinfo?taskID=7578232

Sorry, the above scratch build failed due to an ARM build failure, but it built locally for x86_64, so posting the SRPM I used to build the libvirt RPMs instead:

    https://kashyapc.fedorapeople.org/temp/libvirt-1.2.9-1.fc20.src.rpm

Comment 48 Kashyap Chamarthy 2014-09-15 15:39:57 UTC

A small update from testing.

From a conversation with Laine on IRC, the SPEC file was missing the below change.
===============================================
diff --git a/libvirt.spec.in b/libvirt.spec.in
index c2e2be4..bec3a50 100644
--- a/libvirt.spec.in
+++ b/libvirt.spec.in
@@ -1737,7 +1737,7 @@ if test $1 -eq 1 && test ! -f %{_sysconfdir}/libvirt/qemu/networks/default.xml ;
     sub=${orig_sub}
     nl='
 '
-    routes="${nl}$(ip route show | cut -d' ' -f1)"
+    routes="${nl}$(ip route show | cut -d' ' -f1)${nl}"
     case ${routes} in
       *"${nl}192.168.${orig_sub}.0/24${nl}"*)
         # there was a match, so we need to look for an unused subnet
===============================================

So, with the above new fix, I tested again: (a) Rebuilt RPMs with this change (b) Removed all old libvirt RPMs from a guest, and removed all instances of 'default.xml' from /etc/libvirt/qemu/networks (c) Installed the newly rebuilt RPMs (d) Started libvirtd 

Result: Despite having an existing route with the default subnet 192.168.122.0/24, 'default.xml' remains unchanged, with no new value.

I wonder what else is missing here.

Comment 49 Kashyap Chamarthy 2014-09-15 17:38:37 UTC

Sorry, it turned out to be human error (too many virtual machines).

The fix in comment #48 works just fine: noticing the existing route of
192.168.122.0/24, the libvirt default network is now created on the next
free range (starting with 192.168.124.0/24).

Enumerate the route info in the L1 guest:

  $ ip route show
  default via 192.168.122.1 dev ens2  proto static  metric 1024 
  192.168.122.0/24 dev ens2  proto kernel  scope link  src 192.168.122.62 
  192.168.124.0/24 dev virbr0  proto kernel  scope link  src 192.168.124.1

Examine the default libvirt network contents on L1 guest:

  $ virsh net-dumpxml default
  <network>
    <name>default</name>
    <uuid>803a6122-a6f8-4ca5-8a94-32f4f0dfb697</uuid>
    <forward mode='nat'>
      <nat>
        <port start='1024' end='65535'/>
      </nat>
    </forward>
    <bridge name='virbr0' stp='on' delay='0'/>
    <ip address='192.168.124.1' netmask='255.255.255.0'>
      <dhcp>
        <range start='192.168.124.2' end='192.168.124.254'/>
      </dhcp>
    </ip>
  </network>


So:

  Tested-By: Kashyap Chamarthy <kchamart>

Comment 50 Laine Stump 2014-09-15 18:47:35 UTC

Okay, I've pushed the fix that Kashyap tested above:

commit 22048ae61dbb7876d17bcf7dbedf9e8d1cf98d4e
Author: Laine Stump <laine>
Date:   Mon Sep 15 13:30:08 2014 -0400

    network: detect conflicting route even if it is the final entry
    
    This is a followup to commit 5f719596, which checks for a route
    conflicting with the standard libvirt default network subnet
    (192.168.122.0/24). It turns out that $() strips the trailing newline
    from the output of "ip route show", so there would be no match if the
    route we were looking for was the final line of output. This can be
    solved by adding ${nl} to the end of the output (just as we were
    already adding it at the beginning of the output).

So this patch, along with commit 5f719596, would need to be backported to get the proper fix.
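
For anyone who wants to see the trailing-newline behaviour in isolation, here is a small self-contained demonstration. It mirrors the pattern from the spec file but is not the spec file code: the `matches` helper and the hard-coded two-line sample standing in for the destination column of "ip route show" are mine, for illustration only.

```shell
#!/bin/sh
nl='
'
# Stand-in for the destination column of "ip route show", with the
# conflicting route on the final line. $() drops the trailing newline.
out=$(printf 'default\n192.168.122.0/24\n')

unpadded="${nl}${out}"          # what the first version of the check built
padded="${nl}${out}${nl}"       # with ${nl} appended as well

# Hypothetical helper: does $1 contain the route as a whole line?
matches() {
    case $1 in
        *"${nl}192.168.122.0/24${nl}"*) echo yes ;;
        *) echo no ;;
    esac
}

matches "$unpadded"    # prints: no  (nothing follows the final line)
matches "$padded"      # prints: yes
```

Anchoring the pattern with a newline on both sides is what stops 192.168.122.0/24 from also matching inside a longer string, which is why the output has to be padded at both ends rather than dropping the trailing ${nl} from the pattern.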

Thanks for the testing/debugging help, Kashyap!

Comment 51 Fedora Update System 2014-09-15 19:57:56 UTC

libvirt-1.2.8-2.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/libvirt-1.2.8-2.fc21

Comment 52 Fedora Update System 2014-09-16 18:43:04 UTC

Package libvirt-1.2.8-2.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libvirt-1.2.8-2.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-10844/libvirt-1.2.8-2.fc21
then log in and leave karma (feedback).

Comment 53 Fedora Update System 2014-09-17 16:09:52 UTC

libvirt-1.2.8-3.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/libvirt-1.2.8-3.fc21

Comment 54 Fedora Update System 2014-09-18 20:49:24 UTC

libvirt-1.2.8-4.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/libvirt-1.2.8-4.fc21

Comment 55 Fedora Update System 2014-09-23 04:20:47 UTC

libvirt-1.2.8-4.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 56 Cole Robinson 2014-09-24 22:55:25 UTC

FWIW, since the issue still persists in some cases, I opened a new bug to track the issue upstream, with a summary of bits up until now. 

https://bugzilla.redhat.com/show_bug.cgi?id=1146284
