Bug 1146232 - no VM networking; 'default' network in the VM conflicts with 'default' network on the host
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 27
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: https://fedoraproject.org/wiki/Common...
Duplicates: 1208581 1321703 1370906 1432756 (view as bug list)
Depends On:
Blocks:
 
Reported: 2014-09-24 20:08 UTC by M. Edward (Ed) Borasky
Modified: 2019-10-17 08:36 UTC (History)
CC List: 31 users

Fixed In Version: libvirt-3.2.1-3.fc26
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-30 20:34:14 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
screenshot of systemctl -l status libvirtd (116.25 KB, image/png)
2014-09-24 20:39 UTC, M. Edward (Ed) Borasky
no flags Details
Log of yum history after update (91.90 KB, text/plain)
2014-09-24 22:09 UTC, M. Edward (Ed) Borasky
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 811967 0 unspecified CLOSED libvirt in a VM often brings up 'default' network when it shouldn't, kills vm networking 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1146284 0 unspecified CLOSED libvirt in a VM often brings up 'default' network when it shouldn't, kills vm networking (upstream) 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1164492 0 unspecified CLOSED Please drop libvirt 'default' network dependency for F28 GA (also Beta?), disrupts livecd networking 2021-02-22 00:41:40 UTC

Internal Links: 1146284 1164492

Description M. Edward (Ed) Borasky 2014-09-24 20:08:15 UTC
Description of problem: I installed Fedora 21 and the Virtualization group as described for "Virtualization Test Day" (https://fedoraproject.org/wiki/Test_Day:2014-09-25_Virtualization). I started Virtual Machine Manager and went through the VM creation process using the Fedora 21 Alpha Workstation ISO and the default NAT networking. The guest boots fine but in the upper right the network status is a question mark. I opened a terminal and "ping fedoraproject.org" responded "ping: unknown host fedoraproject.org". The "Network Settings" dialog shows a DNS of 192.168.122.1, which is what I usually see for the default NAT connection with Fedora 20. I can ping 8.8.8.8 but adding 'nameserver 8.8.8.8' doesn't give me full network capability inside the guest. Host networking is fine - I'm posting this bug report from the host.


Version-Release number of selected component (if applicable): virt-manager.noarch        1.1.0-1.fc21             @updates-testing


How reproducible: See above - install virt-manager and use the default NAT networking to create a guest Fedora 21 machine.

Actual results: non-functioning guest NAT networking


Expected results: functioning NAT networking

Comment 1 Cole Robinson 2014-09-24 20:13:57 UTC
Thanks for trying the test cases!

Is libvirtd installed inside the guest? What's the output of sudo virsh net-list --all inside the VM?

Comment 2 M. Edward (Ed) Borasky 2014-09-24 20:37:28 UTC
(In reply to Cole Robinson from comment #1)
> Thanks for trying the test cases!
> 
> Is libvirtd installed inside the guest? What's the output of sudo virsh
> net-list --all inside the VM?

I haven't installed the workstation into the VM yet - this is running from the LiveDVD iso file. I haven't been able to copy out of the VM but I do have a screenshot (attachment coming). libvirtd is installed and active, and 'virsh net-list --all' shows Name=default, State=active, Autostart=yes and Persistent=yes.

Comment 3 M. Edward (Ed) Borasky 2014-09-24 20:39:27 UTC
Created attachment 940905 [details]
screenshot of systemctl -l status libvirtd

I'll try to capture this to a file if you can't read it from the image

Comment 4 Cole Robinson 2014-09-24 20:47:55 UTC
Ahh that explains it. We are hitting bits related to this issue again:

https://bugzilla.redhat.com/show_bug.cgi?id=811967

I believe the Workstation Live CD pulls in libvirt's networking by default, which is conflicting with the VM's route. If you actually install packages into the guest, the libvirt RPM should avoid this at install time, but the livecd image isn't smart enough to do that.

You should be able to restore networking with something inside the VM like: sudo virsh net-destroy default && sudo virsh net-undefine default. Then toggle networking in NetworkManager

Comment 5 M. Edward (Ed) Borasky 2014-09-24 21:01:22 UTC
As soon as I typed "sudo virsh net-destroy default" in the live VM, Network Manager showed a working wired connection! And 'ping fedoraproject.org' works. But you should be able to create a VM, boot a Live ISO in it and have working NAT networking. So this is a bug against something other than Virtual Machine Manager. I'm going to go ahead and do the install to see if the installed machine has working NAT.

Comment 6 Cole Robinson 2014-09-24 21:03:53 UTC
I agree, it definitely should not be like that. I'll follow up in a bit and likely reassign this bug.

Comment 7 M. Edward (Ed) Borasky 2014-09-24 21:17:31 UTC
Install is done and the installed guest shows the same symptoms as the live DVD machine did. 'virsh net-destroy default' gives me a working network. I've got log files if you need them - USB stick mounts on the guest just fine.

Comment 8 M. Edward (Ed) Borasky 2014-09-24 22:09:53 UTC
Created attachment 940936 [details]
Log of yum history after update

I booted the VM, did a 'sudo virsh net-destroy default' to connect to the Internet. Then I ran 'yum update' and rebooted. Many packages were updated, including NetworkManager and the virtualization libraries. After the reboot I ran 'yum history info 2' to get this log. The problem is still there after updating to the latest packages!

Comment 9 Kamil Páral 2014-09-25 12:29:12 UTC
I'm transferring Adam's Beta Blocker nomination from bug 811967 comment 13 into this bug. According to bug 811967 comment 56 and bug 1146284, this problem has been resolved in an installed system, but still persists on LiveCDs. It might still be violating that criterion (broken network in a LiveCD by default is quite a problem), although this might also be seen as not critical enough and re-proposed for Final. Adding for discussion.

Comment 10 Cole Robinson 2014-09-25 12:33:54 UTC
The 'easy' solution is to revert gnome-boxes dep on libvirt-daemon-config-network (bug 1081762), as it was in F20 and earlier. boxes functions without it, just with the crappy usermode networking.

That said I'm going to look into why the original mitigation patch for libvirt isn't working anymore, there might be something obvious we are missing.

Comment 11 Laine Stump 2014-09-25 17:50:12 UTC
My guess is that it is still "working", but the situation here is one that it wasn't expected to fix.

The failure of networking in a LiveCD that is booted in a virtual machine is unfortunately not fixed with the patch I pushed for Bug 811967; that was pointed out at the time.

As for a regular Workstation installation having non-working networking; I'm guessing you made this installation by booting the LiveCD, destroying libvirt's default network to get the network up, then clicking on the "install Fedora to disk" button (or whatever it's called). Is it possible that installing from a LiveCD boot makes a shortcut of transferring the existing LiveCD system over to the disk rather than installing everything from scratch? Or maybe the network wasn't up during the time you did the installation? If the running OS has an interface with an address on 192.168.122.0/24 at the time the libvirt packages are installed, the network address of libvirt's default network will be changed to avoid conflict.
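(For what it's worth, a quick way to confirm whether that is what happened - a sketch, assuming the libvirt client tools are available on both sides - is to compare the two subnets directly:

  ip addr show virbr0                                    # on the host: note the subnet, e.g. 192.168.122.1/24
  sudo virsh net-dumpxml default | grep "ip address"     # in the guest: compare the <ip address=.../> range

If both show the same 192.168.122.x range, the guest's 'default' network is shadowing the host's and the symptoms above are expected.)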

Even if this explains away the problem, we still need to think of something to give us a LiveCD that *always* has proper networking.

Comment 12 M. Edward (Ed) Borasky 2014-09-25 21:20:02 UTC
(In reply to Cole Robinson from comment #10)
> The 'easy' solution is to revert gnome-boxes dep on
> libvirt-daemon-config-network (bug 1081762), as it was in F20 and earlier.
> boxes functions without it, just with the crappy usermode networking.
> 
> That said I'm going to look into why the original mitigation patch for
> libvirt isn't working anymore, there might be something obvious we are
> missing.

FWIW I think Fedora Workstation should ship with Virtual Machine Manager as well as or even in place of Boxes. All the "guts" are already installed because of Boxes; you might as well have the professional user interface too. ;-)

Comment 13 Scott Dowdle 2014-09-25 21:30:33 UTC
@12 - That stuff is also installed by default in the 32-bit build... which I don't think is worth much and only causes trouble.  That stuff should be yanked from the 32-bit build.

Comment 14 Zeeshan Ali 2014-10-02 16:32:41 UTC
(In reply to M. Edward (Ed) Borasky from comment #12)
> (In reply to Cole Robinson from comment #10)
> > The 'easy' solution is to revert gnome-boxes dep on
> > libvirt-daemon-config-network (bug 1081762), as it was in F20 and earlier.
> > boxes functions without it, just with the crappy usermode networking.
> > 
> > That said I'm going to look into why the original mitigation patch for
> > libvirt isn't working anymore, there might be something obvious we are
> > missing.
> 
> FWIW

Not really worth anything without any arguments at all. :)

> I think Fedora Workstation should ship with Virtual Machine Manager as
> well as or even in place of Boxes. All the "guts" are already installed
> because of Boxes; you might as well have the professional user interface
> too. ;-)

It's more "professional" because it exposes *all* libvirt configuration and gives you a way to shoot yourself in the foot? I don't agree with that definition of "professional". I understand that virt-manager has its place, but not on every workstation, where you typically need it for working around some bugs, all of which can be achieved with virsh.

So I think we are already shipping everything most workstation users will need: an impressive GUI that makes VM handling very easy and a command-line tool for a rainy day.

Comment 15 Adam Williamson 2014-10-03 16:44:40 UTC
Laine: "Is it possible that installing from a LiveCD boot makes a shortcut of transferring the existing LiveCD system over to the disk rather than installing everything from scratch?"

Yes, that's exactly what it does. There are no 'packages' within the live image, basically. The live image creation process installs a bunch of packages into a chroot (well, a mock chroot) and then 'freezes' the contents of that chroot as a filesystem image, basically. That's the information the live image contains. It knows what packages are 'installed' into it, but it cannot then 'install' those packages anywhere else, because it doesn't have actual RPM packages to install. When the live image gets 'installed', the whole filesystem image is simply transferred to the target disk. (That's why live install is rather fast compared to a DVD install).

Comment 16 Adam Williamson 2014-10-03 16:59:10 UTC
Discussed at 2014-10-03 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2014-10-03/f21-blocker-review.2014-10-03-15.58.log.txt . Our current assessment is that this doesn't seem to be happening often enough to block Beta (it seems to be happening much less frequently than it did early in the f21 cycle, for some reason - more like the frequency with which it occurred in f19 and f20). So for Beta we think it's OK to document it and the workaround on CommonBugs. But we believe it should block Final release as a partial violation of the Beta criterion "The release must be able to host virtual guest instances of the same release." - it's considered to be rather broken if networking doesn't work.

If a fix arrives during Beta Freeze we may consider breaking freeze for it, depending on the point in the cycle and the nature of the fix, so please nominate it as a freeze exception issue if that occurs.

Comment 17 Laine Stump 2014-10-03 19:27:25 UTC
(In reply to Adam Williamson (Red Hat) from comment #15)
> 
> Yes, that's exactly what it does. There are no 'packages' within the live
> image, basically. The live image creation process installs a bunch of
> packages into a chroot (well, a mock chroot) and then 'freezes' the contents
> of that chroot as a filesystem image, basically. That's the information the
> live image contains. It knows what packages are 'installed' into it, but it
> cannot then 'install' those packages anywhere else, because it doesn't have
> actual RPM packages to install. When the live image gets 'installed', the
> whole filesystem image is simply transferred to the target disk. (That's why
> live install is rather fast compared to a DVD install).

Can I combine both "ARRRRGHHH!!!" and "Cool!" in the same reply? :-) It's an impressive time/space saver that it does that, but it also increases the number of people that my most recent fix won't help :-(

Thinking out loud now:

I'm beginning to think that, as ugly as it may be, we may need to have some sort of "uber-default" network mode - a knob, turned off by default, that does the following:

1) when a network with uberDefault='yes' is started, it will wait to actually set up the bridge until it detects at least one working interface with an IP address (i.e. setting this attribute will eliminate the race between libvirt and NetworkManager, but if that flag *isn't* set, everything will be asynchronous as it is now, thus preventing delays and surprises for normal users). This wait will have some reasonably short timeout to prevent a total lockup on any host that really doesn't have any physical network connection.

2) Once other networking is up (or the timeout occurs), if a conflicting route/interface is found, rather than just refusing to start the network (as is done now), a search can be made for an alternate subnet within some configurable range.

The default network configuration that is included in the libvirt-daemon-config-network package would then be changed to have uberDefault='yes'.

I know that sounds kludgey (because it *is*), but we haven't found a straightforward solution.

How ridiculous does this sound? (realizing that the name of the attribute won't really be "uberDefault" but something more professional sounding, and that there will also be a setting somewhere for the range of subnets to try)

Comment 18 Kamil Páral 2014-10-06 08:35:54 UTC
I don't know whether it is relevant, but just a note - AFAIK just the original LiveCD filesystem is transferred to the installed system, without any changes performed during LiveCD run (those are written to an overlay filesystem, which is not transferred).

Comment 19 Kamil Páral 2014-10-06 12:40:48 UTC
(In reply to Kamil Páral from comment #18)
> I don't know whether it is relevant, but just a note - AFAIK just the
> original LiveCD filesystem is transferred to the installed system, without
> any changes performed during LiveCD run (those are written to an overlay
> filesystem, which is not transferred).

I just talked to Vratislav Podzimek from Anaconda and he says NetworkManager scripts are probably an exception to this - during post-install system configuration, the current configuration is written to the installed system. Therefore, any networking changes done to the live environment will probably affect the installed system.

Comment 20 Adam Williamson 2014-10-16 02:32:03 UTC
well, it depends on exactly what the changes are and exactly what anaconda writes to the installed system.

laine: well, I mean, I've heard of *worse* kludges. I may have written some. :)

Comment 21 M. Edward (Ed) Borasky 2014-10-30 06:03:36 UTC
I'm currently testing with Fedora 21 workstation host and a guest of Fedora-Live-Workstation-x86_64-21_Beta-4.iso. This problem is still there - host is now libvirt-daemon-driver-network.x86_64   1.2.9-3.fc21 @koji-override-0/$releasever. 'sudo virsh net-destroy default' still works to gain network connectivity.

Comment 22 M. Edward (Ed) Borasky 2014-10-30 06:19:31 UTC
I just installed to a VM. I had to do 'sudo virsh net-destroy default' in the installed VM to get networking. :-(

Comment 23 Adam Williamson 2014-10-30 07:20:48 UTC
It's known to be still an issue with live images I think, it's really not straightforward to solve (see all above discussion). I see it fairly infrequently at present, seems like once we get off debug kernels it happens less.

Comment 24 Cole Robinson 2014-11-15 18:47:50 UTC
Interestingly, with the copy of Fedora-Live-Workstation-x86_64-21_Beta-4.iso I have (from official torrent), /etc/libvirt/qemu/networks/default.xml uses 192.168.124.*, and the ctime is for Oct 29. So since IIRC the livecd images are created inside a libvirt+kvm VM, it must have detected the conflicting network and saved a different address range.

However we will still hit issues with people who install their host with F21 livecd, then use the same media to install guests, since they will both be using the 192.168.124 range. Maybe it will still hit less often as Adam suggests.

Comment 25 Cole Robinson 2014-11-15 19:08:04 UTC
When the host default network and the guest default network have the same range, this still seems to hit every time (up to date F21 host, F21 beta livecd guest). So we are still going to hit this in practice for people who install both host and guest from f21 livecd media

Comment 26 Cole Robinson 2014-11-20 15:42:18 UTC
I opened a bug against boxes to temporarily drop the libvirt-daemon-config-network dep for F21 GA:

https://bugzilla.redhat.com/show_bug.cgi?id=1164492

I think that's the only fix we can manage at this point

Comment 27 M. Edward (Ed) Borasky 2014-11-22 22:40:00 UTC
Fixed in TC3!!

Comment 28 Petr Schindler 2014-11-24 12:45:31 UTC
I tested the fix with TC3 and it really solves the problem. Networking is working in the VM.

Comment 29 Adam Williamson 2014-11-24 20:07:16 UTC
Looking at this again, I think it makes most sense to consider #1164492 the blocker here - we don't necessarily want to close this bug when that one's fixed, I don't think, so the dependency relationship seems not the best way to express things. I think it makes most sense to have #1164492 as the blocker, and this bug open but not blocking. Does that make sense to everyone? If not, yell and we can rejig it again.

Tried to revise the summary to be accurate to the current situation.

Comment 30 Cole Robinson 2015-04-02 17:18:33 UTC
Still relevant for f22, where this is going to bite us more often since we are in devel timeframe and 1043129 was reverted, so changing version.

Comment 31 Cole Robinson 2015-04-02 17:28:58 UTC
*** Bug 1208581 has been marked as a duplicate of this bug. ***

Comment 32 Cole Robinson 2015-04-02 17:29:48 UTC
Er wrong bug in Comment #30, I meant the boxes issue bug 1164492

Comment 33 Cole Robinson 2015-04-02 17:38:23 UTC
I reopened bug 1164492 and proposed that as a blocker since it's the easiest fix in the short term.

On IRC laine said he will look at extending libvirt with his suggestion from Comment #17, so hopefully we can get that in before GA

Comment 34 Adam Williamson 2015-04-06 16:56:10 UTC
Discussed at 2015-04-06 blocker review meeting: https://meetbot.fedoraproject.org/fedora-blocker-review/2015-04-06/f22-blocker-review.2015-04-06-16.00.log.txt . We agreed that 1164492 is sufficient for 22 Beta, this bug does not itself need to be a Beta blocker.

Comment 35 Matthias Clasen 2015-04-06 23:06:32 UTC
this is the second instance I'm aware of where 'temporary fix in the run-up to ga' comes back the next release, because no progress was made on the actual fix (the other instance I remember involved /etc/resolv.conf, systemd and anaconda).

Adam, can we figure out some process (involving QA, perhaps) to keep this from recurring?

Comment 36 Adam Williamson 2015-04-06 23:16:38 UTC
I don't believe we ever said this would be 'temporary' in the sense of 'one release only'; it was rather 'temporary' in the sense of 'only to be done for releases'. I think we explicitly *anticipated* that it might be necessary to do it for several releases.

This is a fundamentally difficult situation to 'fix', I'm afraid. There's much more detail on it from the virt folks in the parent bug.

Comment 37 Eric Blake 2015-04-16 21:06:06 UTC
Capturing some ideas from F22 fedora-test-day IRC:

[14:48]	sgallagh	so you're looking for "only runs once, ever"?
[14:48]	eblake_	but something where installing libvirt makes the first boot slow enough to ensure we avoid conflicts, but then rewrites itself so that subsequent boots drop the dependency
[14:49]	eblake_	so the first boot runs only once
[14:49]	sgallagh	For that, I'd use a oneshot that just checked for the presence of a stamp file somewhere and make that run Before=libvirtd.service
[14:49]	eblake_	if I understand, the problem we are trying to avoid is having libvirt always depend on networking
[14:49]	eblake_	but depending on networking for one time only will let us avoid the subnet collision
[14:50]	sgallagh	If the stamp file exists, then have it no-op. If it doesn't exist, have it do whatever setup you need, then let libvirt start
[14:50]	eblake_	just idle ideas, since this is the second release in a row where we've had to cripple boxes to avoid a live image that has a conflicting subnet installation by default
[14:51]	sgallagh	But that's still not *perfect* if all you really want is a one-time ordering
[14:51]	sgallagh	Yeah, I've been following that bug
[14:52]	sgallagh	What I suggested above still might work, though.
[14:52]	sgallagh	You write a script that sleeps and runs 'systemctl is-active network.target' every second if the stampfile doesn't exist.
[14:53]	sgallagh	Once network.target is active, write the stampfile and exit success.
[14:53]	sgallagh	eblake_: It's a bit hacky, but it could work

If I understand the issue correctly, we want to avoid a dependency on networking in the common case, but require a dependency on networking when picking a non-conflicting subnet.  The livecd problem is that the subnet choice was already baked into the livecd with no dependency on networking, so installing it as a host and then trying to run a livecd as guest both try to use the same subnet. But with the stamp approach, the livecd would exist without the stamp file, and thus every boot of a livecd (at least, a livecd without persistent storage) will wait for networking to come up, and thus pick a safe subnet.  Even if the host side then installs from the livecd and bakes in one choice (because the host side livecd created a stamp file), the guest livecd will still pick a safe subnet. Meanwhile, persistent images won't be waiting for networking to come up, because the installed subnet and stamp file are enough to assume that the pre-selected subnet is safe for that installation.
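
To make the stamp idea concrete, here is a rough sketch of that oneshot helper (the script name, stamp path and unit wiring are all made up for illustration; nothing like this was ever shipped):

  #!/bin/sh
  # hypothetical helper (e.g. /usr/libexec/libvirt-wait-for-net), run from a
  # hypothetical Type=oneshot unit ordered Before=libvirtd.service
  STAMP=/var/lib/libvirt/.subnet-chosen
  [ -e "$STAMP" ] && exit 0     # stamp already exists: no-op on every later boot
  until systemctl is-active --quiet network.target; do
      sleep 1                   # wait for networking so a conflicting subnet can be detected
  done
  touch "$STAMP"                # record that the one-time wait has happened

On an installed system the stamp would survive, so only the very first boot pays the delay; a livecd without persistent storage would wait on every boot, which is exactly the behaviour described above.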

Comment 38 Kamil Páral 2015-04-17 08:58:18 UTC
Not sure if this is helpful or not, but our LiveCDs contain an init script called livesys (and livesys-late), which is executed only during a LiveCD boot, but not during an installed system boot. You can use that script to create one-shot stamp files or make other adjustments to the filesystem, which will *not* be transferred to the installed system. You can also modify systemd services and more. The script part of the official kickstart:

https://git.fedorahosted.org/cgit/spin-kickstarts.git/tree/fedora-live-base.ks?h=f22
(search for 'livesys')
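
For illustration only (this is not in the actual kickstart), a livesys-style snippet to keep the 'default' network from autostarting during live boots could look roughly like:

  # hypothetical addition to the livesys script; it runs only in the live
  # environment, so the change is never carried over to the installed system
  if [ -L /etc/libvirt/qemu/networks/autostart/default.xml ]; then
      rm -f /etc/libvirt/qemu/networks/autostart/default.xml
  fi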

Comment 39 Cole Robinson 2016-04-08 22:07:50 UTC
*** Bug 1321703 has been marked as a duplicate of this bug. ***

Comment 40 Adam Williamson 2016-04-12 18:33:16 UTC
Yeah, this appears to happen in F24 lives too, in fact there's a funny overlap with https://bugzilla.redhat.com/show_bug.cgi?id=1262556 ... when I boot a laptop with only wifi and the virbr0 interface / route appears, that anaconda crash doesn't happen; if the virbr0 interface / route doesn't appear, or I take it down, that crash happens...

Comment 41 Adam Williamson 2016-04-12 18:34:47 UTC
actually, wait, this doesn't really entirely describe that at all, I got my 'weird live network stuff with libvirt' wires crossed. probably best just ignore that comment.

Comment 42 Chris Murphy 2016-05-04 19:59:50 UTC
*sigh* I meant to put this in this bug rather than the closed one.

I think libvirtd needs to be disabled by default unless/until there's a better way to handle it. It's completely annoying to have this broken in VMs out of the box and only fixable with a non-obvious solution. I don't see any advantage to having libvirtd enabled by default, only a downside.

Comment 43 Cole Robinson 2016-05-04 20:54:13 UTC
libvirtd is started by default because certain features like VM autostart are dependent on it.

The issue isn't strictly libvirtd running by default, it's libvirtd autostarting the 'default' network. The only reason the network config via the libvirt-daemon-config-network package is present on a default Fedora install is the gnome-boxes RPM's Requires on it.

gnome-boxes doesn't even _need_ that networking mode, it made do without it for a couple years, but the 'default' config gives a much better networking setup than usermode. Regardless, IMO the RPM Requires is too heavy a hammer. virt-manager offers to install it via PackageKit on first run, but we don't Require it because we don't strictly need it and it causes these types of conflicts.

Maybe now with RPM weak dependencies gnome-boxes can set Suggests: libvirt-daemon-config-network or similar?
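
(In gnome-boxes.spec that would be roughly a one-line change - a sketch only, not a tested patch:

  -Requires:  libvirt-daemon-config-network
  +Suggests:  libvirt-daemon-config-network

with the caveat that a plain Suggests: is not pulled in by default by dnf.)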

Comment 44 Chris Murphy 2016-05-04 21:14:46 UTC
(In reply to Cole Robinson from comment #43)
> libvirtd is started by default because certain features like VM autostart
> are dependent on it

These features are used out of the box with no configuration required? It seems fair that anyone using features that need some configuration will know libvirtd needs enabling.

> The issue isn't strictly libvirtd running by default, it's libvirtd
> autostarting the 'default' network. The only reason the network config via
> the libvirt-daemon-config-network package is present on a default Fedora
> install is due to gnome-boxes rpm Requires on it.

That's ironic.

Comment 45 Adam Williamson 2016-05-04 21:17:27 UTC
That's almost the same as dropping the dep, though, because ultimately the expected workflow is that you install Workstation and run Boxes. If the dep is only a Suggests:, I don't think the dep will be installed after either a live image or an installer deployment of Workstation.

Not that that means it's the wrong thing to do, just pointing it out.

It *would* be possible to tell libvirt not to start up if we're running live *and* we're running inside one of the virt types systemd recognizes, I believe:

|!ConditionVirtualization=1
|!ConditionKernelCommandLine=rd.live

if I'm reading the docs ('man systemd.unit') correctly, that means "start up unless we're *both* virtualized *and* a live environment", using 'rd.live on the cmdline' as a proxy for live environment. Haven't tested this yet.

Comment 46 Cole Robinson 2016-05-04 21:38:12 UTC
Hmm that's an interesting idea. If someone tests it and confirms it does what we want, I can add it to the package.

Comment 47 Cole Robinson 2016-05-04 21:43:33 UTC
Actually, aren't the livecds composed with kickstarts or similar? Can we just add something to the workstation livecd %post config to do 'systemctl disable libvirtd'? Seems simpler than having the libvirt package carry a non-upstream patch for an indeterminate amount of time.

Comment 48 Adam Williamson 2016-05-04 21:48:21 UTC
The problem with that is that it would be transferred to the installed system, which I'm presuming we don't want. It would also apply to bare metal, which I think we also don't want.

Comment 49 Adam Williamson 2016-05-04 21:50:27 UTC
The mechanism we have for making changes specific to the live environment is the two shonky scripts, 'livesys' and 'livesys-late', which only run when booted live. But I don't think it's possible to have one of those turn off another service, particularly a systemd service, and I also believe we can't control ordering of sysv services (which those are) relative to systemd services...

Comment 50 Chris Murphy 2016-05-04 22:09:51 UTC
(In reply to Adam Williamson from comment #45)
> |!ConditionVirtualization=1
> |!ConditionKernelCommandLine=rd.live


I tried ConditionVirtualization=0 and this does fix the problem. 'systemctl status libvirtd' shows:

Condition: start condition failed at blahblah time
           ConditionVirtualization=0 was not met

Comment 51 Cole Robinson 2016-05-04 22:24:33 UTC
I reopened the boxes bug, requesting the dep is temporarily dropped until after GA, which is what we did in the past:

https://bugzilla.redhat.com/show_bug.cgi?id=1164492

Comment 52 Adam Williamson 2016-05-04 22:41:16 UTC
so you don't want to try the systemd thing after all?

Comment 53 Cole Robinson 2016-05-04 23:58:37 UTC
(In reply to Adam Williamson from comment #52)
> so you don't want to try the systemd thing after all?

not really, it sounds simple in theory but I'd rather go with the known quantity (reverting the dep) than introduce something new and untested at this point

Comment 54 Adam Williamson 2016-05-05 00:20:36 UTC
coward!

Comment 55 Paul W. Frields 2016-05-25 14:11:59 UTC
Can this ConditionVirtualization fix make it into Rawhide now, so we can (1) start getting some early testing, and (2) avoid doing the same dance again for F25?

Comment 56 Cole Robinson 2016-06-09 11:29:15 UTC
(In reply to Paul W. Frields from comment #55)
> Can this ConditionVirtualization fix make it into Rawhide now, so we can (1)
> start getting some early testing, and (2) avoid doing the same dance again
> for F25?

yes I'll push it this week or next

Comment 57 Cole Robinson 2016-06-23 20:25:34 UTC
I haven't pushed the unit file workaround yet, haven't found the time for proper testing...

Comment 58 Laine Stump 2016-08-29 19:15:04 UTC
*** Bug 1370906 has been marked as a duplicate of this bug. ***

Comment 59 Paul W. Frields 2016-10-04 18:32:45 UTC
Cole, we're nearing F25 Beta and don't want to get in this soup again if we can avoid it.  Could we get you to look at this unit file change again please?

Comment 60 Cole Robinson 2016-10-05 16:16:54 UTC
FWIW it appears the problematic dep has been disabled in gnome-boxes.spec since May in the f24/f25/rawhide branches. Have there been reports of this issue appearing with F25 live cd composes?

Comment 61 Cole Robinson 2017-03-16 18:14:17 UTC
*** Bug 1432756 has been marked as a duplicate of this bug. ***

Comment 62 Matthew Miller 2017-03-22 14:20:21 UTC
Note duplicate bug #1432756 is confirmation that this is still a problem on F25/F26

Comment 63 Jan Kurik 2017-03-23 16:16:02 UTC
This bug is tracked as Prioritized Bug: https://fedoraproject.org/wiki/Fedora_Program_Management/Prioritized_bugs_and_issues

Comment 64 Joachim Frieben 2017-03-27 12:58:27 UTC
As of current Fedora 26, networking is again working correctly for virtual machines based on the F25-WORK-x86_64-20170318 live media, whether installed (earlier) under Fedora 25 or installed just now under Fedora 26.
It seems that the only relevant update in Fedora 26 since the date when duplicate bug 1432756 was reported is qemu-kvm-2.9.0-0.1.rc1.fc26, compared to qemu-kvm-2.8.0-2.fc26 (back then the different, earlier live media F25-WORK-x86_64-20170228 was also in use).
I am starting to wonder whether the relevant component can safely be assumed to be libvirt.

Comment 65 Jan Kurik 2017-05-25 08:48:52 UTC
I would like to ask for a status update. Is there any progress on this bug ?
This bug is on the list of Prioritized bugs https://fedoraproject.org/wiki/Fedora_Program_Management/Prioritized_bugs_and_issues and a fix is considered as important for Fedora users.

Thanks,
Jan

Comment 66 Cole Robinson 2017-05-25 13:32:08 UTC
(In reply to Jan Kurik from comment #65)
> I would like to ask for a status update. Is there any progress on this bug ?
> This bug is on the list of Prioritized bugs
> https://fedoraproject.org/wiki/Fedora_Program_Management/
> Prioritized_bugs_and_issues and a fix is considered as important for Fedora
> users.

I'll experiment with the comment #45 suggestion today or tomorrow

Comment 67 Joachim Frieben 2017-05-25 16:02:31 UTC
Interestingly, I have encountered the network issue only when -both- of the host and the guest (Fedora) systems got installed using a live media instead of using the network install method. When the host system was installed using the network install method, the network issue did not show up later on.
Uninstalling all libvirt and dependent packages in the virtual guest indeed also restores network functionality because the conflicting virbr0 device is simply absent.

Comment 68 Adam Williamson 2017-05-25 17:26:51 UTC
Joachim: that's well known already; when using the installer the existing code to ensure no conflicts happen works fine. It's just tricky in the specific context of how a live image works.

Comment 69 Fedora Update System 2017-05-30 23:39:02 UTC
libvirt-3.2.1-2.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-1bfce27ae1

Comment 70 Cole Robinson 2017-05-31 18:19:15 UTC
The eventual solution was adding this to the libvirtd.service [Unit] section:

ConditionKernelCommandLine=!rd.live.image

Comment 71 Adam Williamson 2017-05-31 18:48:20 UTC
So basically the thing I suggested a year ago? :)

https://bugzilla.redhat.com/show_bug.cgi?id=1146232#c45

note that I also suggested including an additional test that we're running in virt...

Comment 72 Cole Robinson 2017-05-31 21:11:38 UTC
(In reply to Adam Williamson from comment #71)
> So basically the thing I suggested a year ago? :)
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1146232#c45
> 
> note that I also suggested including an additional test that we're running
> in virt...

Yes, I explicitly referenced your comment in comment #66. I was just commenting here to indicate how the final result was different:

- s/rd.live/rd.live.image/
- corrected ! syntax
- dropped the ConditionVirtualization check

But the last bit was my mistake: I didn't notice the '|' in your comment so I couldn't figure out what you meant and thought you misread, but now that I actually check the docs it makes more sense :) Though the working syntax is different:

ConditionVirtualization=|0
ConditionKernelCommandLine=|!rd.live.image

so, start libvirtd 'if virt=0 OR not in the live environment'
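
For anyone who wants to try this on an already-installed system before the build lands, a drop-in should be enough (a sketch only, not the packaged change; the 'live.conf' name is arbitrary):

  sudo mkdir -p /etc/systemd/system/libvirtd.service.d
  printf '[Unit]\nConditionVirtualization=|0\nConditionKernelCommandLine=|!rd.live.image\n' | \
      sudo tee /etc/systemd/system/libvirtd.service.d/live.conf
  sudo systemctl daemon-reload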

I'll push another update

Comment 73 Fedora Update System 2017-05-31 22:17:56 UTC
libvirt-3.2.1-3.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-a7546a7e55

Comment 74 Fedora Update System 2017-06-01 03:20:03 UTC
libvirt-3.2.1-2.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-1bfce27ae1

Comment 75 Joachim Frieben 2017-06-01 04:03:51 UTC
(In reply to Cole Robinson from comment #70)
I am surprised, because the network problem usually shows up after -installing- a new VM from the live media. The VM is then no longer a live system.
For me, the network actually worked fine during the original live session running as a virtual guest.

Comment 76 Fedora Update System 2017-06-04 05:11:14 UTC
libvirt-3.2.1-3.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-a7546a7e55

Comment 77 Laine Stump 2017-06-07 16:16:38 UTC
(In reply to Joachim Frieben from comment #75)
> I am surprised because the network problem usually shows up after
> -installing- a new VM having used the live media. The VM then is no live
> system any longer.

It's not a straightforward problem. When you "install" Fedora from a live image, the installer doesn't actually install the packages from their rpms (which would end up running the specfile postinstall script that detects what network the system is connected to and accordingly chooses an unused network for libvirt's default network); instead, it copies over everything as it was installed on the live media. This means that the final install on the host will have its libvirt default network set up according to what networks were in use on the system that *created* the live media, not the network on the system that is running the live media.
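
Roughly speaking (a sketch of the idea, not the actual specfile code), that install-time check amounts to something like:

  # sketch: pick the first 192.168.12x.0/24 range not already routed on this host
  # and substitute it into libvirt's default network definition
  for net in 122 123 124 125; do
      ip route | grep -q "192\.168\.$net\." && continue   # range already in use, try the next one
      sed -i "s/192\.168\.122\./192.168.$net./g" /etc/libvirt/qemu/networks/default.xml
      break
  done

A live image never runs anything like this on the hardware it eventually lands on, which is the whole problem.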

Comment 78 Cole Robinson 2017-06-07 16:29:09 UTC
(In reply to Laine Stump from comment #77)
> (In reply to Joachim Frieben from comment #75)
> > I am surprised because the network problem usually shows up after
> > -installing- a new VM having used the live media. The VM then is no live
> > system any longer.
> 
> It's not a straightforward problem. When you "install" Fedora from a live
> image, the installer doesn't actually install the packages from their rpms
> (which would end up running the specfile postinstall script that detects
> what network the system is connected to and accordingly chooses an unused
> network for libvirt's default network); instead, it copies over everything
> as it was installed on the live media. This means that the final install on
> the host will have its libvirt default network set up according to what
> networks were in use on the system that *created* the live media, not the
> network on the system that is running the live media.

yeah and unfortunately the solution I pushed doesn't help this problem at all, I overlooked it. the systemd bit just helps networking work when running the livecd in a VM. however we've now potentially punted the problem to first boot after a VM has been installed from the livecd... not sure how fedora QE feels about that. I don't think if this issue was nominated as a 'nice to fix' particularly for the livecd impact or the post install impact, or both

Comment 79 Cole Robinson 2017-06-07 16:31:39 UTC
(In reply to Cole Robinson from comment #78)
> I don't think if this issue was nominated as a

* I don't know

Comment 80 Fedora Update System 2017-06-09 19:18:22 UTC
libvirt-3.2.1-3.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 81 Kamil Páral 2017-09-11 11:09:30 UTC
Reopening once again and proposing as F27 Final Blocker. This issue is still present when you install F26 from Live and then run F27 as a VM.

Reproducer:
1. Install F26 from Workstation Live.
2. Fully update the system (so that it contains the fix from comment 80), reboot.
3. Install virt-manager, and run F27 Workstation Live in it.
4. Try to ping fedoraproject.org (or anything else) - see a "Name or service not known" error.
5. Try to ping 8.8.8.8 - that works.
6. Compare "ip a" output on host and in VM - see that virbr0 have the same address in both systems  - 192.168.124.1/24.

libvirt-3.2.1-5.fc26
Fedora-Workstation-Live-x86_64-27-20170910.n.0.iso
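
The workaround from comment 4 still applies inside the affected guest - for completeness (nmcli is just one way to bounce NetworkManager; toggling networking in the GNOME UI works too):

  sudo virsh net-destroy default
  sudo virsh net-undefine default          # optional: keeps it from coming back on the next boot
  sudo nmcli networking off && sudo nmcli networking on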

Comment 82 Jan Kurik 2017-09-11 15:01:40 UTC
Removing the proposal for PrioritizedBug. We can set this flag on in case the bug is not accepted as a blocker.

Comment 83 Adam Williamson 2017-09-11 18:55:56 UTC
So, a couple of notes here: the attempted fix via systemd unit conditions never made it to F27+; it was applied to F26 after it branched and never applied to the master or f27 branches.

Also, I agree with Cole's #c78 that it doesn't really constitute a complete fix anyhow, on further thought.

I have re-opened and updated https://bugzilla.redhat.com/show_bug.cgi?id=1164492 to propose that standard workaround (drop the gnome-boxes libvirt dep for GA) as a blocker for F27; this bug is still for the underlying problem, and it shouldn't be closed because we implement the workaround again.

Comment 84 Kamil Páral 2017-09-11 18:58:51 UTC
Discussed during blocker review [1]:

#1146232 - RejectedBlocker (Final), #1164492 - AcceptedBlocker (Final) - we think the problem here constitutes a blocker, but as with previous releases, we will make #1164492 the blocker as we may fix it with a short-term workaround rather than a real fix (again)

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2017-09-11/

Comment 85 Cole Robinson 2017-09-14 21:11:48 UTC
(In reply to Adam Williamson from comment #83)
> So couple of notes here: the attempted fix via systemd unit conditions never
> made it to F27+, it was applied to F26 after it branched and never applied
> to the master or f27 branches.
> 
> Also, I agree with Cole's #c78 that it doesn't really constitute a complete
> fix anyhow, on further thought.
> 

Yes and for that reason I'm not interested in reintroducing it for future fedora versions, it just punts the problem to post install time.

> I have re-opened and updated
> https://bugzilla.redhat.com/show_bug.cgi?id=1164492 to propose that standard
> workaround (drop the gnome-boxes libvirt dep for GA) as a blocker for F27;
> this bug is still for the underlying problem, and it shouldn't be closed
> because we implement the workaround again.

Thanks. Again, on the libvirt side we are back to needing some actual feature work to attempt to fix this.

Comment 86 Joachim Frieben 2017-09-30 07:20:15 UTC
Using a Fedora 26 host installed through the network-install method and running a Fedora 26 virtual guest installed from a recent Workstation live image, the network device virbr0 appears seemingly at random. The virtual machine sometimes has to be restarted several times before connectivity is eventually established.
In the first case (devices ens3, lo, virbr0), the network is not working; in the second case (devices ens3, lo), it is.

Comment 87 Paul W. Frields 2017-10-19 18:03:39 UTC
Now that we're in Final freeze, it looks like we need this workaround again, sigh.

Comment 88 Cole Robinson 2017-10-19 20:56:41 UTC
(In reply to Paul W. Frields from comment #87)
> Now that we're in Final freeze, it looks like we need this workaround again,
> sigh.

The blocker bug is the gnome-boxes bz since it's the only complete workaround we have so far: https://bugzilla.redhat.com/show_bug.cgi?id=1164492

Comment 89 Chris Murphy 2018-03-23 02:47:11 UTC
I'm running into this bug with Fedora-Workstation-Live-x86_64-28-20180321.n.0.iso. After some undisclosed amount of time, I lose the network, and the only way I get it back in the VM is to systemctl stop libvirtd and then restart NetworkManager. It's super annoying having to do this every damn release...

Comment 90 Adam Williamson 2018-03-23 05:51:21 UTC
1164492 again, then.

Comment 91 Ben Cotton 2018-11-27 16:30:35 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30  Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora  'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 27 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 92 Ben Cotton 2018-11-30 20:34:14 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

