Bug 1633058 - qemu/kvm broken in kernel-4.18.8-200.fc28.x86_64
Summary: qemu/kvm broken in kernel-4.18.8-200.fc28.x86_64
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-26 06:39 UTC by Villy Kruse
Modified: 2018-10-18 14:08 UTC
CC List: 26 users

Fixed In Version: kernel-4.18.14-200.fc28
Clone Of:
Environment:
Last Closed: 2018-10-18 14:08:36 UTC
Type: Bug
Embargoed:


Attachments
This is what I saw when booting fc29 as guest OS. Nothing further happened for half an hour. (238.54 KB, image/jpeg)
2018-09-26 06:39 UTC, Villy Kruse
/var/log/messages from CentOS-4 system (62.01 KB, text/plain)
2018-09-26 06:59 UTC, Villy Kruse
Boot attempt of an RH73 client (Not RHEL, the old one from before fedora). (349.46 KB, image/jpeg)
2018-09-26 13:25 UTC, Villy Kruse
lshw host system. (16.47 KB, text/plain)
2018-09-27 14:03 UTC, Villy Kruse
libvirt/qemu configuration files, gzipped tar file. (2.32 KB, application/x-gzip)
2018-09-27 14:04 UTC, Villy Kruse


Links
System: Linux Kernel
ID: 201373
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2019-04-03 10:22:35 UTC

Description Villy Kruse 2018-09-26 06:39:39 UTC
Created attachment 1487026 [details]
This is what I saw when booting fc29 as guest OS. Nothing further happened for half an hour.

Description of problem:


Version-Release number of selected component (if applicable):

Since kernel-4.18.8-200.fc28.x86_64 I am unable to boot a guest OS in qemu/kvm. I tried guests running various versions of Linux.

How reproducible:

Always. However, I was able to boot a CentOS 4 system, but it reported an IRQ #11 problem in /var/log/messages that left the network interface not working. The suggested fix of passing "acpi=off" did not help.

The problem also occurs in Fedora 29 after the latest update.


Comment 1 Villy Kruse 2018-09-26 06:59:16 UTC
Created attachment 1487034 [details]
/var/log/messages from CentOS-4 system

Comment 2 Laura Abbott 2018-09-26 08:54:17 UTC
Does it boot if you bang on the keyboard? One ongoing problem is blocking on entropy. There are patches in the kernel to work around this but it still requires correct settings in qemu.
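
On a guest that does come up, the entropy question can be checked directly by reading the kernel's entropy estimate. The following is a minimal sketch, assuming a Linux guest that exposes /proc/sys/kernel/random/entropy_avail; the 256-bit threshold and the function names are illustrative choices, not values taken from this report.

```python
from pathlib import Path

# Kernel-provided estimate of available entropy, in bits (Linux only).
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")

# Assumed threshold for "low"; not a value taken from this report.
LOW_ENTROPY_BITS = 256


def parse_entropy(text: str) -> int:
    """Parse the contents of entropy_avail into an integer bit count."""
    return int(text.strip())


def is_entropy_low(bits: int, threshold: int = LOW_ENTROPY_BITS) -> bool:
    """Return True when the estimate is below the chosen threshold."""
    return bits < threshold


if __name__ == "__main__":
    if ENTROPY_PATH.exists():
        bits = parse_entropy(ENTROPY_PATH.read_text())
        state = "low" if is_entropy_low(bits) else "ok"
        print(f"entropy_avail = {bits} bits ({state})")
    else:
        print("entropy_avail not found; is this a Linux system?")
```

If the estimate stays low while the guest appears hung, entropy starvation is a plausible suspect; if it is healthy, the hang likely has another cause.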

Comment 3 Villy Kruse 2018-09-26 13:23:35 UTC
(In reply to Laura Abbott from comment #2)
> Does it boot if you bang on the keyboard? One ongoing problem is blocking on
> entropy. There are patches in the kernel to work around this but it still
> requires correct settings in qemu.

No difference.

It did work perfectly in the previous version, kernel-4.18.7-200.fc28.x86_64.

Comment 4 Villy Kruse 2018-09-26 13:25:05 UTC
Created attachment 1487241 [details]
Boot attempt of an RH73 client (Not RHEL, the old one from before fedora).

Comment 5 Joachim Frieben 2018-09-26 16:53:59 UTC
Booting various Fedora live images (27, 28, 29) in gnome-boxes fails since kernel-4.18.8-200.fc28 running on the virtual host. An installed Fedora 28 guest does not reach the graphical login when the host kernel is kernel-4.18.7-200.fc28 or later. After downgrading the host kernel e.g. to kernel-4.18.5-200.fc28, the virtual Fedora 28 guest boots again as expected. The same issue is observed when running the latest Ubuntu 18.10 live image. Changing the virtual video device of an already installed virtual guest to QXL makes no difference. Likewise, it does not make a difference whether a GNOME on Wayland or a GNOME on X session is run on the virtual guest.
Currently, it seems to be impossible to run a (recent) Fedora virtual guest on a Fedora 28 virtual host when using gnome-boxes. Since kernel-4.18.8-200.fc28 contains important security fixes, using an older kernel is not an option.

Comment 6 Laura Abbott 2018-09-27 04:47:14 UTC
There were several KVM fixes that came into 4.18.9. Can you try that or 4.18.10 which is going to be filed in bodhi?

Comment 7 Villy Kruse 2018-09-27 05:23:51 UTC
(In reply to Laura Abbott from comment #6)
> There were several KVM fixes that came into 4.18.9. Can you try that or
> 4.18.10 which is going to be filed in bodhi?

Already running 4.18.9 for several days and it still has the problem.

4.18.7  works OK
4.18.8  broken
4.18.9  broken
4.18.10 unavailable
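
The results listed above narrow the regression to the step from 4.18.7 to 4.18.8. A small sketch of that reasoning, using only the four results from this comment; the function names are hypothetical, and `None` stands for "not tested".

```python
from typing import Dict, Optional, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '4.18.10' into (4, 18, 10) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(
    results: Dict[str, Optional[bool]]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (last good version, first bad version) among tested versions."""
    tested = sorted(
        (v for v, ok in results.items() if ok is not None), key=version_key
    )
    last_good = None
    for version in tested:
        if results[version]:
            last_good = version
        else:
            return last_good, version
    return last_good, None


# Results as reported in this comment; None means the build was unavailable.
REPORTED = {"4.18.7": True, "4.18.8": False, "4.18.9": False, "4.18.10": None}
print(regression_window(REPORTED))  # ('4.18.7', '4.18.8')
```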

Comment 8 Justin M. Forbes 2018-09-27 13:12:51 UTC
Can you give some more information about your setup? How much memory are you giving the guest, and what video is in use (boxes defaults to spice)? What is the host graphics? Can you run the guest outside of boxes? I am having difficulty reproducing here with 4.18.9 as the host kernel.

Comment 9 Villy Kruse 2018-09-27 14:03:38 UTC
Created attachment 1487784 [details]
lshw host system.

Comment 10 Villy Kruse 2018-09-27 14:04:45 UTC
Created attachment 1487793 [details]
libvirt/qemu configuration files, gzipped tar file.

Comment 11 Villy Kruse 2018-09-27 14:10:30 UTC
(In reply to Justin M. Forbes from comment #8)
> Can you give some more information about your setup. How much memory are you
> giving the guest, what video is in use (boxes defaults to spice). What is
> the host graphics? Can you run the guest outside of boxes? I am having
> difficulty reproducing here with 4.18.9 as the host kernel.

What do you mean "run the guest outside of boxes"?  By the way I use XFCE.

Comment 12 Joachim Frieben 2018-09-27 14:25:46 UTC
No improvement for these kernel packages running on the host:
- kernel-4.18.10-200.fc28
- kernel-4.19.0-0.rc5.git0.1.fc30

Comment 13 Justin M. Forbes 2018-09-27 14:57:52 UTC
(In reply to Villy Kruse from comment #11)

> What do you mean "run the guest outside of boxes"?  By the way I use XFCE.

Sorry, that was a different comment using boxes. So, looking at your config files, it seems that these are all 32bit guests failing? Have you tried a 64bit guest just to see if it is something limited to running 32bit guests?

Comment 14 Villy Kruse 2018-09-27 15:15:19 UTC
(In reply to Justin M. Forbes from comment #13)
> (In reply to Villy Kruse from comment #11)
> 
> > What do you mean "run the guest outside of boxes"?  By the way I use XFCE.
> 
> Sorry, that was a different comment using boxes. So, looking at your config
> files, it seems that these are all 32bit guests failing? Have you tried a
> 64bit guest just to see if it is something limited to running 32bit guests?

No.  Both 32bit and 64bit guests.  The "experiment" is using TianoCore aka edk2 for UEFI boot.  That one doesn't even get as far as the boot prompt.

Comment 15 Justin M. Forbes 2018-09-27 15:37:50 UTC
(In reply to Joachim Frieben from comment #12)
> No improvement for these kernel packages running on the host:
> - kernel-4.18.10-200.fc28
> - kernel-4.19.0-0.rc5.git0.1.fc30

Can you attach your system config as well? I am trying to find some sort of common ground as to what is happening here.

Comment 16 Villy Kruse 2018-09-27 17:46:57 UTC
(In reply to Villy Kruse from comment #14)
> (In reply to Justin M. Forbes from comment #13)
> > (In reply to Villy Kruse from comment #11)
> > 
> > > What do you mean "run the guest outside of boxes"?  By the way I use XFCE.
> > 
> > Sorry, that was a different comment using boxes. So, looking at your config
> > files, it seems that these are all 32bit guests failing? Have you tried a
> > 64bit guest just to see if it is something limited to running 32bit guests?
> 
> No.  Both 32bit and 64bit guests.  The "experiment" is using TianoCore aka
> edk2 for uefi boot.  That one doesn't even come as far as the boot prompt.

Just to double-check, I booted kernel-4.18.7 and created a new virtual machine with xfce-28 64bit.  It fails to boot under 4.18.10, which I have meanwhile found in your build system.  More precisely, it starts booting but gets stuck in the initrd.

Comment 17 Joachim Frieben 2018-09-28 05:45:34 UTC
(In reply to Justin M. Forbes from comment #15)
I am running various Fedora/Mint/Ubuntu live images with the standard configuration provided by gnome-boxes/qemu. The default video device is virtio-vga which excludes the possibility of adding "nomodeset" to the virtual guest's kernel options. Host and guest systems are of type x86_64. There seem to have been bad changes introduced into kernel 4.18.8 compared to 4.18.7 related to KVM.

Comment 18 Justin M. Forbes 2018-09-28 14:27:18 UTC
(In reply to Joachim Frieben from comment #17)
> (In reply to Justin M. Forbes from comment #15)
> I am running various Fedora/Mint/Ubuntu live images with the standard
> configuration provided by gnome-boxes/qemu. The default video device is
> virtio-vga which excludes the possibility of adding "nomodeset" to the
> virtual guest's kernel options. Host and guest systems are of type x86_64.
> There seem to have been bad changes introduced into kernel 4.18.8 compared
> to 4.18.7 related to KVM.

Right, but what CPU model are you using, and what video on the host? I am trying to see if there is a correlation of CPU features that are problematic, as I cannot reproduce on modern i7 CPUs.

Comment 19 Joachim Frieben 2018-09-28 14:54:11 UTC
(In reply to Justin M. Forbes from comment #18)
The host system is a Lenovo ThinkPad T400 with an Intel P8400 Core 2 Duo CPU, 8 GB of system memory, and an AMD HD 3470 video device.

Comment 20 Justin M. Forbes 2018-09-28 17:28:55 UTC
Interesting that they are both Intel Wolfdale CPUs, and fairly old. I am wondering if Intel hasn't updated the microcode for those yet to deal with the new Spectre/Meltdown issues, and the kvm changes for L1TF are causing the issue here.
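
For comparing hosts, the CPU model and loaded microcode revision can be read from /proc/cpuinfo. This is a sketch only, assuming an x86 Linux host where the "model name" and "microcode" fields are present; the function name is hypothetical and the sample values are made up for illustration.

```python
from typing import Dict


def first_cpu_fields(cpuinfo_text: str) -> Dict[str, str]:
    """Return the key/value pairs of the first processor block in cpuinfo."""
    fields: Dict[str, str] = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Made-up sample text in /proc/cpuinfo layout, not data from this report.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "microcode\t: 0x1\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
)

if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        info = first_cpu_fields(handle.read())
    print(info.get("model name"), info.get("microcode"))
```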

Comment 21 Joachim Frieben 2018-09-28 17:49:26 UTC
(In reply to Justin M. Forbes from comment #20)
It is correct that the Linux kernel can use the recent microcode extensions for Core i processors released by Intel in order to mitigate recently discovered vulnerabilities, but as far as I remember these extensions are optional and by no means mandatory. Maybe the relevant changes assume recent CPU features, which would be a bug.
In the output of 'dmesg', a number of virtio devices such as virtio_blk, virtio_console, or virtio_net return error -2 upon initialization.
Furthermore, virtio_gpu produces another error message, namely:

  "[drm: virtio_gpu_driver_load.cold5 [virtio_gpu]] *ERROR* failed to find virt queues"
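
To collect such messages from a guest log in one pass, a simple filter is enough. The sketch below assumes the log is available as plain text lines (for example, saved `dmesg` output); the pattern and function name are illustrative, the first sample line is the one quoted above, and the second is a made-up unrelated line.

```python
import re
from typing import Iterable, List

# Match lines that mention virtio and then report an error or failure.
VIRTIO_ERROR = re.compile(r"virtio.*(?:error|failed)", re.IGNORECASE)


def virtio_errors(lines: Iterable[str]) -> List[str]:
    """Return only the log lines that look like virtio failures."""
    return [line for line in lines if VIRTIO_ERROR.search(line)]


SAMPLE_LOG = [
    # Quoted from this comment.
    "[drm: virtio_gpu_driver_load.cold5 [virtio_gpu]] *ERROR* failed to find virt queues",
    # Made-up line for contrast; not from this report.
    "usb 1-1: new high-speed USB device number 2 using ehci-pci",
]

for hit in virtio_errors(SAMPLE_LOG):
    print(hit)
```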

Comment 22 Erik Fjeldstrom 2018-10-01 21:52:11 UTC
I was experiencing this issue (Windows 2008 guest running on HP Compaq dc5800, qemu/kvm) and came across this Linux mailing list report: "Core2 issue with 4.18.8+ kernel" (https://www.spinics.net/lists/kvm/msg175634.html). Applying the suggested patch, "[v2] KVM: x86: fix L1TF's MMIO GFN calculation" (https://patchwork.kernel.org/patch/10614795/), which is queued for inclusion, solved the problem for me.

Comment 23 Laura Abbott 2018-10-01 23:03:44 UTC
Thanks for the pointer. There's some fuzz when applying the patch, so I'm a little wary of picking it up as is, just in case there's more work needed (I'm not eager to cause more regressions). If someone from the KVM side can confirm it's okay for applying to 4.18.x, I'll bring it in; otherwise we can see if it gets picked up with the next stable batch.

Comment 24 Joachim Frieben 2018-10-04 17:46:33 UTC
(In reply to Laura Abbott from comment #23)
It seems that the fix was not included in 4.18.12, and indeed, kernel-4.18.12-200.fc28 does not show any improvement. A patched Fedora kernel build would be helpful to verify that the patch is working properly.

Comment 25 Villy Kruse 2018-10-05 14:25:13 UTC
(In reply to Joachim Frieben from comment #24)
> (In reply to Laura Abbott from comment #23)
> It seems that the fix was not included in 4.18.12, and indeed,
> kernel-4.18.12-200.fc28 does not show any improvement. A patched Fedora
> kernel build would be helpful to verify that the patch is working properly.

I would say that as well.

Comment 26 pzeppegno 2018-10-10 12:34:49 UTC
(In reply to Villy Kruse from comment #25)
> (In reply to Joachim Frieben from comment #24)
> > (In reply to Laura Abbott from comment #23)
> > It seems that the fix was not included in 4.18.12, and indeed,
> > kernel-4.18.12-200.fc28 does not show any improvement. A patched Fedora
> > kernel build would be helpful to verify that the patch is working properly.
> 
> I would say that as well.

Same for me.

Comment 27 Erik Fjeldstrom 2018-10-10 19:58:50 UTC
The patch appears to have been added to 4.19-rc7 (https://lwn.net/Articles/767784/ and https://lkml.org/lkml/2018/10/5/293), so my guess is that it won't be included in the 4.18.x series.

Comment 28 Laura Abbott 2018-10-11 19:18:52 UTC
The patch is queued for the 4.18.14 stable release.

Comment 29 Villy Kruse 2018-10-11 20:26:27 UTC
Strangely enough, Oracle VirtualBox is not affected by this problem.

Comment 30 Joachim Frieben 2018-10-12 05:30:33 UTC
(In reply to Villy Kruse from comment #29)
I would say it just does not use KVM. Oracle VM VirtualBox exists on various platforms and comes with its own kernel module just like there used to exist a KQEMU kernel module which could be used on systems even without hardware virtualization extensions.

Comment 31 j8takagi 2018-10-16 02:26:31 UTC
I am also unable to boot a guest OS in qemu/kvm since kernel-4.18.8 on an Arch Linux host. When the boot fails, the message below is displayed.

 virtio_blk virtio1: device uses modern interface but not have VIRTIO_F_VERSION_1

With kernel-lts-4.14.72 I am unable to boot as well. Booting works with kernel-4.18.7 or kernel-lts-4.14.69.

Comment 32 j8takagi 2018-10-16 16:34:46 UTC
I can boot a guest OS in qemu/kvm on kernel-4.18.14 and kernel-4.14.76-1-lts. Thank you very much.

Comment 33 Villy Kruse 2018-10-18 09:42:44 UTC
kernel-4.18.14 indeed fixed the problem.  I can boot 32bit and 64bit Linux clients without problems.  By the way, why is the status still "New"?
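
For anyone checking a host against this thread, the reported 4.18.x results can be turned into a quick test. This is a sketch under stated assumptions: 4.18.8 is the first release reported broken and 4.18.14 the first reported fixed; 4.18.13 was not tested in this thread and is assumed affected because the fix was queued for 4.18.14. The function names are hypothetical, and the 4.14 LTS series is not covered because the thread does not give its exact boundaries.

```python
import platform
import re
from typing import Optional, Tuple

FIRST_BAD = (4, 18, 8)     # first release reported broken in this thread
FIRST_FIXED = (4, 18, 14)  # first release reported fixed in this thread


def parse_release(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from e.g. 'kernel-4.18.8-200.fc28.x86_64'."""
    match = re.match(r"(?:kernel-)?(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = (int(group) for group in match.groups())
    return major, minor, patch


def affected_4_18(release: str) -> Optional[bool]:
    """True/False for 4.18.x releases, None for any other series."""
    version = parse_release(release)
    if version[:2] != (4, 18):
        return None
    return FIRST_BAD <= version < FIRST_FIXED


if __name__ == "__main__":
    # platform.release() returns the running kernel's release string.
    print(platform.release(), affected_4_18(platform.release()))
```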

