1191665 – Nested KVM with AMD: L2 (nested guest) fails with "divide error: 0000 [#1] SMP"

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	21
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-02-11 17:19 UTC by Kashyap Chamarthy
Modified:	2015-12-02 17:24 UTC
CC List:	12 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2015-12-02 08:55:06 UTC
Type:	Bug
Embargoed:
Dependent Products:

Description Kashyap Chamarthy 2015-02-11 17:19:19 UTC

Description of problem
----------------------

Booting a minimal Fedora 21 guest with 3.17.4-301.fc21.x86_64 Kernel
fails with 

    [  192.278032] divide error: 0000 [#1] SMP

and a call trace. (More details in 'Actual results' section.)


Versions
--------

On physical host (L0):

    $ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86
    3.18.5-201.fc21.x86_64
    libvirt-daemon-kvm-1.2.12-2.fc22.x86_64
    qemu-system-x86-2.2.0-5.fc22.x86_64

On guest hypervisor (L1):

    $ uname -r; rpm -q qemu-system-x86
    3.17.4-301.fc21.x86_64
    qemu-system-x86-2.1.3-1.fc21.x86_64


How reproducible: Consistently.


Steps to reproduce
------------------

(1) Ensure nested virtualization for AMD is enabled on the physical host
(L0):

    $ cat /sys/module/kvm_amd/parameters/nested
    0
    $ sudo rmmod kvm-amd
    $ sudo sh -c "echo 'options kvm_amd nested=1' >> /etc/modprobe.d/dist.conf"
    $ sudo modprobe kvm-amd
    $ cat /sys/module/kvm_amd/parameters/nested
    1
    $ modinfo kvm_amd | grep -i nested
    parm:           nested:int

(2) Create a minimal Fedora 21 guest (L1, the guest hypervisor):

    $ virt-builder fedora-21 -o f21vm --format qcow2 \
        --update --selinux-relabel --size 30G

(3) Import the disk image into libvirt:

    $ chmod go+rx $HOME
    $ virt-install --name f21vm --ram 3072 \
        --disk path=/home/kashyapc/f21vm,format=qcow2,cache=writeback \
        --nographics --import --os-variant fedora21

(4) Expose virtualization extensions to L1 (guest hypervisor)

    $ virt-xml f21vm --edit \
        --cpu host-passthrough,clearxml=yes

(5) Inside L1, boot a nested KVM guest (L2). Instead of a full-blown
guest, let's use `qemu-sanity-check` with KVM:

    $ qemu-sanity-check --accel=kvm


From a different shell, `ps` shows the resulting QEMU command line, which
confirms that the L2 guest is indeed running on KVM (and not TCG):

  $ ps -ef | grep -i qemu
  root       763   762 35 11:49 ttyS0    00:00:00 qemu-system-x86_64 -nographic -nodefconfig -nodefaults -machine accel=kvm -no-reboot -serial file:/tmp/tmp.rl3naPaCkZ.out -kernel /boot/vmlinuz-3.17.4-301.fc21.x86_64 -initrd /usr/lib64/qemu-sanity-check/initrd -append console=ttyS0 oops=panic panic=-1
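
The checks in the steps above can be scripted. This is only a minimal sketch:
the helper names (`nested_status`, `has_svm_flag`, `uses_kvm_accel`) are made
up for illustration, accepting `Y` as an "on" value is an assumption about
kernels that expose the parameter as a boolean, and only the `accel=kvm` and
`-enable-kvm` spellings of the QEMU option are recognized.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers for the reproduction steps (not part of
# any packaged tool).

# Print "enabled", "disabled", or "unknown" for the kvm_amd 'nested' parameter.
# $1: parameter file (defaults to the sysfs path from step 1).
nested_status() {
    param_file="${1:-/sys/module/kvm_amd/parameters/nested}"
    if [ ! -r "$param_file" ]; then
        # Module not loaded, or not an AMD host.
        echo "unknown"
        return 0
    fi
    case "$(cat "$param_file")" in
        1|Y|y) echo "enabled" ;;   # 'Y' is an assumption for bool-style params
        *)     echo "disabled" ;;
    esac
}

# Succeed if a cpuinfo-style file lists the 'svm' CPU flag.
# $1: cpuinfo file (defaults to /proc/cpuinfo); meant to be run inside L1.
has_svm_flag() {
    grep -qw svm "${1:-/proc/cpuinfo}"
}

# Succeed if a QEMU command line (passed as one string) asks for KVM.
uses_kvm_accel() {
    case "$1" in
        *accel=kvm*|*-enable-kvm*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On L0, `nested_status` with no argument reads the sysfs file from step (1);
inside L1, `has_svm_flag && echo "svm exposed"` should print its message once
step (4) has taken effect.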
  

Actual results
--------------


$ qemu-sanity-check --accel=kvm
[  191.164902] kvm: zapping shadow pages for mmio generation wraparound
[  192.278032] divide error: 0000 [#1] SMP 
[  192.278032] Modules linked in: ip6t_rpfilter ip6t_REJECT xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw kvm_amd kvm ppdev serio_raw virtio_balloon virtio_net virtio_console parport_pc i2c_piix4 parport pvpanic virtio_blk virtio_pci ata_generic virtio_ring virtio pata_acpi
[  192.278032] CPU: 0 PID: 764 Comm: qemu-system-x86 Not tainted 3.17.4-301.fc21.x86_64 #1
[  192.278032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[  192.278032] task: ffff8800ba7e1d70 ti: ffff880036b9c000 task.ti: ffff880036b9c000
[  192.278032] RIP: 0010:[<ffffffffa010384d>]  [<ffffffffa010384d>] svm_handle_external_intr+0xd/0x20 [kvm_amd]
[  192.278032] RSP: 0018:ffff880036b9fd78  EFLAGS: 00000292
[  192.278032] RAX: ffffffffa010f040 RBX: ffff880036ba0000 RCX: 0000000100000000
[  192.278032] RDX: 0000000100000000 RSI: 0000000002446e9a RDI: ffff880036ba0000
[  192.278032] RBP: ffff880036b9fd78 R08: 0000000000000000 R09: 0000000000000000
[  192.278032] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880036b94040
[  192.278032] R13: ffffffffa01033e0 R14: 0000000000000000 R15: 0000000000000000
[  192.278032] FS:  0000000000000000(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
[  192.278032] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  192.278032] CR2: 0000000000000000 CR3: 0000000036b73000 CR4: 00000000000006f0
[  192.278032] Stack:
[  192.278032]  ffff880036b9fe38 ffffffffa008c4df 0000000000000000 0000000000000000
[  192.278032]  ffff8800ba7e1d70 ffff8800ba7e1d70 ffff8800ba7e1d70 ffff880036b9ffd8
[  192.278032]  ffffffffa0103b60 fffffffe7ffbfeff 00000000ccf499b7 ffff880036ba0000
[  192.278032] Call Trace:
[  192.278032]  [<ffffffffa008c4df>] kvm_arch_vcpu_ioctl_run+0x3df/0x1260 [kvm]
[  192.278032]  [<ffffffffa0103b60>] ? svm_vcpu_load+0x80/0x120 [kvm_amd]
[  192.278032]  [<ffffffffa0088884>] ? kvm_arch_vcpu_load+0x54/0x210 [kvm]
[  192.278032]  [<ffffffffa00761ec>] kvm_vcpu_ioctl+0x31c/0x5b0 [kvm]
[  192.278032]  [<ffffffff810cda0b>] ? put_prev_entity+0x5b/0x400
[  192.278032]  [<ffffffff810c861f>] ? set_next_entity+0x5f/0x80
[  192.278032]  [<ffffffff810d0f29>] ? pick_next_task_fair+0x6c9/0x8c0
[  192.278032]  [<ffffffff81220910>] do_vfs_ioctl+0x2d0/0x4b0
[  192.278032]  [<ffffffff81220b71>] SyS_ioctl+0x81/0xa0
[  192.278032]  [<ffffffff81746ae9>] system_call_fastpath+0x16/0x1b
[  192.278032] Code: 66 66 90 55 31 c0 48 89 e5 5d c3 0f 1f 00 66 66 66 66 90 55 b8 01 00 00 00 48 89 e5 5d c3 66 66 66 66 90 55 48 89 e5 fb 66 66 90 <66> 66 90 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 
[  192.278032] RIP  [<ffffffffa010384d>] svm_handle_external_intr+0xd/0x20 [kvm_amd]
[  192.278032]  RSP <ffff880036b9fd78>
[  192.278032] ---[ end trace ff0b9bddef0a0dbf ]---
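
For comparing runs, the faulting symbol can be pulled out of a saved oops
like the one above. A minimal sketch, assuming the bracketed `RIP:` format
seen in this 3.17 trace; the function name `oops_rip_symbol` is hypothetical:

```shell
#!/bin/sh
# Print the symbol+offset from the first "RIP: <seg>:[<addr>]  [<addr>] sym"
# line of a kernel oops. Reads the files given as arguments, or stdin.
# The pattern only matches the bracketed layout shown in this report.
oops_rip_symbol() {
    sed -n 's/.*RIP: [0-9a-f]*:\[<[0-9a-f]*>\] *\[<[0-9a-f]*>\] *\([^ ]*\).*/\1/p' "$@" |
        head -n 1
}
```

For example, `dmesg | oops_rip_symbol` inside L1, or
`oops_rip_symbol saved-console.log` on a captured serial log (the file name
is just a placeholder).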



Expected results
----------------

Nested guest (L2) should boot successfully and the guest hypervisor (L1)
should *not* hang.

Additional info
---------------

Even with TCG (plain QEMU emulation), the L1 guest hangs when you invoke
`qemu-sanity-check` *without* the `--accel=kvm` parameter.

Comment 1 Josh Boyer 2015-02-11 17:43:25 UTC

Fedora is already past the 3.17.y series of kernels.  Does this happen with a 3.18.y kernel on the guest?

Comment 2 Kashyap Chamarthy 2015-02-11 18:08:47 UTC

I tested from Rawhide; this is still reproducible, with a slight variation.

Tested with Kernel (kernel-3.19.0-1.fc22) and QEMU (qemu-2.2.0-5.fc22) from Rawhide
on L0 & L1.

Results:

  (a) I don't notice a stack trace on L2 boot

  (b) L1 (guest hypervisor) still completely hangs - unresponsive.

  (c) On L0, I notice a ton of these messages:

        skip_emulated_instruction: ip 0xffec next 0xffffffff8105e964


Along with a few of these (which I assume are benign):

[. . .]
Feb 11 12:52:54 hp-dl585g2-01.foo.bar.com NetworkManager[593]: <info> (virbr0): link connected
Feb 11 12:52:57 hp-dl585g2-01.foo.bar.com kernel: kvm [1396]: vcpu0 unhandled rdmsr: 0xc0010112
Feb 11 12:52:58 hp-dl585g2-01.foo.bar.com kernel: kvm [1396]: vcpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffff
Feb 11 12:53:07 hp-dl585g2-01.foo.bar.com kernel: kvm [1396]: vcpu0 unhandled rdmsr: 0x3a
Feb 11 12:53:07 hp-dl585g2-01.foo.bar.com kernel: kvm [1396]: vcpu0 unhandled rdmsr: 0xd90
[. . .]


I can get `dmesg`, `dmidecode` , `x86info -a` on L0 and L1 if needed.
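
To put numbers on "a ton" and "a few", the L0 log can be summarized with
something like the following. This is a rough sketch; both function names are
hypothetical, and the patterns only cover the message formats quoted above:

```shell
#!/bin/sh
# Count log lines that mention skip_emulated_instruction.
# Reads the files given as arguments, or stdin.
count_skip_emulated() {
    grep -c 'skip_emulated_instruction' "$@"
}

# List the distinct MSR numbers from "unhandled rdmsr: 0x..." lines.
# LC_ALL=C keeps the sort order the same regardless of locale.
unhandled_msrs() {
    sed -n 's/.*unhandled rdmsr: \(0x[0-9a-f]*\).*/\1/p' "$@" | LC_ALL=C sort -u
}
```

Usage on L0 would be along the lines of `dmesg | count_skip_emulated` and
`journalctl -k | unhandled_msrs`.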

Comment 3 Marcelo Tosatti 2015-02-24 16:40:53 UTC

(In reply to Kashyap Chamarthy from comment #2)

Kashyap, 

Can you execute, in the L0 host, the following command to print the instructions
of the L1 guest at 0xffec:

    virsh qemu-monitor-command guest-name --hmp "x /20i 0xffec"

Also, can you collect a trace with the following tracepoints in the L0 host:

    echo 90000 > /sys/kernel/debug/tracing/buffer_size_kb
    echo "kvm_nested_vmexit kvm_nested_vmrun kvm_exit kvm_entry" > /sys/kernel/debug/tracing/set_event

(Kill the L1 guest as soon as you see a skip_emulated_instruction message
in the L0 host dmesg.)
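
The trace collection requested here can be wrapped in two small helpers. This
is a sketch under assumptions: the function names and the default output file
`kvm-trace.txt` are hypothetical, and the `tracing_on` toggle and the read of
the `trace` file are standard ftrace steps added around the two commands
above, not something spelled out in this comment. Run as root on L0.

```shell
#!/bin/sh
# $1: ftrace directory (defaults to the debugfs path used in this comment).

# Size the buffer, select the four KVM tracepoints, and switch tracing on.
kvm_trace_start() {
    tracing_dir="${1:-/sys/kernel/debug/tracing}"
    echo 90000 > "$tracing_dir/buffer_size_kb" &&
    echo "kvm_nested_vmexit kvm_nested_vmrun kvm_exit kvm_entry" \
        > "$tracing_dir/set_event" &&
    echo 1 > "$tracing_dir/tracing_on"
}

# Switch tracing off and copy the buffer to a file.
# $2: output file (defaults to kvm-trace.txt in the current directory).
kvm_trace_save() {
    tracing_dir="${1:-/sys/kernel/debug/tracing}"
    out_file="${2:-kvm-trace.txt}"
    echo 0 > "$tracing_dir/tracing_on" &&
    cat "$tracing_dir/trace" > "$out_file"
}
```

The intended order is `kvm_trace_start`, reproduce the hang, kill the L1
guest at the first skip_emulated_instruction message, then `kvm_trace_save`.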

Comment 4 Kashyap Chamarthy 2015-02-25 13:06:19 UTC

Hi Marcelo,

The same AMD test machine where I was seeing the above behavior is no longer available unfortunately. 

However, I reserved a new AMD machine and tested with Kernel-4.0-rc1 and it seems to "fix itself": 

  - Booting L2 (`qemu-sanity-check --accel=kvm`) works and does NOT result
    in L1 hanging or becoming unresponsive, i.e. L2 boots successfully.
  - I don't notice any "skip_emulated_instruction" messages in L0's `dmesg`.


Version info on L0
------------------

  $ uname -r; rpm -q qemu-system-x86 libvirt-daemon-kvm
  4.0.0-0.rc1.git0.1.fc23.x86_64
  qemu-system-x86-2.2.0-7.fc23.x86_64
  libvirt-daemon-kvm-1.2.12-2.fc22.x86_64

Version info on L1
------------------

  $ uname -r; rpm -q qemu-system-x86 libvirt-daemon-kvm
  4.0.0-0.rc1.git1.1.fc23.x86_64
  qemu-system-x86-2.2.0-7.fc23.x86_64
  package libvirt-daemon-kvm is not installed



And, the `qemu-monitor-command` you requested for L1:

  $ virsh qemu-monitor-command f21vm --hmp "x /20i 0xffec"
  0x000000000000ffec:  (bad)
  0x000000000000ffed:  ss
  0x000000000000ffee:  (bad)
  0x000000000000ffef:  ss
  0x000000000000fff0:  (bad)
  0x000000000000fff1:  ss
  0x000000000000fff2:  (bad)
  0x000000000000fff3:  ss
  0x000000000000fff4:  (bad)
  0x000000000000fff5:  ss
  0x000000000000fff6:  (bad)
  0x000000000000fff7:  ss
  0x000000000000fff8:  (bad)
  0x000000000000fff9:  ss
  0x000000000000fffa:  (bad)
  0x000000000000fffb:  ss
  0x000000000000fffc:  (bad)
  0x000000000000fffd:  ss
  0x000000000000fffe:  (bad)
  0x000000000000ffff:  ss


And, I don't think the traces would be useful since I don't see any "skip_emulated_instruction" messages.


Next, I'm going to test with the *exact* Kernel and QEMU versions mentioned in comment #2 on L0 & L1: Kernel (kernel-3.19.0-1.fc22) and QEMU (qemu-2.2.0-5.fc22).

Comment 5 Kashyap Chamarthy 2015-02-25 13:42:13 UTC

(In reply to Kashyap Chamarthy from comment #4)

[. . .]

> Next, I'm going to test with *exact* Kernel and QEMU version mentioned in
> comment #2 on L0 & L1: Kernel (kernel-3.19.0-1.fc22) and QEMU (qemu-2.2.0-5.fc22).

Hmm, here too, I can't reproduce the issue I described in comment #2 with the above Kernel and QEMU (exact version is qemu-system-x86-2.2.0-7, but that does not matter in this case).

I'll see if I can get access to the problem machine in some weeks to reproduce this issue.

Comment 6 Fedora Kernel Team 2015-04-28 18:35:23 UTC

*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 21 kernel bugs.

Fedora 21 has now been rebased to 3.19.5-200.fc21.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 22, and are still experiencing this issue, please change the version to Fedora 22.

If you experience different issues, please open a new bug report for those.

Comment 7 Richard W.M. Jones 2015-04-28 21:47:14 UTC

Genuine question: Should we bother with filing kernel bugs at all?

Comment 8 Josh Boyer 2015-04-28 21:51:34 UTC

(In reply to Richard W.M. Jones from comment #7)
> Genuine question: Should we bother with filing kernel bugs at all?

Yes.  Why wouldn't you?  If you want to report directly upstream, that would be great too.

Marcelo is on CC and was responding at one point.  He's in a much better position to actually investigate this bug than anyone else.  Nested KVM isn't a high priority for the Fedora kernel team, but that doesn't mean the bug is not valuable for the KVM people.

Comment 9 Bandan Das 2015-06-18 22:37:41 UTC

If you get a chance, please try the patch at https://lkml.org/lkml/2015/6/11/46 which I believe should fix this.

Comment 10 Fedora End Of Life 2015-11-04 11:24:19 UTC

This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 21 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Fedora End Of Life 2015-12-02 08:55:14 UTC

Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
