Bug 1738741 - L2 guest hit kernel panic when do L1->L1 live migration on PML-enabled intel host
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 8.2
Assignee: Paolo Bonzini
QA Contact: Qinghua Cheng
Docs Contact: Parth Shah
URL:
Whiteboard:
Duplicates: 1076294 1745449
Depends On: 1749495
Blocks: 1558351 1745449 1746622
Reported: 2019-08-08 03:30 UTC by Li Xiaohui
Modified: 2023-10-06 18:28 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 1745449 (view as bug list)
Environment:
Last Closed: 2020-04-28 16:23:27 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
vmcore-dmesg.txt (45.02 KB, text/plain), 2019-08-08 03:30 UTC, Li Xiaohui


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2020:1769 0 None None None 2020-04-28 16:24:36 UTC

Description Li Xiaohui 2019-08-08 03:30:05 UTC
Description of problem:
The L2 guest hits a kernel panic when an L1->L1 local live migration is performed.


Version-Release number of selected component (if applicable):
host info: kernel-4.18.0-128.el8.x86_64 & qemu-img-2.12.0-83.module+el8.1.0+3852+0ba8aef0.x86_64


How reproducible:
3/3


Steps to Reproduce:
1. Ensure nested virtualization and EPT are enabled on the Intel host.
2. Boot a guest (L1) with "-cpu host" on the host.
3. In the L1 guest, boot an L2 guest with "-cpu $Module_name".
4. With the L2 guest running in L1, do an L1->L1 local live migration.
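
The prerequisites in step 1 can be checked from sysfs before booting the guests. This is a minimal sketch, assuming the kvm_intel module is loaded and exposes its parameters under /sys/module/kvm_intel/parameters. The helper name `read_kvm_intel_params` and its `params_dir` argument are made up for illustration and are not part of any tool mentioned in this bug.

```python
from pathlib import Path

# Usual sysfs location of the kvm_intel module parameters (assumes the module is loaded).
KVM_INTEL_PARAMS = Path("/sys/module/kvm_intel/parameters")


def read_kvm_intel_params(names=("nested", "ept", "pml"), params_dir=KVM_INTEL_PARAMS):
    """Return {name: True/False/None}; None means the parameter file could not be read."""
    result = {}
    for name in names:
        path = Path(params_dir) / name
        try:
            raw = path.read_text().strip()
        except OSError:
            result[name] = None
            continue
        # Boolean parameters may be reported as Y/N or 1/0 depending on the kernel.
        result[name] = raw in ("Y", "y", "1")
    return result


if __name__ == "__main__":
    for name, enabled in read_kvm_intel_params().items():
        print(f"kvm_intel.{name} = {enabled}")
```

Run on the host, this prints one line per parameter. The same check inside the L1 guest shows whether L1 can itself host an L2 guest.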


Actual results:
After migration, the L2 guest hits a kernel panic inside L1 (L1 itself still works well). Please see the attached vmcore-dmesg.txt.


Expected results:
Both the L1 and L2 guests work well after migration.


Additional info:

Comment 1 Li Xiaohui 2019-08-08 03:30:31 UTC
Created attachment 1601677 [details]
vmcore-dmesg.txt

Comment 5 Paolo Bonzini 2019-09-23 22:45:34 UTC
xiaohli, can you test with kvm_intel.pml=0 (module parameter)?

Rick, if the test works, it shouldn't be a difficult patch.

Comment 6 Li Xiaohui 2019-09-24 04:51:21 UTC
(In reply to Paolo Bonzini from comment #5)
> xiaohli, can you test with kvm_intel.pml=0 (module parameter)?
>
You're right. With kvm_intel.pml=0 set on the host, L1->L1 migration on a single host (or between two hosts) finishes successfully, L1 and L2 both work well, and no kernel panic happens.

> Rick, if the test works, it shouldn't be a difficult patch.
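
For anyone repeating the test above, the module parameter can be made persistent with a modprobe.d "options" line. The sketch below only renders that line. The helper name `modprobe_options_line` is hypothetical, and which file under /etc/modprobe.d/ the line goes into, and how kvm_intel gets reloaded afterwards, are left to the admin.

```python
def modprobe_options_line(module, **params):
    """Render a modprobe.d 'options' line, e.g. 'options kvm_intel pml=0'."""
    if not params:
        raise ValueError("at least one parameter is required")
    # Sort the parameters so the output is stable across runs.
    rendered = " ".join(f"{key}={value}" for key, value in sorted(params.items()))
    return f"options {module} {rendered}"


if __name__ == "__main__":
    # Printing only; nothing is written to disk and no module is reloaded.
    print(modprobe_options_line("kvm_intel", pml=0))
```

The new value takes effect only after kvm_intel is reloaded, which requires that no guests are running on the host at that time.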

Comment 8 Paolo Bonzini 2019-09-27 12:46:12 UTC
*** Bug 1745449 has been marked as a duplicate of this bug. ***

Comment 9 Rick Barry 2019-10-16 14:20:06 UTC
The patches for this have been included in the KVM rebase for 8.2.

Closing this as duplicate of bug 1749495 and requesting 8.1 z-stream. Vitaly can provide info on the patches needed.

QA can you provide qa_ack that's needed for the z-stream approval?

*** This bug has been marked as a duplicate of bug 1749495 ***

Comment 10 Vitaly Kuznetsov 2019-10-16 15:46:35 UTC
(In reply to Rick Barry from comment #9)

> Vitaly can provide info on the patches needed.
> 

8.2 rebase patches are not merged yet so commit hashes come from my local git:

6685949407c0 selftests: kvm: add test for dirty logging inside nested guests
549f1da01d37 KVM: x86: fix nested guest live migration with PML
a95566678f33 KVM: x86: assign two bits to track SPTE kinds

Comment 11 Karen Noel 2019-10-17 01:50:48 UTC
(In reply to Rick Barry from comment #9)
> The patches for this have been included in the KVM rebase for 8.2.
> 
> Closing this as duplicate of bug 1749495 and requesting 8.1 z-stream. Vitaly
> can provide info on the patches needed.
> 
> QA can you provide qa_ack that's needed for the z-stream approval?
> 
> *** This bug has been marked as a duplicate of bug 1749495 ***

Use "Depends On" and TestOnly instead of closing. This way we can track the fix and testing. Thanks.

Comment 12 Rick Barry 2019-11-13 14:00:45 UTC
Moving this to ON_QA since dependent bug 1749495 is also ON_QA.

Comment 13 Paolo Bonzini 2019-12-13 13:36:12 UTC
*** Bug 1076294 has been marked as a duplicate of this bug. ***

Comment 15 Paolo Bonzini 2019-12-17 12:11:32 UTC
This bug has been fixed now, but actually it was both discovered and fixed during 8.2 development; it was never something supported in 8.1.  So I think it does not need any documentation.

Comment 17 Qinghua Cheng 2020-01-02 08:55:12 UTC
Hi Paolo,

When I tried to verify this bug, I got qemu-kvm core dump on host. 

QEMU 4.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172
qemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2947: kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
./test_linux2_8_2_incoming.sh: line 21: 25581 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name 'l1-rhel8' -machine q35 -nodefaults -device VGA,bus=pcie.0,addr=0x2 -device qemu-xhci,id=usb1,bus=pcie.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/root/rhel820-64-virtio-scsi.qcow2 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,serial=SYSTEM_DISK0,bus=pcie.0,addr=0x4 -device virtio-net-pci,mac=9a:ff:43:39:cd:8b,id=idO4Skpc,netdev=idpmLptO,bus=pcie.0,addr=0x5 -netdev tap,id=idpmLptO,vhost=on -m 8192 -smp 12,maxcpus=12,cores=6,threads=1,sockets=2 -cpu host -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :2 -rtc base=localtime,clock=host,driftfix=slew -boot order=cdn,once=d,menu=off,strict=off -enable-kvm -monitor stdio -qmp tcp:0:6667,server,nowait -incoming tcp:0:5555


My host env: 

kernel: 4.18.0-167.el8.x86_64
qemu-kvm: qemu-kvm-4.2.0-4.module+el8.2.0+5220+e82621dc.x86_64

This issue happens with pml set to either Y or N on my host.

L1 and L2 guests are both RHEL 8.2

Is this a separate issue? If I need to open a new bug to track it, please let me know.

Thanks!
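
As a reading aid for the log above: going by the Intel SDM, MSR 0x48e is IA32_VMX_TRUE_PROCBASED_CTLS, one of the VMX capability MSRs whose low 32 bits report the allowed 0-settings and whose high 32 bits report the allowed 1-settings of the controls. The sketch below only splits the rejected value into those halves, under that assumption. It does not explain why KVM refused it, and `split_vmx_ctls_msr` is a made-up helper name.

```python
def split_vmx_ctls_msr(value):
    """Split a 64-bit VMX control capability MSR value into (allowed0, allowed1)."""
    allowed0 = value & 0xFFFFFFFF          # bits 31:0: a set bit means the control must be 1
    allowed1 = (value >> 32) & 0xFFFFFFFF  # bits 63:32: a set bit means the control may be 1
    return allowed0, allowed1


if __name__ == "__main__":
    # Value taken from the "failed to set MSR 0x48e" message in this comment.
    low, high = split_vmx_ctls_msr(0xFFF9FFFE04006172)
    print(f"allowed 0-settings: {low:#010x}")
    print(f"allowed 1-settings: {high:#010x}")
```

Comparing the two halves with the values the destination host reports for the same MSR would show which control bits differ.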

Comment 18 Vitaly Kuznetsov 2020-01-04 15:50:57 UTC
(In reply to Qinghua Cheng from comment #17)
> Hi Paolo,
> 
> When I tried to verify this bug, I got qemu-kvm core dump on host. 
> 
> QEMU 4.2.0 monitor - type 'help' for more information
> (qemu) qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172
> qemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2947:
> kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

This looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1786288

Comment 20 Vitaly Kuznetsov 2020-01-10 17:34:38 UTC
(In reply to Vitaly Kuznetsov from comment #18)
> (In reply to Qinghua Cheng from comment #17)
> > Hi Paolo,
> > 
> > When I tried to verify this bug, I got qemu-kvm core dump on host. 
> > 
> > QEMU 4.2.0 monitor - type 'help' for more information
> > (qemu) qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172
> > qemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2947:
> > kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
> 
> This looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1786288

Actually no, BZ#1786288 is about 'hv_evmcs' flag and you don't seem to have it on
your command line. Could you please try to verify this bug with an older QEMU
build (4.1) and file a new BZ for the https://bugzilla.redhat.com/show_bug.cgi?id=1738741#c17
issue? Please provide your /proc/cpuinfo.

Comment 21 Qinghua Cheng 2020-01-13 08:33:49 UTC
This bug is verified on 4.18.0-168.el8.x86_64, QEMU emulator version 4.1.0 (qemu-kvm-4.1.0-14.module+el8.2.0+4677+51176c2e). After migration, the L1 and L2 guests work well.

The issue in comment 17 is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1790308.

Comment 23 errata-xmlrpc 2020-04-28 16:23:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1769

