Bug 1843970 - Crash on GCP nested instance when using qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Amnon Ilan
QA Contact: Wei Shi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-06-04 14:09 UTC by Christophe Fergeau
Modified: 2023-09-15 00:32 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-30 12:38:23 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Description Christophe Fergeau 2020-06-04 14:09:57 UTC
Description of problem:
When trying to start a nested VM on a GCP instance, qemu crashes with:
2020-06-03T14:07:04.871770Z qemu-kvm: error: failed to set MSR 0x48b to 0x59ff00000000
qemu-kvm: /builddir/build/BUILD/qemu-2.12.0/target/i386/kvm.c:2119: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.


This happens with qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64
Downgrading to qemu-kvm-2.12.0-88.module+el8.1.0+5708+85d8e057.3.x86_64 gets rid of the crash
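
For anyone triaging the error message, here is a minimal sketch that splits the rejected value into its two 32-bit halves. It assumes MSR 0x48b is IA32_VMX_PROCBASED_CTLS2, as that index is named in the Intel SDM, and that the high half holds the "allowed-1" settings of that VMX capability MSR. The helper names are hypothetical and nothing here comes from the QEMU source.

```python
# MSR index and value copied from the qemu-kvm error message in this report.
MSR_INDEX = 0x48B          # assumed to be IA32_VMX_PROCBASED_CTLS2 (Intel SDM)
MSR_VALUE = 0x59FF00000000


def split_vmx_ctls(value):
    """Return the (low, high) 32-bit halves of a 64-bit MSR value.

    For VMX capability MSRs the low half is assumed to report the
    allowed-0 settings and the high half the allowed-1 settings.
    """
    allowed0 = value & 0xFFFFFFFF
    allowed1 = (value >> 32) & 0xFFFFFFFF
    return allowed0, allowed1


def set_bits(word):
    """List the positions of the bits that are set in a 32-bit word."""
    return [bit for bit in range(32) if word & (1 << bit)]


allowed0, allowed1 = split_vmx_ctls(MSR_VALUE)
print(f"MSR {MSR_INDEX:#x}: allowed-0={allowed0:#010x} allowed-1={allowed1:#010x}")
print("bits set in the allowed-1 half:", set_bits(allowed1))
```

The bit positions can then be looked up against the secondary processor-based VM-execution controls in the SDM to see which features the rejected value was advertising.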

The libvirt VM XML is 

$ sudo virsh dumpxml nested-8x69s-master-0
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>nested-8x69s-master-0</name>
  <uuid>9e9ea8f4-ecdf-4173-b5d7-df239ce87d3c</uuid>
  <memory unit='KiB'>14680064</memory>
  <currentMemory unit='KiB'>14680064</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu mode='host-passthrough' check='none'/>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='volume' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source pool='nested-8x69s' volume='nested-8x69s-master-0'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='piix3-uhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:9e:a9:70'/>
      <source network='nested-8x69s'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='pty'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </rng>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-fw_cfg'/>
    <qemu:arg value='name=opt/com.coreos/config,file=/var/lib/libvirt/openshift-images/nested-8x69s/nested-8x69s-master.ign'/>
  </qemu:commandline>
</domain>
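
The relevant part of this domain XML is probably the `<cpu mode='host-passthrough'>` element, which libvirt turns into `-cpu host` on the qemu-kvm command line. As a rough sketch, the CPU mode of a dumped domain can be checked as follows. The `cpu_mode` helper is hypothetical and the XML is a trimmed copy of the dump above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the domain XML from this report; only the elements
# needed for the check are kept.
DOMAIN_XML = """
<domain type='kvm'>
  <name>nested-8x69s-master-0</name>
  <cpu mode='host-passthrough' check='none'/>
</domain>
""".strip()


def cpu_mode(domain_xml):
    """Return the mode attribute of the <cpu> element, or None if absent."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    return None if cpu is None else cpu.get("mode")


print(cpu_mode(DOMAIN_XML))
```

The same function can be fed the full output of `virsh dumpxml` for any domain.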


$ lscpu 
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              16
On-line CPU(s) list: 0-15
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU @ 2.30GHz
Stepping:            0
CPU MHz:             2300.000
BogoMIPS:            4600.00
Virtualization:      VT-x
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            46080K
NUMA node0 CPU(s):   0-15
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat md_clear arch_capabilities

This is reliably reproducible, so I can gather more information or run more tests if needed.

Comment 1 Qinghua Cheng 2020-06-05 01:42:36 UTC
Hi Christophe,

Which kernel version is running on your host?

Thanks,
Qinghua Cheng

Comment 2 Christophe Fergeau 2020-06-05 13:52:47 UTC
It's 4.18.0-193.1.2.el8_2.x86_64

Comment 3 Qinghua Cheng 2020-06-08 07:47:02 UTC
Hi Christophe,

Does this crash happen on a Linux L1 guest or a Windows L1 guest?

Thanks,
Qinghua Cheng

Comment 4 Christophe Fergeau 2020-06-08 09:21:51 UTC
The kernel version I gave is for the L1 guest, so it's Linux. The L2 guest is also a Linux guest, running RHCOS.

Comment 5 John Ferlan 2020-06-09 14:44:39 UTC
Assigned to Amnon for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

Could be miscategorized in General - perhaps more Machine/CPU related or maybe we need a new sub-component.

Comment 6 Qinghua Cheng 2020-06-11 03:53:02 UTC
I tried to reproduce this bug on our env:

Host:
# uname -r 
4.18.0-193.el8.x86_64
qemu-kvm build: qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64

L1 guest: 
# uname -r 
4.18.0-193.1.2.el8_2.x86_64
qemu-kvm build: qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64

I used the XML file from this bug to boot the guest VM. Both the L1 and L2 guest VMs booted successfully and work well, so the bug does not reproduce in this environment.

Comment 7 Wei Shi 2020-06-11 06:56:30 UTC
Vitaly,
  It seems this is the same bug as we fixed in RHEL-AV (RHBZ#1822682)

Can NOT reproduce on
qemu-kvm-2.12.0-88.module+el8.1.0+4233+bc44be3f.x86_64
qemu-kvm -cpu host ...

Can reproduce on
qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64
qemu-kvm -cpu host ...

Comment 8 Vitaly Kuznetsov 2020-06-11 10:15:31 UTC
(In reply to Wei Shi from comment #7)
> Vitaly,
>   It seems this is the same bug as we fixed in RHEL-AV (RHBZ#1822682)
> 
> Can NOT reproduce on
> qemu-kvm-2.12.0-88.module+el8.1.0+4233+bc44be3f.x86_64
> qemu-kvm -cpu host ...
> 
> Can reproduce on
> qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64
> qemu-kvm -cpu host ...

Yes, most likely it's the same bug. You can check that it's not
reproducible with qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950

Comment 10 Wei Shi 2020-06-15 03:54:32 UTC
Verified nested VM can be launched successfully with qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64 on GCP

*** This bug has been marked as a duplicate of bug 1822682 ***

Comment 11 Christophe Fergeau 2020-06-16 09:45:40 UTC
Aren't qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64 and qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64 in different streams? Is there a bug tracking a backport to qemu-kvm-2.12.0 for the fix which was done in qemu-kvm-4.2.0?
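
To make the version difference between the two builds explicit, here is a small sketch that splits each package string into name, version, release, and architecture. It assumes the usual `name-version-release.arch` layout of RPM package names. The `split_nvra` helper is hypothetical and does not use the rpm library.

```python
def split_nvra(package):
    """Split 'name-version-release.arch' into its four parts.

    Assumes the architecture follows the last '.', and that version and
    release are the last two '-'-separated fields before it.
    """
    nvr, arch = package.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


# Package strings copied from this bug report.
builds = [
    "qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64",
    "qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64",
]

for build in builds:
    name, version, release, arch = split_nvra(build)
    print(f"{name}: upstream version {version}, release {release}, arch {arch}")
```

The two builds share a package name but carry different upstream versions (2.12.0 and 4.2.0), which is why a fix in one does not automatically reach the other.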

Comment 17 Red Hat Bugzilla 2023-09-15 00:32:28 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

