Bug 1847070 - VMI cannot be scheduled, qemu-kvm core dump
Summary: VMI cannot be scheduled, qemu-kvm core dump
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 2.4.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 2.4.0
Assignee: sgott
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-06-15 15:09 UTC by Israel Pinto
Modified: 2020-09-16 17:50 UTC (History)

Fixed In Version: Kernel: 4.18.0-193.9.1.el8_2.x86_64, cri-o:1.18.1-13.dev.rhaos4.5.git6d00f64.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-28 19:10:33 UTC
Target Upstream Version:
Embargoed:


Attachments
virt-launcher logs (1.31 MB, application/zip)
2020-06-18 10:00 UTC, Israel Pinto


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5167421 0 None None None 2020-06-18 13:27:59 UTC
Red Hat Product Errata RHSA-2020:3194 0 None None None 2020-07-28 19:10:48 UTC

Internal Links: 1879646

Description Israel Pinto 2020-06-15 15:09:17 UTC
Description of problem:
On a bare-metal (BM) environment, the VMI is stuck in the Scheduled phase and the VM cannot start.
See the error below.

NOTE:
With the kernel fix for RHEL 8.2, kernel-4.18.0-179.el8
(https://bugzilla.redhat.com/show_bug.cgi?id=1786288),
the VM runs.


Version-Release number of selected component (if applicable):
OCP: Red Hat Enterprise Linux CoreOS 45.81.202006051300-0 (Ootpa)  
Kernel: 4.18.0-147.8.1.rt24.101.el8_1.x86_64  
CRI-O: cri-o://1.18.1-5.dev.rhaos4.5.git5e39296.el8

# rpm -qa | grep qemu
ipxe-roms-qemu-20181214-5.git133f4c47.el8.noarch
qemu-kvm-common-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-gluster-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-ssh-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-iscsi-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
libvirt-daemon-driver-qemu-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
qemu-kvm-block-curl-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-rbd-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-core-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-img-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64

# rpm -qa | grep libvirt
libvirt-libs-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-bash-completion-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-daemon-driver-qemu-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-daemon-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-client-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-daemon-driver-storage-core-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Create a VM (see the spec below).
2. Start the VM.

Actual results:
Event from the VMI:
Warning  SyncFailed        7s (x16 over 26s)  virt-handler, cnvqe-10.lab.eng.tlv2.redhat.com  (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-06-15T14:25:10.891292Z qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
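
For anyone triaging similar events, the failing MSR index and value can be pulled out of the message mechanically. The following is only a minimal sketch: the regular expression and the helper name `parse_msr_error` are illustrative assumptions, not part of KubeVirt, libvirt, or qemu.

```python
import re

# Matches the "failed to set MSR <index> to <value>" part of the qemu-kvm message.
MSR_ERROR = re.compile(r"failed to set MSR (0x[0-9a-fA-F]+) to (0x[0-9a-fA-F]+)")


def parse_msr_error(message):
    """Return (msr_index, value) as ints, or None if no MSR error is present.

    Hypothetical helper for illustration only.
    """
    match = MSR_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


# Message text copied from the event above (the literal "\n" is kept escaped).
event = (
    "qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172\\n"
    "qemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: "
    "kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed."
)

index, value = parse_msr_error(event)
print(hex(index), hex(value))
```

Comparing the extracted index across reports is a quick way to check whether two crashes with the same assertion actually involve the same register.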


Expected results:
The VM is running.


 
VM spec:
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  labels:
    app: vm-fedora-cdisk
    kubevirt-vm: vm-fedora-cdisk
  name: vm-fedora-cdisk
spec:
  running: false
  template:
    metadata:
      labels:
        kubevirt-vm: vm-fedora-cdisk
    spec:
      #nodeSelector:
       # kubernetes.io/hostname: f25-h03-000-r730xd.rdu2.scalelab.redhat.com
      domain:
        cpu:
          cores: 1
        devices:
          disks:
            - disk:
                bus: virtio
              name: containerdisk
            - disk:
                bus: virtio
              name: cloudinitdisk
        machine:
          type: q35
        resources:
          requests:
            memory: 2Gi
      terminationGracePeriodSeconds: 0
      volumes:
        - containerDisk:
            image: kubevirt/fedora-cloud-container-disk-demo:v0.30.0
          name: containerdisk
        - cloudInitNoCloud:
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
          name: cloudinitdisk
status: {}

Comment 2 Dr. David Alan Gilbert 2020-06-15 16:10:47 UTC
So if I understand correctly, what we have is:
   qemu-kvm-common-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64    which is 8.2.0
   
   4.18.0-147.8.1.rt24.101.el8_1.x86_64    which is 8.1

so an 8.2 qemu on an 8.1 kernel; I think technically that's 'unsupported'.
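
The mismatch can be read off the `el8.2` / `el8_1` tags in the version strings. As a rough sketch of that check (the regular expression and the function name `rhel_release` are assumptions made for illustration; this is not an official compatibility test):

```python
import re

# RHEL package and kernel strings carry a tag such as "el8.2" or "el8_1".
EL_TAG = re.compile(r"el(\d+)[._](\d+)")


def rhel_release(version_string):
    """Return (major, minor) from an 'elX.Y' / 'elX_Y' tag, or None if absent.

    Hypothetical helper for illustration only.
    """
    match = EL_TAG.search(version_string)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Strings taken from the report.
userspace = rhel_release("qemu-kvm-common-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64")
kernel = rhel_release("4.18.0-147.8.1.rt24.101.el8_1.x86_64")

if userspace is not None and kernel is not None and userspace != kernel:
    print(f"userspace is RHEL {userspace[0]}.{userspace[1]}, "
          f"kernel is RHEL {kernel[0]}.{kernel[1]}")
```

Strings without a minor release in the tag (for example a plain `.el8`) return None here, so they are skipped rather than guessed at.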

Comment 5 Dan Kenigsberg 2020-06-15 18:40:03 UTC
Due to the distributed nature of OpenShift, I believe that we are bound to live with the possibility of an old kernel running a new userspace. For example, OCP may upgrade successfully and upgrade CNV, except for a single node with a choppy network. Yes, that ought to be a transient situation, but everything in life is, and we should cope with that.

Note that for a single customer we are going to support 8.2 userspace on el7 kernel. CNV depends on the strong guarantees of the userspace/kernel ABI.

To Fabian's question: we use 8.2 userland only because we expected an 8.2 kernel by now. OCP 4.5 decided to hold its adoption of 8.2 for a few weeks, but it should release on 8.2.

Comment 8 Dan Kenigsberg 2020-06-16 09:36:50 UTC
> if a host lost network connection, rhcos & cnv wont be upgraded leaving them both with a matching version

If this is the case, we have a much worse bug. A single worker with temporary(?) communication problems must not block a cluster upgrade.

Comment 10 Dan Kenigsberg 2020-06-16 09:58:23 UTC
I don't understand your question, Nelly. I take it as a given that we can temporarily have qemu and kernel from different RHEL batches. Even in these circumstances, qemu coredumping is an indicator of a qemu bug.

Comment 14 Israel Pinto 2020-06-18 10:00:47 UTC
Created attachment 1697936 [details]
virt-launcher logs

Comment 15 Israel Pinto 2020-06-18 10:07:08 UTC
(In reply to Eduardo Habkost from comment #12)
> (In reply to Israel Pinto from comment #11)
> > Update: I set host model on VM spec (with RHEL 8.1 kernel):
> > domain:
> >         cpu:
> >           model: Skylake-Server
> 
> What was the default CPU model before?  I couldn't reproduce it here, but
> maybe it's because my CPU configuration is not similar to the one generated
> by CNV.

Since the VM crashes so fast, I can't get the dumpxml.
I attached virt-launcher logs with libvirt logging in debug mode (using: https://github.com/kubevirt/kubevirt/pull/2571).
Hope it helps.

Comment 16 Dr. David Alan Gilbert 2020-06-18 10:52:51 UTC
From logs-1, that's the same error:

{
  "component": "virt-launcher",
  "kind": "",
  "level": "error",
  "msg": "Starting the VirtualMachineInstance failed.",
  "name": "vm-fedora-cdisk",
  "namespace": "default",
  "pos": "manager.go:1236",
  "reason": "virError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2020-06-18T09:40:44.807589Z qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')",
  "timestamp": "2020-06-18T09:40:45.401593Z",
  "uid": "c6d9c690-e42c-4efc-a90e-52d34b570606"
}

Comment 17 Eduardo Habkost 2020-06-22 22:26:43 UTC
(In reply to Dr. David Alan Gilbert from comment #16)


For reference, below are the log entries for the domain XML and command line for the crash above.

{
  "component": "virt-launcher",
  "level": "info",
  "msg": "conn=0x7f9aa8001560, xml=<domain type=\"kvm\" xmlns:qemu=\"http://libvirt.org/schemas/domain/qemu/1.0\"><name>default_vm-fedora-cdisk</name><memory unit=\"B\">2147483648</memory><os><type arch=\"x86_64\" machine=\"q35\">hvm</type><smbios mode=\"sysinfo\"></smbios></os><sysinfo type=\"smbios\"><system><entry name=\"uuid\">eb525420-672a-526a-a57e-8a49466fecdb</entry><entry name=\"manufacturer\">Red Hat</entry><entry name=\"family\">Red Hat</entry><entry name=\"product\">Container-native virtualization</entry><entry name=\"sku\">2.4.0</entry><entry name=\"version\">2.4.0</entry></system><bios></bios><baseBoard></baseBoard><chassis></chassis></sysinfo><devices><interface type=\"bridge\"><source bridge=\"k6t-eth0\"></source><model type=\"virtio\"></model><alias name=\"ua-default\"></alias></interface><channel type=\"unix\"><target name=\"org.qemu.guest_agent.0\" type=\"virtio\"></target></channel><controller type=\"usb\" index=\"0\" model=\"none\"></controller><controller type=\"virtio-serial\" index=\"0\"></controller><video><model type=\"vga\" heads=\"1\" vram=\"16384\"></model></video><graphics type=\"vnc\"><listen type=\"socket\" socket=\"/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-vnc\"></listen></graphics><memballoon model=\"none\"></memballoon><disk device=\"disk\" type=\"file\"><source file=\"/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2\"></source><target bus=\"virtio\" dev=\"vda\"></target><driver name=\"qemu\" type=\"qcow2\"></driver><alias name=\"ua-containerdisk\"></alias><backingStore type=\"file\"><format type=\"qcow2\"></format><source file=\"/var/run/kubevirt/container-disks/disk_0.img\"></source></backingStore></disk><disk device=\"disk\" type=\"file\"><source file=\"/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vm-fedora-cdisk/noCloud.iso\"></source><target bus=\"virtio\" dev=\"vdb\"></target><driver name=\"qemu\" type=\"raw\"></driver><alias 
name=\"ua-cloudinitdisk\"></alias></disk><serial type=\"unix\"><target port=\"0\"></target><source mode=\"bind\" path=\"/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-serial0\"></source></serial><console type=\"pty\"><target type=\"serial\" port=\"0\"></target></console></devices><metadata><kubevirt xmlns=\"http://kubevirt.io\"><uid>c6d9c690-e42c-4efc-a90e-52d34b570606</uid><graceperiod><deletionGracePeriodSeconds>0</deletionGracePeriodSeconds></graceperiod></kubevirt></metadata><features><acpi></acpi></features><cpu mode=\"host-model\"><topology sockets=\"1\" cores=\"1\" threads=\"1\"></topology></cpu><vcpu placement=\"static\">1</vcpu><iothreads>1</iothreads></domain>",
  "pos": "virDomainDefineXML:6146",
  "subcomponent": "libvirt",
  "thread": "49",
  "timestamp": "2020-06-18T09:40:44.554000Z"
}
{
  "component": "virt-launcher",
  "level": "info",
  "msg": "About to run LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/.local/share XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/.cache XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/.config QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=default_vm-fedora-cdisk,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/master-key.aes -machine pc-q35-rhel8.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu Cascadelake-Server,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,tsx-ctrl=on -m 2048 -overcommit mem-lock=off -smp 1,sockets=1,dies=1,cores=1,threads=1 -object iothread,id=iothread1 -uuid eb525420-672a-526a-a57e-8a49466fecdb -smbios 'type=1,manufacturer=Red Hat,product=Container-native virtualization,version=2.4.0,uuid=eb525420-672a-526a-a57e-8a49466fecdb,sku=2.4.0,family=Red Hat' -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=20,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0 -blockdev '{\"driver\":\"file\",\"filename\":\"/var/run/kubevirt/container-disks/disk_0.img\",\"node-name\":\"libvirt-3-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev 
'{\"node-name\":\"libvirt-3-format\",\"read-only\":true,\"driver\":\"qcow2\",\"file\":\"libvirt-3-storage\",\"backing\":null}' -blockdev '{\"driver\":\"file\",\"filename\":\"/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2\",\"node-name\":\"libvirt-2-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-2-format\",\"read-only\":false,\"driver\":\"qcow2\",\"file\":\"libvirt-2-storage\",\"backing\":\"libvirt-3-format\"}' -device virtio-blk-pci,scsi=off,bus=pci.3,addr=0x0,drive=libvirt-2-format,id=ua-containerdisk,bootindex=1 -blockdev '{\"driver\":\"file\",\"filename\":\"/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vm-fedora-cdisk/noCloud.iso\",\"node-name\":\"libvirt-1-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}' -blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"driver\":\"raw\",\"file\":\"libvirt-1-storage\"}' -device virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=ua-cloudinitdisk -netdev tap,fd=22,id=hostua-default,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostua-default,id=ua-default,mac=52:54:00:38:09:76,bus=pci.1,addr=0x0 -chardev socket,id=charserial0,fd=24,server,nowait -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=25,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -vnc vnc=unix:/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-vnc -device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on",
  "pos": "virCommandRunAsync:2588",
  "subcomponent": "libvirt",
  "thread": "46",
  "timestamp": "2020-06-18T09:40:44.744000Z"
}



Unescaped strings:

conn=0x7f9aa8001560, xml=<domain type="kvm" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0"><name>default_vm-fedora-cdisk</name><memory unit="B">2147483648</memory><os><type arch="x86_64" machine="q35">hvm</type><smbios mode="sysinfo"></smbios></os><sysinfo type="smbios"><system><entry name="uuid">eb525420-672a-526a-a57e-8a49466fecdb</entry><entry name="manufacturer">Red Hat</entry><entry name="family">Red Hat</entry><entry name="product">Container-native virtualization</entry><entry name="sku">2.4.0</entry><entry name="version">2.4.0</entry></system><bios></bios><baseBoard></baseBoard><chassis></chassis></sysinfo><devices><interface type="bridge"><source bridge="k6t-eth0"></source><model type="virtio"></model><alias name="ua-default"></alias></interface><channel type="unix"><target name="org.qemu.guest_agent.0" type="virtio"></target></channel><controller type="usb" index="0" model="none"></controller><controller type="virtio-serial" index="0"></controller><video><model type="vga" heads="1" vram="16384"></model></video><graphics type="vnc"><listen type="socket" socket="/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-vnc"></listen></graphics><memballoon model="none"></memballoon><disk device="disk" type="file"><source file="/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2"></source><target bus="virtio" dev="vda"></target><driver name="qemu" type="qcow2"></driver><alias name="ua-containerdisk"></alias><backingStore type="file"><format type="qcow2"></format><source file="/var/run/kubevirt/container-disks/disk_0.img"></source></backingStore></disk><disk device="disk" type="file"><source file="/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vm-fedora-cdisk/noCloud.iso"></source><target bus="virtio" dev="vdb"></target><driver name="qemu" type="raw"></driver><alias name="ua-cloudinitdisk"></alias></disk><serial type="unix"><target port="0"></target><source mode="bind" 
path="/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-serial0"></source></serial><console type="pty"><target type="serial" port="0"></target></console></devices><metadata><kubevirt xmlns="http://kubevirt.io"><uid>c6d9c690-e42c-4efc-a90e-52d34b570606</uid><graceperiod><deletionGracePeriodSeconds>0</deletionGracePeriodSeconds></graceperiod></kubevirt></metadata><features><acpi></acpi></features><cpu mode="host-model"><topology sockets="1" cores="1" threads="1"></topology></cpu><vcpu placement="static">1</vcpu><iothreads>1</iothreads></domain>
About to run LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/.local/share XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/.cache XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/.config QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=default_vm-fedora-cdisk,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-37-default_vm-fedora-cd/master-key.aes -machine pc-q35-rhel8.2.0,accel=kvm,usb=off,dump-guest-core=off -cpu Cascadelake-Server,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,tsx-ctrl=on -m 2048 -overcommit mem-lock=off -smp 1,sockets=1,dies=1,cores=1,threads=1 -object iothread,id=iothread1 -uuid eb525420-672a-526a-a57e-8a49466fecdb -smbios 'type=1,manufacturer=Red Hat,product=Container-native virtualization,version=2.4.0,uuid=eb525420-672a-526a-a57e-8a49466fecdb,sku=2.4.0,family=Red Hat' -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=20,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 -device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 -device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 -device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 -device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0 -blockdev '{"driver":"file","filename":"/var/run/kubevirt/container-disks/disk_0.img","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}' -blockdev 
'{"node-name":"libvirt-3-format","read-only":true,"driver":"qcow2","file":"libvirt-3-storage","backing":null}' -blockdev '{"driver":"file","filename":"/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":"libvirt-3-format"}' -device virtio-blk-pci,scsi=off,bus=pci.3,addr=0x0,drive=libvirt-2-format,id=ua-containerdisk,bootindex=1 -blockdev '{"driver":"file","filename":"/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vm-fedora-cdisk/noCloud.iso","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"raw","file":"libvirt-1-storage"}' -device virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=libvirt-1-format,id=ua-cloudinitdisk -netdev tap,fd=22,id=hostua-default,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostua-default,id=ua-default,mac=52:54:00:38:09:76,bus=pci.1,addr=0x0 -chardev socket,id=charserial0,fd=24,server,nowait -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,fd=25,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -vnc vnc=unix:/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-vnc -device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on


Formatted XML:

<?xml version="1.0"?>
<domain xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0" type="kvm">
  <name>default_vm-fedora-cdisk</name>
  <memory unit="B">2147483648</memory>
  <os>
    <type arch="x86_64" machine="q35">hvm</type>
    <smbios mode="sysinfo"/>
  </os>
  <sysinfo type="smbios">
    <system>
      <entry name="uuid">eb525420-672a-526a-a57e-8a49466fecdb</entry>
      <entry name="manufacturer">Red Hat</entry>
      <entry name="family">Red Hat</entry>
      <entry name="product">Container-native virtualization</entry>
      <entry name="sku">2.4.0</entry>
      <entry name="version">2.4.0</entry>
    </system>
    <bios/>
    <baseBoard/>
    <chassis/>
  </sysinfo>
  <devices>
    <interface type="bridge">
      <source bridge="k6t-eth0"/>
      <model type="virtio"/>
      <alias name="ua-default"/>
    </interface>
    <channel type="unix">
      <target name="org.qemu.guest_agent.0" type="virtio"/>
    </channel>
    <controller type="usb" index="0" model="none"/>
    <controller type="virtio-serial" index="0"/>
    <video>
      <model type="vga" heads="1" vram="16384"/>
    </video>
    <graphics type="vnc">
      <listen type="socket" socket="/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-vnc"/>
    </graphics>
    <memballoon model="none"/>
    <disk device="disk" type="file">
      <source file="/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2"/>
      <target bus="virtio" dev="vda"/>
      <driver name="qemu" type="qcow2"/>
      <alias name="ua-containerdisk"/>
      <backingStore type="file">
        <format type="qcow2"/>
        <source file="/var/run/kubevirt/container-disks/disk_0.img"/>
      </backingStore>
    </disk>
    <disk device="disk" type="file">
      <source file="/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vm-fedora-cdisk/noCloud.iso"/>
      <target bus="virtio" dev="vdb"/>
      <driver name="qemu" type="raw"/>
      <alias name="ua-cloudinitdisk"/>
    </disk>
    <serial type="unix">
      <target port="0"/>
      <source mode="bind" path="/var/run/kubevirt-private/c6d9c690-e42c-4efc-a90e-52d34b570606/virt-serial0"/>
    </serial>
    <console type="pty">
      <target type="serial" port="0"/>
    </console>
  </devices>
  <metadata>
    <kubevirt xmlns="http://kubevirt.io">
      <uid>c6d9c690-e42c-4efc-a90e-52d34b570606</uid>
      <graceperiod>
        <deletionGracePeriodSeconds>0</deletionGracePeriodSeconds>
      </graceperiod>
    </kubevirt>
  </metadata>
  <features>
    <acpi/>
  </features>
  <cpu mode="host-model">
    <topology sockets="1" cores="1" threads="1"/>
  </cpu>
  <vcpu placement="static">1</vcpu>
  <iothreads>1</iothreads>
</domain>
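
The unescaping and formatting steps above can be reproduced with a short script. This is only a sketch of one way to do it, using the Python standard library; the helper name `pretty_domain_xml` is hypothetical, and the `sample` entry is a shortened stand-in for the real log line, not the actual entry.

```python
import json
from xml.dom import minidom


def pretty_domain_xml(log_line):
    """Decode one JSON log entry and pretty-print the XML found after 'xml='.

    Hypothetical helper for illustration only.
    """
    entry = json.loads(log_line)        # json.loads undoes the \" escaping
    msg = entry["msg"]
    xml_text = msg[msg.index("xml=") + len("xml="):]
    return minidom.parseString(xml_text).toprettyxml(indent="  ")


# Shortened, illustrative log entry in the same shape as the virt-launcher one.
sample = json.dumps({
    "component": "virt-launcher",
    "msg": 'conn=0x7f9aa8001560, xml=<domain type="kvm">'
           "<name>default_vm-fedora-cdisk</name>"
           '<vcpu placement="static">1</vcpu></domain>',
})

print(pretty_domain_xml(sample))
```

Note that a log entry which has been split across lines has to be rejoined into a single JSON document before it will parse.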

Comment 18 sgott 2020-06-24 11:55:45 UTC
We should have a compatible kernel version now. Moving this BZ to ON_QE

Comment 19 Israel Pinto 2020-06-24 15:19:47 UTC
Verified with:
oc get kv -n openshift-cnv -o yaml | grep operatorV
              operatorVersion: v0.30.1

Worker:
Red Hat Enterprise Linux CoreOS 45.82.202006190229-0 (Ootpa)
Kernel: 4.18.0-193.9.1.el8_2.x86_64   
cri-o:1.18.1-13.dev.rhaos4.5.git6d00f64.el8

Virt-launcher pod:
Libvirt:
libvirt-libs-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-bash-completion-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-daemon-driver-qemu-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-daemon-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-client-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
libvirt-daemon-driver-storage-core-6.0.0-17.3.module+el8.2.0+6907+6abdb1b6.x86_64
qemu:
ipxe-roms-qemu-20181214-5.git133f4c47.el8.noarch
qemu-kvm-common-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-gluster-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-ssh-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-iscsi-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-curl-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-block-rbd-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-kvm-core-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
qemu-img-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64



Created a VM with a container disk and a VM with a PVC.
Both are running.

Comment 22 errata-xmlrpc 2020-07-28 19:10:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3194

Comment 23 Andrea Cervesato 2020-09-16 17:27:43 UTC
We still have the same problem on: 4.18.0-193.14.3.el8_2.x86_64.

# oc version
Server Version: 4.5.8
Kubernetes Version: v1.18.3+6c42de8

# OCP Node version and info
Operating System
Linux
OS Image
Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)
Architecture
AMD64
Kernel Version
4.18.0-193.14.3.el8_2.x86_64
Boot ID
91b5f4f6-93c4-44a1-a464-84656fc80a12
Container Runtime
cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
Kubelet Version
v1.18.3+6c42de8
Kube-Proxy Version
v1.18.3+6c42de8



The events cite:
```
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:36.625311Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2020-09-16T17:21:37.420617Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2020-09-16T17:21:38.211343Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:39.013171Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:39.780053Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2020-09-16T17:21:40.612615Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:41.438211Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2020-09-16T17:21:42.026847Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:42.780289Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:43.551149Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
0s          Warning   SyncFailed                     virtualmachineinstance/vm-example                               (combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly closed the monitor: 2020-09-16T17:21:44.312084Z qemu-kvm: error: failed to set MSR 0xe1 to 0x0\nqemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
```

Comment 24 Dr. David Alan Gilbert 2020-09-16 17:35:31 UTC
(In reply to Andrea Cervesato from comment #23)
> We still have the same problem on: 4.18.0-193.14.3.el8_2.x86_64.

No, you don't - it's a different error.

   error: failed to set MSR 0xe1 to 0x0
That's a different MSR.
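
To compare failures across reports, the MSR index can be pulled out of the event text rather than read by eye. The following is a minimal sketch, not part of any tool mentioned in this bug: the function name `msr_failures` is hypothetical, and it assumes the events are available as unwrapped text, so that "failed to set MSR ... to ..." sits on one line.

```python
import re
from collections import Counter

# Matches the qemu-kvm message seen in the events above, e.g.
# "failed to set MSR 0xe1 to 0x0". Assumes the text is not line-wrapped.
MSR_RE = re.compile(r"failed to set MSR (0x[0-9a-fA-F]+) to (0x[0-9a-fA-F]+)")


def msr_failures(text):
    """Count (msr_index, value) pairs found in a blob of event or log text."""
    return Counter((int(msr, 16), int(val, 16)) for msr, val in MSR_RE.findall(text))


if __name__ == "__main__":
    # Sample lines copied from the events in this report.
    sample = (
        "qemu-kvm: error: failed to set MSR 0xe1 to 0x0\n"
        "qemu-kvm: error: failed to set MSR 0xe1 to 0x0\n"
    )
    for (msr, value), count in msr_failures(sample).items():
        print(f"MSR {msr:#x} <- {value:#x}: {count} occurrence(s)")
```

If two reports yield different MSR indices, they are probably best treated as separate problems until shown otherwise.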

Is this running nested on AMD? If so, what are the host kernel, qemu, CPU and configuration?

Dave

> # oc version
> Server Version: 4.5.8
> Kubernetes Version: v1.18.3+6c42de8
> 
> # OCP Node version and info
> Operating System
> Linux
> OS Image
> Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)
> Architecture
> AMD64
> Kernel Version
> 4.18.0-193.14.3.el8_2.x86_64
> Boot ID
> 91b5f4f6-93c4-44a1-a464-84656fc80a12
> Container Runtime
> cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
> Kubelet Version
> v1.18.3+6c42de8
> Kube-Proxy Version
> v1.18.3+6c42de8
> 
> 
> 
> The events cite:
> ```
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:36.625311Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: process exited
> while connecting to monitor: 2020-09-16T17:21:37.420617Z qemu-kvm: error:
> failed to set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: process exited
> while connecting to monitor: 2020-09-16T17:21:38.211343Z qemu-kvm: error:
> failed to set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:39.013171Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:39.780053Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: process exited
> while connecting to monitor: 2020-09-16T17:21:40.612615Z qemu-kvm: error:
> failed to set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:41.438211Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: process exited
> while connecting to monitor: 2020-09-16T17:21:42.026847Z qemu-kvm: error:
> failed to set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:42.780289Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:43.551149Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> 0s          Warning   SyncFailed                    
> virtualmachineinstance/vm-example                               (combined
> from similar events): server error. command SyncVMI failed:
> "LibvirtError(Code=1, Domain=10, Message='internal error: qemu unexpectedly
> closed the monitor: 2020-09-16T17:21:44.312084Z qemu-kvm: error: failed to
> set MSR 0xe1 to 0x0\nqemu-kvm:
> /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs:
> Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.')"
> ```

Comment 25 Andrea Cervesato 2020-09-16 17:40:24 UTC
> Is this running nested on AMD? If so, what are the host kernel, qemu, CPU and configuration?
VMware ESXi, 6.7.0, 15160138
KRPA-U16 Series
AMD EPYC 7502P 32-Core Processor


I've also opened a new Bugzilla for this problem, 1879646, as it seems to be similar but not related.

Comment 26 Dr. David Alan Gilbert 2020-09-16 17:50:42 UTC
OK, please *always* state if you're running nested virtualisation; it changes things a lot. Also watch out for the MSR number, since different MSR numbers in the message normally mean different bugs.
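
One quick way to tell whether a node is itself a guest is to check for the `hypervisor` CPU flag, which x86 Linux lists in `/proc/cpuinfo` when it is running under a hypervisor. A rough sketch, assuming it is run on the node that launches the VMs; the helper name `has_hypervisor_flag` is made up for illustration:

```python
def has_hypervisor_flag(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the "flags" field, and match whole flag names.
        if sep and key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            result = has_hypervisor_flag(f.read())
    except OSError:
        result = None  # /proc/cpuinfo not readable on this system
    print("running under a hypervisor:", result)
```

If this reports True on a node that is expected to run KVM guests, the guests are nested, and that is worth stating in the bug report along with the outer hypervisor and its version.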

