Description of problem:
Clone a Fedora or RHEL disk image from a source PVC, then create a VM with the resulting DV. After the VM is running, accessing it via the console shows the guest kernel repeating the following trace in a loop. (A Cirros image works fine.)

[ 4.407476] Code: 02 00 73 0c 65 48 0f b3 1c 25 d6 08 02 00 eb 05 49 0f ba ee 3f 41 0f 22 de e9 cb f7 ff ff fc 56 48 8b 74 24 08 48 89 7c 24 08 <52> 31 d2 51 31 c9 50 41 50 45 31 c0 41 51 45 31 c9 41 52 45 31 d2
[ 4.407476] RSP: 0000:fffffe0000006000 EFLAGS: 00010086
[ 4.407476] RAX: aaaaaaaaaaaaaaab RBX: 0000000000000000 RCX: 000000000000000c
[ 4.407476] RDX: 0000000000000728 RSI: ffffffff99a00ffa RDI: fffffe0000006050
[ 4.407476] RBP: ffffffff994b6411 R08: ffffffff9996d010 R09: 0000000000000000
[ 4.407476] R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000000
[ 4.407476] R13: 0000000000000003 R14: ffff8b430318be80 R15: 0000000000000000
[ 4.407476] FS: 0000000000000000(0000) GS:ffff8b4303600000(0000) knlGS:0000000000000000
[ 4.407476] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.407476] CR2: fffffe0000005ff8 CR3: 000000000220e001 CR4: 0000000000160ef0
[ 4.407476] Call Trace:
[ 4.407476] PANIC: double fault, error_code: 0x0
[ 4.407476] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.9-301.fc30.x86_64 #1
[ 4.407476] Hardware name: Red Hat Container-native virtualization/RHEL-AV, BIOS 1.13.0-1.module+el8.2.0+5520+4e5817f3 04/01/2014
[ 4.407476] RIP: 0010:error_entry+0xc/0x100

(The trace above repeats in a loop until the console session drops.)

You were disconnected from the console. This has one of the following reasons:
 - another user connected to the console of the target vm
 - network issues
websocket: close 1006 (abnormal closure): unexpected EOF

Version-Release number of selected component (if applicable):
OCP 4.6.0-0.nightly-2020-06-09-081928
CNV 2.5.0-2
CDI 2.5.0-2

How reproducible:
Sometimes

Steps to Reproduce:
1. Create a source DV
2. Clone the disk image from the source DV to a target DV
3. Create a VM with the target DV
4. Run the VM and access it: $ virtctl console <vm>
(A sketch of the DV manifests behind steps 1-2 follows this comment.)

Actual results:
4. See the panic loop in the description.

Expected results:
4. Can log in to the Fedora VM.
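For completeness, a minimal sketch of the DataVolume manifests behind steps 1-2. The DV names, size, namespace, and storage class are taken from the outputs later in this report (cnv-3516-source-dv / cnv-3516-target-dv, 48Gi, hostpath-provisioner, test-hostpath); the apiVersion and the image URL placeholder are assumptions, so adjust them to match the actual test artifacts:

$ cat <<EOF | oc -n test-hostpath apply -f -
# Source DV: imports the Fedora/RHEL cloud image (URL is a placeholder).
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: cnv-3516-source-dv
spec:
  source:
    http:
      url: "<fedora-or-rhel-cloud-image-url>"
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 48Gi
    storageClassName: hostpath-provisioner
---
# Target DV: clone of the source PVC.
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: cnv-3516-target-dv
spec:
  source:
    pvc:
      name: cnv-3516-source-dv
      namespace: test-hostpath
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 48Gi
    storageClassName: hostpath-provisioner
EOF

The VM then references the target DV as a dataVolume volume named dv-disk, as shown in the oc get vm -o yaml output in the additional info below.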
Additional info:

[cloud-user@ocp-psi-executor ~]$ oc -n test-hostpath get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                           NOMINATED NODE   READINESS GATES
virt-launcher-fedora-vm-qklbf   1/1     Running   0          81s   10.131.0.95   verify-25-82nmv-worker-45xm4   <none>           <none>

[cloud-user@ocp-psi-executor ~]$ oc -n test-hostpath logs -f virt-launcher-fedora-vm-qklbf
{"component":"virt-launcher","level":"info","msg":"Collected all requested hook sidecar sockets","pos":"manager.go:68","timestamp":"2020-06-22T07:47:05.662853Z"}
{"component":"virt-launcher","level":"info","msg":"Sorted all collected sidecar sockets per hook point based on their priority and name: map[]","pos":"manager.go:71","timestamp":"2020-06-22T07:47:05.662992Z"}
{"component":"virt-launcher","level":"info","msg":"Connecting to libvirt daemon: qemu:///system","pos":"libvirt.go:356","timestamp":"2020-06-22T07:47:05.683997Z"}
{"component":"virt-launcher","level":"info","msg":"Connecting to libvirt daemon failed: virError(Code=38, Domain=7, Message='Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory')","pos":"libvirt.go:364","timestamp":"2020-06-22T07:47:05.684854Z"}
{"component":"virt-launcher","level":"info","msg":"libvirt version: 6.0.0, package: 17.2.module+el8.2.0+6629+3fc0f2c2 (Red Hat, Inc. \u003chttp://bugzilla.redhat.com/bugzilla\u003e, 2020-05-14-11:03:02, )","subcomponent":"libvirt","thread":"45","timestamp":"2020-06-22T07:47:05.722000Z"}
{"component":"virt-launcher","level":"info","msg":"hostname: fedora-vm","subcomponent":"libvirt","thread":"45","timestamp":"2020-06-22T07:47:05.722000Z"}
{"component":"virt-launcher","level":"error","msg":"internal error: Child process (/usr/sbin/dmidecode -q -t 0,1,2,3,4,17) unexpected exit status 1: /dev/mem: No such file or directory","pos":"virCommandWait:2713","subcomponent":"libvirt","thread":"45","timestamp":"2020-06-22T07:47:05.722000Z"}
{"component":"virt-launcher","level":"info","msg":"Connected to libvirt daemon","pos":"libvirt.go:372","timestamp":"2020-06-22T07:47:06.186985Z"}
{"component":"virt-launcher","level":"info","msg":"Registered libvirt event notify callback","pos":"client.go:372","timestamp":"2020-06-22T07:47:06.190826Z"}
{"component":"virt-launcher","level":"info","msg":"Marked as ready","pos":"virt-launcher.go:72","timestamp":"2020-06-22T07:47:06.191089Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Executing PreStartHook on VMI pod environment","name":"fedora-vm","namespace":"test-hostpath","pos":"manager.go:922","timestamp":"2020-06-22T07:47:09.561643Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Starting PreCloudInitIso hook","name":"fedora-vm","namespace":"test-hostpath","pos":"manager.go:941","timestamp":"2020-06-22T07:47:09.561725Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","kind":"","level":"info","msg":"bridge pod interface: VIF: { Name: eth0, IP: 10.131.0.95, Mask: fffffe00, MAC: 0a:58:0a:83:00:5f, Gateway: 10.131.0.1, MTU: 1400, IPAMDisabled: false} \u0026{vmi:0xc000178800 vif:0xc001ae6fc0 iface:0xc001ab0460 virtIface:0xc0002b6f00 podNicLink:\u003cnil\u003e domain:0xc000343400 podInterfaceName:eth0 bridgeInterfaceName:k6t-eth0}","name":"fedora-vm","namespace":"test-hostpath","pos":"podinterface.go:317","timestamp":"2020-06-22T07:47:09.562377Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","level":"info","msg":"Found nameservers in /etc/resolv.conf: \ufffd\u001e\u0000\n","pos":"converter.go:1548","timestamp":"2020-06-22T07:47:09.562520Z"}
{"component":"virt-launcher","level":"info","msg":"Found search domains in /etc/resolv.conf: test-hostpath.svc.cluster.local svc.cluster.local cluster.local verify-25.cnv-qe.rhcloud.com","pos":"converter.go:1549","timestamp":"2020-06-22T07:47:09.562560Z"}
{"component":"virt-launcher","level":"info","msg":"Driver cache mode for /var/run/kubevirt-private/vmi-disks/dv-disk/disk.img set to none","pos":"converter.go:184","timestamp":"2020-06-22T07:47:09.562709Z"}
{"component":"virt-launcher","level":"info","msg":"Starting SingleClientDHCPServer","pos":"dhcp.go:64","timestamp":"2020-06-22T07:47:09.563315Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Domain defined.","name":"fedora-vm","namespace":"test-hostpath","pos":"manager.go:1200","timestamp":"2020-06-22T07:47:09.994327Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 0 with reason 0 received","pos":"client.go:337","timestamp":"2020-06-22T07:47:09.994797Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d vm-ctx=+107:+107 img-ctx=+107:+107 model=dac: Permission denied","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:09.999000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=net reason=open vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d net=0a:58:0a:83:00:5f path=\"/dev/net/tun\" rdev=0A:C8: Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.020000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=net reason=open vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d net=0a:58:0a:83:00:5f path=\"/dev/vhost-net\" rdev=0A:EE: Permission denied","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.020000Z"}
{"component":"virt-launcher","level":"error","msg":"At least one cgroup controller is required: No such device or address","pos":"virCgroupDetect:417","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.031000Z"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Shutoff(5):Unknown(0)","pos":"client.go:231","timestamp":"2020-06-22T07:47:10.046121Z"}
{"component":"virt-launcher","level":"info","msg":"7) OSINFO IN EVENT CALLBACK: \u003cnil\u003e","pos":"client.go:253","timestamp":"2020-06-22T07:47:10.050762Z"}
{"component":"virt-launcher","level":"info","msg":"Detected domain with UUID bb0bebea-15a4-55d1-8a61-c5e10c294b1d","pos":"virt-launcher.go:213","timestamp":"2020-06-22T07:47:10.050880Z"}
{"component":"virt-launcher","level":"info","msg":"Monitoring loop: rate 1s start timeout 5m0s","pos":"monitor.go:164","timestamp":"2020-06-22T07:47:10.050916Z"}
{"component":"virt-launcher","level":"info","msg":"Domain name event: test-hostpath_fedora-vm","pos":"client.go:312","timestamp":"2020-06-22T07:47:10.051751Z"}
{"component":"virt-launcher","level":"info","msg":"GuestAgentLifecycle event state 2 with reason 1 received","pos":"client.go:355","timestamp":"2020-06-22T07:47:10.113333Z"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Paused(3):StartingUp(11)","pos":"client.go:231","timestamp":"2020-06-22T07:47:10.118186Z"}
{"component":"virt-launcher","level":"info","msg":"7) OSINFO IN EVENT CALLBACK: \u003cnil\u003e","pos":"client.go:253","timestamp":"2020-06-22T07:47:10.118237Z"}
{"component":"virt-launcher","level":"info","msg":"Domain name event: test-hostpath_fedora-vm","pos":"client.go:312","timestamp":"2020-06-22T07:47:10.120100Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=disk reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-disk=\"?\" new-disk=\"/var/run/kubevirt-private/vmi-disks/dv-disk/disk.img\": Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.563000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=net reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-net=\"?\" new-net=\"0a:58:0a:83:00:5f\": Permission denied\ufffdC\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.563000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=chardev reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-chardev=\"?\" new-chardev=\"/var/run/kubevirt-private/f71bc953-2dcd-400a-bb09-f06dc0d4d14f/virt-serial0\": Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.563000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=chardev reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-chardev=\"?\" new-chardev=\"/var/lib/libvirt/qemu/channel/target/domain-1-test-hostpath_fedora/org.qemu.guest_agent.0\": Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.563000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=rng reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-rng=\"?\" new-rng=\"/dev/urandom\": Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.563000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=mem reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-mem=0 new-mem=63488: Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.563000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=vcpu reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-vcpu=0 new-vcpu=1: Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.564000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm resrc=iothread reason=start vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d old-iothread=0 new-iothread=1: Permission denied\ufffd\ufffd\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.564000Z"}
{"component":"virt-launcher","level":"warning","msg":"Failed to send audit message virt=kvm op=start reason=booted vm=\"test-hostpath_fedora-vm\" uuid=bb0bebea-15a4-55d1-8a61-c5e10c294b1d vm-pid=74: Permission denied\ufffd@\ufffd\u0026","pos":"virAuditSend:141","subcomponent":"libvirt","thread":"33","timestamp":"2020-06-22T07:47:10.564000Z"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 4 with reason 0 received","pos":"client.go:337","timestamp":"2020-06-22T07:47:10.564458Z"}
{"component":"virt-launcher","level":"info","msg":"DomainLifecycle event 2 with reason 0 received","pos":"client.go:337","timestamp":"2020-06-22T07:47:10.564651Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Domain started.","name":"fedora-vm","namespace":"test-hostpath","pos":"manager.go:1235","timestamp":"2020-06-22T07:47:10.565476Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"fedora-vm","namespace":"test-hostpath","pos":"server.go:167","timestamp":"2020-06-22T07:47:10.567140Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Running(1):Unknown(1)","pos":"client.go:231","timestamp":"2020-06-22T07:47:10.568420Z"}
{"component":"virt-launcher","level":"info","msg":"7) OSINFO IN EVENT CALLBACK: \u003cnil\u003e","pos":"client.go:253","timestamp":"2020-06-22T07:47:10.568479Z"}
{"component":"virt-launcher","level":"info","msg":"Domain name event: test-hostpath_fedora-vm","pos":"client.go:312","timestamp":"2020-06-22T07:47:10.571432Z"}
{"component":"virt-launcher","level":"info","msg":"kubevirt domain status: Running(1):Unknown(1)","pos":"client.go:231","timestamp":"2020-06-22T07:47:10.573070Z"}
{"component":"virt-launcher","level":"info","msg":"7) OSINFO IN EVENT CALLBACK: \u003cnil\u003e","pos":"client.go:253","timestamp":"2020-06-22T07:47:10.573109Z"}
{"component":"virt-launcher","level":"info","msg":"Domain name event: test-hostpath_fedora-vm","pos":"client.go:312","timestamp":"2020-06-22T07:47:10.574422Z"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"fedora-vm","namespace":"test-hostpath","pos":"server.go:167","timestamp":"2020-06-22T07:47:10.661584Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","kind":"","level":"info","msg":"Synced vmi","name":"fedora-vm","namespace":"test-hostpath","pos":"server.go:167","timestamp":"2020-06-22T07:47:10.703675Z","uid":"f71bc953-2dcd-400a-bb09-f06dc0d4d14f"}
{"component":"virt-launcher","level":"info","msg":"Found PID for bb0bebea-15a4-55d1-8a61-c5e10c294b1d: 74","pos":"monitor.go:133","timestamp":"2020-06-22T07:47:11.053726Z"}

On the node where the disk image lives:

sh-4.4# ls -lsh
total 892M
892M -rw-r--r--. 1 qemu qemu 10G Jun 22 07:47 disk.img
[cloud-user@ocp-psi-executor ~]$ oc -n test-hostpath get vm
NAME        AGE   RUNNING   VOLUME
fedora-vm   12s   true

[cloud-user@ocp-psi-executor ~]$ oc -n test-hostpath get vm -o yaml
apiVersion: v1
items:
- apiVersion: kubevirt.io/v1alpha3
  kind: VirtualMachine
  metadata:
    annotations:
      kubevirt.io/latest-observed-api-version: v1alpha3
      kubevirt.io/storage-observed-api-version: v1alpha3
    creationTimestamp: "2020-06-22T08:00:14Z"
    finalizers:
    - k8s.v1.cni.cncf.io/kubeMacPool
    generation: 4
    managedFields:
    - apiVersion: kubevirt.io/v1alpha3
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:template:
            .: {}
            f:metadata:
              .: {}
              f:labels:
                .: {}
                f:kubevirt.io/domain: {}
                f:kubevirt.io/vm: {}
            f:spec:
              .: {}
              f:domain:
                .: {}
                f:devices:
                  .: {}
                  f:disks: {}
                  f:rng: {}
                f:resources:
                  .: {}
                  f:requests:
                    .: {}
                    f:memory: {}
              f:nodeSelector:
                .: {}
                f:kubernetes.io/hostname: {}
              f:volumes: {}
      manager: Swagger-Codegen
      operation: Update
      time: "2020-06-22T08:00:14Z"
    - apiVersion: kubevirt.io/v1alpha3
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:running: {}
      manager: virt-api
      operation: Update
      time: "2020-06-22T08:00:14Z"
    - apiVersion: kubevirt.io/v1alpha3
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubevirt.io/latest-observed-api-version: {}
            f:kubevirt.io/storage-observed-api-version: {}
        f:status:
          f:conditions: {}
          f:created: {}
          f:ready: {}
      manager: virt-controller
      operation: Update
      time: "2020-06-22T08:00:24Z"
    name: fedora-vm
    namespace: test-hostpath
    resourceVersion: "9427870"
    selfLink: /apis/kubevirt.io/v1alpha3/namespaces/test-hostpath/virtualmachines/fedora-vm
    uid: 9c1930cc-ee6f-4712-b74f-184c592edecc
  spec:
    running: true
    template:
      metadata:
        creationTimestamp: null
        labels:
          kubevirt.io/domain: fedora-vm
          kubevirt.io/vm: fedora-vm
      spec:
        domain:
          devices:
            disks:
            - disk:
                bus: virtio
              name: dv-disk
            rng: {}
          machine:
            type: pc-q35-rhel8.2.0
          resources:
            requests:
              memory: 64M
        nodeSelector:
          kubernetes.io/hostname: verify-25-82nmv-worker-45xm4
        volumes:
        - dataVolume:
            name: cnv-3516-target-dv
          name: dv-disk
  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2020-06-22T08:00:22Z"
      status: "True"
      type: Ready
    created: true
    ready: true
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

[cloud-user@ocp-psi-executor ~]$ oc -n test-hostpath get vmi -o yaml
apiVersion: v1
items:
- apiVersion: kubevirt.io/v1alpha3
  kind: VirtualMachineInstance
  metadata:
    annotations:
      kubevirt.io/latest-observed-api-version: v1alpha3
      kubevirt.io/storage-observed-api-version: v1alpha3
    creationTimestamp: "2020-06-22T08:00:14Z"
    finalizers:
    - foregroundDeleteVirtualMachine
    generateName: fedora-vm
    generation: 8
    labels:
      kubevirt.io/domain: fedora-vm
      kubevirt.io/nodeName: verify-25-82nmv-worker-45xm4
      kubevirt.io/vm: fedora-vm
    managedFields:
    - apiVersion: kubevirt.io/v1alpha3
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubevirt.io/latest-observed-api-version: {}
            f:kubevirt.io/storage-observed-api-version: {}
          f:generateName: {}
          f:labels:
            .: {}
            f:kubevirt.io/domain: {}
            f:kubevirt.io/nodeName: {}
            f:kubevirt.io/vm: {}
          f:ownerReferences: {}
        f:spec:
          .: {}
          f:domain:
            .: {}
            f:devices:
              .: {}
              f:disks: {}
              f:rng: {}
            f:firmware:
              .: {}
              f:uuid: {}
            f:machine:
              .: {}
              f:type: {}
            f:resources:
              .: {}
              f:requests:
                .: {}
                f:memory: {}
          f:nodeSelector:
            .: {}
            f:kubernetes.io/hostname: {}
          f:volumes: {}
        f:status:
          .: {}
          f:activePods:
            .: {}
            f:00d60d74-0a69-4f91-9d61-cc336781132e: {}
          f:conditions: {}
          f:guestOSInfo: {}
          f:nodeName: {}
          f:qosClass: {}
      manager: virt-controller
      operation: Update
      time: "2020-06-22T08:00:24Z"
    - apiVersion: kubevirt.io/v1alpha3
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:interfaces: {}
          f:migrationMethod: {}
          f:phase: {}
      manager: virt-handler
      operation: Update
      time: "2020-06-22T08:00:24Z"
    name: fedora-vm
    namespace: test-hostpath
    ownerReferences:
    - apiVersion: kubevirt.io/v1alpha3
      blockOwnerDeletion: true
      controller: true
      kind: VirtualMachine
      name: fedora-vm
      uid: 9c1930cc-ee6f-4712-b74f-184c592edecc
    resourceVersion: "9427869"
    selfLink: /apis/kubevirt.io/v1alpha3/namespaces/test-hostpath/virtualmachineinstances/fedora-vm
    uid: a0595468-9f3b-4476-95a0-cac47f4ebace
  spec:
    domain:
      devices:
        disks:
        - disk:
            bus: virtio
          name: dv-disk
        interfaces:
        - bridge: {}
          name: default
        rng: {}
      features:
        acpi:
          enabled: true
      firmware:
        uuid: bb0bebea-15a4-55d1-8a61-c5e10c294b1d
      machine:
        type: pc-q35-rhel8.2.0
      resources:
        requests:
          cpu: 100m
          memory: 64M
    networks:
    - name: default
      pod: {}
    nodeSelector:
      kubernetes.io/hostname: verify-25-82nmv-worker-45xm4
    volumes:
    - dataVolume:
        name: cnv-3516-target-dv
      name: dv-disk
  status:
    activePods:
      00d60d74-0a69-4f91-9d61-cc336781132e: verify-25-82nmv-worker-45xm4
    conditions:
    - lastProbeTime: null
      lastTransitionTime: null
      message: cannot migrate VMI with non-shared PVCs
      reason: DisksNotLiveMigratable
      status: "False"
      type: LiveMigratable
    - lastProbeTime: null
      lastTransitionTime: null
      message: cannot migrate VMI which does not use masquerade to connect to the pod network
      reason: InterfaceNotLiveMigratable
      status: "False"
      type: LiveMigratable
    - lastProbeTime: null
      lastTransitionTime: "2020-06-22T08:00:22Z"
      status: "True"
      type: Ready
    guestOSInfo: {}
    interfaces:
    - ipAddress: 10.131.0.99
      mac: 0a:58:0a:83:00:63
      name: default
    migrationMethod: BlockMigration
    nodeName: verify-25-82nmv-worker-45xm4
    phase: Running
    qosClass: Burstable
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Can you please confirm that you were using an 8.2 RHCOS?
Adam, since DVs were observed as a possible root cause, I'm wondering if your team could help with an initial investigation/triage. The storage team might be able to provide valuable insights.
(In reply to Fabian Deutsch from comment #1)
> Can you please confirm that you were using an 8.2 RHCOS?

It is a cluster with OCP 4.6.0-0.nightly-2020-06-09-081928 and RHCOS 4.5 with el8.2 packages:

[core@verify-25-82nmv-master-1 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="4.5"
VERSION_ID="4.5"
OPENSHIFT_VERSION="4.5"
RHEL_VERSION="8.2"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 4.5 (Ootpa)"
ID="rhcos"
ID_LIKE="rhel fedora"
ANSI_COLOR="0;31"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.5"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.5"
OSTREE_VERSION='46.81.202006031446-0'

It can also be reproduced on a cluster with OCP 4.5.0-rc.2, RHCOS 4.5, el8.2:

[core@storage-mw8pp-master-0 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="45.82.202006190229-0"
VERSION_ID="4.5"
OPENSHIFT_VERSION="4.5"
RHEL_VERSION="8.2"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 45.82.202006190229-0 (Ootpa)"
ID="rhcos"
ID_LIKE="rhel fedora"
ANSI_COLOR="0;31"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="OpenShift Container Platform"
REDHAT_BUGZILLA_PRODUCT_VERSION="4.5"
REDHAT_SUPPORT_PRODUCT="OpenShift Container Platform"
REDHAT_SUPPORT_PRODUCT_VERSION="4.5"
OSTREE_VERSION='45.82.202006190229-0'
This could be related to bug 1845901. We are investigating some filesystem corruption issues where cloning commences before the data on the source PVC has settled.
Qixuan, assuming that Adam's assessment is correct, this would essentially mean that the DV had not yet successfully quiesced when the VM was started, and the observed panic is essentially filesystem corruption. Would you be able to repeat the steps to reproduce, but ensure that the clone process has really completed before starting the VM?
Actually, my instructions in comment #5 are slightly incorrect. It's important to ensure that the source disk has been fully populated before starting the clone operation. This will help us determine whether it's a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1845901. Also, to make sure you have a correct environment, can you please repeat these steps on a released version of OCP 4.5.x instead of a nightly snapshot?
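If it helps, here is a sketch of one way to enforce that ordering in the automation, using the DV names from this report and the .status.phase field that shows up in the PHASE column of oc get dv (a simple polling loop; other mechanisms would work too):

$ # Wait for the source DV import to finish before creating the clone DV...
$ until [ "$(oc -n test-hostpath get dv cnv-3516-source-dv -o jsonpath='{.status.phase}')" = "Succeeded" ]; do sleep 10; done
$ # ...and for the target (clone) DV to finish before starting the VM.
$ until [ "$(oc -n test-hostpath get dv cnv-3516-target-dv -o jsonpath='{.status.phase}')" = "Succeeded" ]; do sleep 10; done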
Can you run an md5sum on the source disk.img and the target disk.img? Does the source boot?
Tested with OCP 4.5.0-rc.4, CNV v2.4.0-52 and CDI v2.4.0-25.

I can NOT reproduce the bug when I perform all the steps manually. The content, size, and user/group of the source image and the target (cloned) image are the same, and the VM boots with either the source DV or the target DV and can be accessed normally.

Source DV:
[cloud-user@ocp-psi-executor qwang]$ oc debug node/gouyang0701-7ppwh-worker-45nvt -t
Starting pod/gouyang0701-7ppwh-worker-45nvt-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.3.67
If you don't see a command prompt, try pressing enter.
sh-4.2# ls -lsh /host/var/hpvolumes/pvc-d0612139-757e-4d09-9c97-66295d6f451f/disk.img
892M -rw-r--r--. 1 root root 10G Jul 2 14:47 /host/var/hpvolumes/pvc-d0612139-757e-4d09-9c97-66295d6f451f/disk.img
sh-4.2# md5sum /host/var/hpvolumes/pvc-d0612139-757e-4d09-9c97-66295d6f451f/disk.img
995bc410a630f08a8b525ab7b685f783 /host/var/hpvolumes/pvc-d0612139-757e-4d09-9c97-66295d6f451f/disk.img

Target DV:
[cloud-user@ocp-psi-executor qwang]$ oc debug node/gouyang0701-7ppwh-worker-7dp8c -t
Starting pod/gouyang0701-7ppwh-worker-7dp8c-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.1.69
If you don't see a command prompt, try pressing enter.
sh-4.2# ls -lsh /host/var/hpvolumes/pvc-16dedc5d-7f26-442e-abfa-cc9038cf2fac/disk.img
892M -rw-r--r--. 1 root root 10G Jul 2 14:47 /host/var/hpvolumes/pvc-16dedc5d-7f26-442e-abfa-cc9038cf2fac/disk.img
sh-4.2# md5sum /host/var/hpvolumes/pvc-16dedc5d-7f26-442e-abfa-cc9038cf2fac/disk.img
995bc410a630f08a8b525ab7b685f783 /host/var/hpvolumes/pvc-16dedc5d-7f26-442e-abfa-cc9038cf2fac/disk.img

==============================

I CAN reproduce the bug every time when the Tier-2 automation runs. The content of the source image and the cloned image differs, while the size is the same. The user/group of the source DV is root, while the target DV is qemu (107). The VM boots and can be accessed normally with the source image, but cannot be accessed with the cloned image, even though the cloned DV shows 100% complete. Here are the details:

[cloud-user@ocp-psi-executor qwang]$ oc get pvc
NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS           AGE
cnv-3516-source-dv   Bound    pvc-545d4b33-9897-4f3c-8975-ea63442f7b79   48Gi       RWO            hostpath-provisioner   95s
cnv-3516-target-dv   Bound    pvc-a2e95960-ac9a-4959-bf77-0e2c6e614619   48Gi       RWO            hostpath-provisioner   53s

[cloud-user@ocp-psi-executor qwang]$ oc get dv
NAME                 PHASE       PROGRESS   RESTARTS   AGE
cnv-3516-source-dv   Succeeded   100.0%     0          3m55s
cnv-3516-target-dv   Succeeded   100.0%     0          3m13s

Source DV. The md5sum result was 995bc410a630f08a8b525ab7b685f783 each time.
[cloud-user@ocp-psi-executor qwang]$ oc debug node/gouyang0701-7ppwh-worker-45nvt -t
Starting pod/gouyang0701-7ppwh-worker-45nvt-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.3.67
If you don't see a command prompt, try pressing enter.
sh-4.2# ls -ls /host/var/hpvolumes/pvc-545d4b33-9897-4f3c-8975-ea63442f7b79/disk.img
913032 -rw-r--r--. 1 root root 10737418240 Jul 2 15:37 /host/var/hpvolumes/pvc-545d4b33-9897-4f3c-8975-ea63442f7b79/disk.img
sh-4.2# md5sum /host/var/hpvolumes/pvc-545d4b33-9897-4f3c-8975-ea63442f7b79/disk.img
995bc410a630f08a8b525ab7b685f783 /host/var/hpvolumes/pvc-545d4b33-9897-4f3c-8975-ea63442f7b79/disk.img

Target DV. The md5sum result was fafb16b77c764a8ad0786bf86264741c each time.
[cloud-user@ocp-psi-executor qwang]$ oc debug node/gouyang0701-7ppwh-worker-45nvt -t
Starting pod/gouyang0701-7ppwh-worker-45nvt-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.3.67
If you don't see a command prompt, try pressing enter.
sh-4.2# ls -ls /host/var/hpvolumes/pvc-a2e95960-ac9a-4959-bf77-0e2c6e614619/disk.img
913932 -rw-r--r--. 1 107 107 10737418240 Jul 2 15:39 /host/var/hpvolumes/pvc-a2e95960-ac9a-4959-bf77-0e2c6e614619/disk.img
sh-4.2# md5sum /host/var/hpvolumes/pvc-a2e95960-ac9a-4959-bf77-0e2c6e614619/disk.img
fafb16b77c764a8ad0786bf86264741c /host/var/hpvolumes/pvc-a2e95960-ac9a-4959-bf77-0e2c6e614619/disk.img

Hope that helps.
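In case it is useful to script the checksum comparison the same way in the automation, a rough sketch (the node name and the /var/hpvolumes layout are taken from this report's hostpath-provisioner setup and may differ elsewhere; in the failing run both PVCs happened to be on the same worker):

$ SRC_PV=$(oc -n test-hostpath get pvc cnv-3516-source-dv -o jsonpath='{.spec.volumeName}')
$ TGT_PV=$(oc -n test-hostpath get pvc cnv-3516-target-dv -o jsonpath='{.spec.volumeName}')
$ # Checksum both images from a node debug pod; adjust the node name as needed.
$ oc debug node/gouyang0701-7ppwh-worker-45nvt -- chroot /host md5sum /var/hpvolumes/$SRC_PV/disk.img /var/hpvolumes/$TGT_PV/disk.img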
As per comment #9, there is a consistent correlation between cases where the md5sums of the source and cloned DVs do not match and problems with the running VM. I therefore believe this is a storage issue (possibly even a duplicate of bug 1845901). Re-assigning to the Storage team for triage.
Marking as Not a bug because the VM did not have enough memory defined (the VM template requests only 64M of memory). The mismatching md5 issue brought up here is being addressed in bug 1845901.
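For anyone hitting the same symptom: the memory request lives at spec.template.spec.domain.resources.requests.memory in the VM object above, and 64M is far too little for a Fedora or RHEL guest. A minimal sketch of the fix, assuming roughly 1Gi is acceptable for the test (any reasonable guest-sized value works):

$ oc -n test-hostpath patch vm fedora-vm --type=merge -p '{"spec":{"template":{"spec":{"domain":{"resources":{"requests":{"memory":"1Gi"}}}}}}}'
$ # Restart the VMI so the new memory request takes effect.
$ virtctl restart fedora-vm --namespace test-hostpath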