Bug 1867122

Summary:

If importing VM disks from a URL takes more than 10 minutes, the VMI gets destroyed and recreated, generating noise in user-facing events

Product:

Container Native Virtualization (CNV)

Reporter:

Simone Tiraboschi <stirabos>

Component:

Virtualization

Assignee:

sgott

Status:

CLOSED WORKSFORME

QA Contact:

Israel Pinto <ipinto>

Severity:

medium

Priority:

medium

Version:

2.4.0

CC:

cnv-qe-bugs, fdeutsch, ipinto, kbidarka, mrashish, ncredi, rmohr, sgott

Target Release:

future

Hardware:

Unspecified

OS:

Unspecified

Last Closed:

2021-02-17 13:12:38 UTC

Type:

Bug

Attachments:

Description	Flags
VMI killed and recreated	none

Description Simone Tiraboschi 2020-08-07 11:57:47 UTC

Description of problem:
I was trying to reproduce https://bugzilla.redhat.com/1862701, exactly as in the attached screenshots.

In my case, for some unknown reason (a wrong or overloaded mirror?), downloading the disk image from https://dl.fedoraproject.org/pub/fedora/linux/releases/32/Cloud/x86_64/images/Fedora-Cloud-Base-32-1.6.x86_64.qcow2 took more than 10 minutes:

 NAME                                            PHASE              PROGRESS   RESTARTS   AGE
 datavolume.cdi.kubevirt.io/fedora-vm-rootdisk   ImportInProgress   86.25%     0          10m
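
To put the status line above in perspective, here is a rough sketch that estimates how much longer the import would need. It assumes a constant import rate and that AGE approximates the time since the import started; both are assumptions, and the helper names are hypothetical.

```python
import re


def parse_age_seconds(age: str) -> int:
    """Convert a kubectl-style AGE string such as '10m' or '1h5m' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    if not parts:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(number) * units[unit] for number, unit in parts)


def estimate_remaining_seconds(progress: str, age: str) -> float:
    """Estimate the seconds left, assuming a constant rate since creation."""
    fraction = float(progress.rstrip("%")) / 100.0
    if not 0.0 < fraction <= 1.0:
        raise ValueError("progress must be in (0%, 100%]")
    elapsed = parse_age_seconds(age)
    # remaining / elapsed == (1 - fraction) / fraction at a constant rate
    return elapsed * (1.0 - fraction) / fraction


remaining = estimate_remaining_seconds("86.25%", "10m")
print(f"about {remaining:.0f} s remaining")
```

With the values shown (86.25% after 10m), the estimate is about 96 seconds, so the import was still progressing normally when the 10-minute mark was reached.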


So, after 10 minutes, a readiness probe error was triggered on the virt-launcher pod, which caused its VMI object to be destroyed and recreated.

At the end, the VM started successfully as expected, but we have a lot of noise in user-facing events.
See the attached screenshot.

I also got VMI-related error events such as:
(combined from similar events): server error. command SyncVMI failed: "LibvirtError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: 2020-08-07T10:23:03.895236Z qemu-kvm: -blockdev {\"driver\":\"file\",\"filename\":\"/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img\",\"node-name\":\"libvirt-2-storage\",\"auto-read-only\":true,\"discard\":\"unmap\"}: Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': Permission denied')"
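
When triaging many similar events, it can help to pull the failing path and the reason out of the message. The following is a minimal sketch that assumes the message format shown above; the function name is hypothetical and the sample string is a trimmed copy of the event text.

```python
import re
from typing import Optional, Tuple

# Matches the "Could not open '<path>': <reason>'" tail of the QEMU error.
OPEN_FAILURE = re.compile(r"Could not open '(?P<path>[^']+)': (?P<reason>[^']+)'")


def extract_open_failure(message: str) -> Optional[Tuple[str, str]]:
    """Return (path, reason) if the message reports a failed open, else None."""
    match = OPEN_FAILURE.search(message)
    if match is None:
        return None
    return match.group("path"), match.group("reason")


# Trimmed copy of the event message quoted in this report.
sample = (
    "internal error: process exited while connecting to monitor: "
    "Could not open '/var/run/kubevirt-private/vmi-disks/rootdisk/disk.img': "
    "Permission denied')"
)
print(extract_open_failure(sample))
```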


Version-Release number of selected component (if applicable):


How reproducible:
100%


Steps to Reproduce:
1. Try to start a VM from the UI wizard, specifying a URL source.
2. Ensure that CDI takes more than 10 minutes to download the source disk.

Actual results:
- Many VMI-related error events
- The VMI got deleted and recreated after 10 minutes

Expected results:
No false negative events if the download is progressing

Additional info:

Comment 1 Simone Tiraboschi 2020-08-07 11:58:36 UTC

Created attachment 1710792
VMI killed and recreated

Comment 2 Nelly Credi 2020-09-30 12:47:50 UTC

@Israel, @Stu, should this bug be on Virt?
(removing useless events from the log vs making them disappear in the UI)

Comment 4 sgott 2020-10-21 12:45:02 UTC

Moving this BZ to Virtualization for proper tracking. This isn't a UX BZ.

Comment 7 Roman Mohr 2020-12-02 13:32:35 UTC

If this issue can be reproduced, here is what should happen, as a hint about where to look for a fix:

 1) We check whether all DataVolumes are imported.
 2) If they are not imported, we don't create a pod.
 3) Once all DVs are done with the import, we create the pod.
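
The three steps above can be sketched as a simple gate. This is an illustration only, not the KubeVirt controller code: the `DataVolume` class and function names are hypothetical, and it assumes a DataVolume reports the phase `Succeeded` once its import is complete.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumption: the phase a DataVolume reports after a completed import.
IMPORT_DONE_PHASE = "Succeeded"


@dataclass(frozen=True)
class DataVolume:
    """Hypothetical, simplified stand-in for a DataVolume's name and phase."""
    name: str
    phase: str


def pending_imports(volumes: Iterable[DataVolume]) -> List[str]:
    """Step 1: list the DataVolumes whose import has not finished yet."""
    return [dv.name for dv in volumes if dv.phase != IMPORT_DONE_PHASE]


def should_create_pod(volumes: Iterable[DataVolume]) -> bool:
    """Steps 2 and 3: create the pod only when no import is pending."""
    return not pending_imports(volumes)


importing = [DataVolume("fedora-vm-rootdisk", "ImportInProgress")]
finished = [DataVolume("fedora-vm-rootdisk", "Succeeded")]
print(should_create_pod(importing), should_create_pod(finished))
```

With such a gate in place, no virt-launcher pod exists while the import is running, so no readiness probe can fail during a slow download.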

Comment 8 Kedar Bidarkar 2021-01-27 13:31:40 UTC

Try to reproduce this with CNV-2.6.0

Comment 9 Kedar Bidarkar 2021-01-29 22:38:17 UTC

Summary: Was unable to reproduce this issue on CNV-2.6.0 (cnv/virt-operator/v2.6.0-106)

1) Created a VM (on a cluster in the US) from the UI wizard, using a link from EMEA.
2) The download took more than 10 minutes, and no false negative events were observed anymore.

Attaching a screenshot shortly.

Comment 12 Kedar Bidarkar 2021-02-17 13:13:34 UTC

Cannot reproduce in the current release CNV-2.6.0