Description of problem:
We're attempting to spin up a VM backed by a DataVolume. If that DataVolume is set with volumeMode: Block and the source is blank, the PVC and DV get created, but no VM pod ever spins up.

Version-Release number of selected component (if applicable):
hyperconverged-cluster-operator:v2.2.0-12
virt-cdi-operator:v2.2.0-5

How reproducible:
Always

Steps to Reproduce:
1. Create a VM template with a DV, with volumeMode: Block and source: blank: {}
2. oc create -f <vm template>

Actual results:
PVC is created.
DV is created, but with status:

status:
  phase: PVCBound

rather than:

status:
  phase: Succeeded
  progress: 100.0%

VM is created and reports that it is running, but no VMI/pod exists:

# oc get vm/dc-vm-2
NAME      AGE     RUNNING   VOLUME
dc-vm-2   7m12s   true

Expected results:
The DV reaches Succeeded and the VM starts.

Additional info:
Debugging info requested in email:

# cat dv-test.yml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: dv-block-blank
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}

# oc create -f dv-test.yml
datavolume.cdi.kubevirt.io/dv-block-blank created

# oc get dv/dv-block-blank -o yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  creationTimestamp: "2020-04-29T00:08:02Z"
  generation: 3
  name: dv-block-blank
  namespace: ocs-cnv
  resourceVersion: "26471071"
  selfLink: /apis/cdi.kubevirt.io/v1alpha1/namespaces/ocs-cnv/datavolumes/dv-block-blank
  uid: 009892d3-1cdd-4f14-a7c0-a13f0abd9a34
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}
status:
  phase: PVCBound

# oc get pvc/dv-block-blank -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/storage.contentType: kubevirt
    cdi.kubevirt.io/storage.import.source: none
    cdi.kubevirt.io/storage.pod.phase: Succeeded
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
  creationTimestamp: "2020-04-29T00:08:02Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: containerized-data-importer
    cdi-controller: dv-block-blank
  name: dv-block-blank
  namespace: ocs-cnv
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: DataVolume
    name: dv-block-blank
    uid: 009892d3-1cdd-4f14-a7c0-a13f0abd9a34
  resourceVersion: "26471082"
  selfLink: /api/v1/namespaces/ocs-cnv/persistentvolumeclaims/dv-block-blank
  uid: 096fff16-acf3-424e-9012-4bcc89a754c9
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Block
  volumeName: pvc-096fff16-acf3-424e-9012-4bcc89a754c9
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound

# oc get dv/dv-block-blank -o yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  creationTimestamp: "2020-04-29T00:22:38Z"
  generation: 3
  name: dv-block-blank
  namespace: ocs-cnv
  resourceVersion: "26479912"
  selfLink: /apis/cdi.kubevirt.io/v1alpha1/namespaces/ocs-cnv/datavolumes/dv-block-blank
  uid: 8305760e-ef32-4c7a-9b3c-82dc40990949
spec:
  pvc:
    accessModes:
    - ReadWriteOnce
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}
status:
  phase: Succeeded
  progress: 100.0%
This is a bug in CDI: it does not always update the phase of the DV to Succeeded, which prevents KubeVirt from spinning up a VMI. The VM has running set to true, which would normally create the VMI, but because the DV never reaches Succeeded, the VMI is never created.
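The gating described above can be sketched as follows. This is an illustrative approximation of the controller behavior, not KubeVirt source; the phase value is the one the stuck DV reports in this bug:

```shell
# Illustrative sketch: KubeVirt's VM controller only creates the VMI once
# the DataVolume reports phase "Succeeded". With this CDI bug the
# blank/block DV sticks at "PVCBound", so the VMI is never created.
dv_phase="PVCBound"   # what .status.phase reports on the stuck DV

if [ "$dv_phase" = "Succeeded" ]; then
  echo "DV ready: creating VMI"
else
  echo "DV in phase $dv_phase: VMI creation skipped"
fi
```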
Workaround:
Creating a blank disk on a block volume is a no-op for CDI, since the empty block device can be used directly by qemu. Therefore, as a workaround for this problem, you can use a PVC directly as the disk in your VM spec. For example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-blank-disk
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Block

and use it in your VM like so:

...
spec:
  domain:
    devices:
      disks:
      - disk:
          bus: virtio
        name: blankdisk
...
  volumes:
  - name: blankdisk
    persistentVolumeClaim:
      claimName: pvc-blank-disk
...
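Putting the two snippets together, a minimal complete VirtualMachine manifest using the workaround might look like the sketch below. The VM name, API version, and memory request are illustrative assumptions, not taken from the reporter's template:

```yaml
# Sketch only: a minimal VirtualMachine wiring the blank-block PVC in
# directly, bypassing the DataVolume. Name and memory are illustrative.
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-blank-disk        # hypothetical name
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: blankdisk
        resources:
          requests:
            memory: 1Gi      # illustrative sizing
      volumes:
      - name: blankdisk
        persistentVolumeClaim:
          claimName: pvc-blank-disk
```

Because the blank block device needs no import, CDI is never involved and the VMI starts as soon as the PVC binds.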
Confirmed, the workaround works. Thanks!
Merged fix upstream
Verified with the following code:
------------------------------------------
oc version
Client Version: 4.5.0-rc.1
Server Version: 4.5.0-rc.1
Kubernetes Version: v1.18.3+a637491

oc get csv --all-namespaces | grep cnv
openshift-cnv   kubevirt-hyperconverged-operator.v2.4.0   OpenShift virtualization   2.4.0   Succeeded

Verified with the following scenario:
------------------------------------------
Created a DataVolume
PVC is Bound
DV progress is 100.0%

Moving to VERIFIED!

dv.yaml
-----------------------------------------
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: dv-block-blank
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}

oc get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
dv-block-blank   Bound    pvc-2941a722-1a04-4bfa-b559-e80fbd37999e   10Gi       RWX            ocs-storagecluster-ceph-rbd   158m

oc get dv
NAME             PHASE       PROGRESS   RESTARTS   AGE
dv-block-blank   Succeeded   100.0%     0          158m

oc describe dv dv-block-blank
Name:         dv-block-blank
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  cdi.kubevirt.io/v1alpha1
Kind:         DataVolume
Metadata:
  Creation Timestamp:  2020-06-25T15:29:59Z
  Generation:          6
  Managed Fields:
    API Version:  cdi.kubevirt.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:pvc:
          .:
          f:accessModes:
          f:resources:
            .:
            f:requests:
              .:
              f:storage:
          f:storageClassName:
          f:volumeMode:
        f:source:
          .:
          f:blank:
    Manager:      oc
    Operation:    Update
    Time:         2020-06-25T15:29:59Z
    API Version:  cdi.kubevirt.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
        f:phase:
        f:progress:
        f:restartCount:
    Manager:      virt-cdi-controller
    Operation:    Update
    Time:         2020-06-25T15:30:12Z
  Resource Version:  4541247
  Self Link:         /apis/cdi.kubevirt.io/v1alpha1/namespaces/default/datavolumes/dv-block-blank
  UID:               cdb5083b-1cf5-423e-bc4a-f4471c59ab08
Spec:
  Pvc:
    Access Modes:
      ReadWriteMany
    Resources:
      Requests:
        Storage:         10Gi
    Storage Class Name:  ocs-storagecluster-ceph-rbd
    Volume Mode:         Block
  Source:
    Blank:
Status:
  Conditions:
    Last Heart Beat Time:   2020-06-25T15:30:12Z
    Last Transition Time:   2020-06-25T15:30:12Z
    Message:                PVC dv-block-blank Bound
    Reason:                 Bound
    Status:                 True
    Type:                   Bound
    Last Heart Beat Time:   2020-06-25T15:30:11Z
    Last Transition Time:   2020-06-25T15:30:11Z
    Status:                 True
    Type:                   Ready
    Last Heart Beat Time:   2020-06-25T15:29:59Z
    Last Transition Time:   2020-06-25T15:29:59Z
    Status:                 False
    Type:                   Running
  Phase:                    Succeeded
  Progress:                 100.0%
  Restart Count:            0
Events:
  Type    Reason           Age   From                   Message
  ----    ------           ----  ----                   -------
  Normal  Pending          157m  datavolume-controller  PVC dv-block-blank Pending
  Normal  ImportSucceeded  157m  datavolume-controller  Successfully imported into PVC dv-block-blank
  Normal  Bound            156m  datavolume-controller  PVC dv-block-blank Bound

[cnv-qe-jenkins@cnv-executor-dafrank ~]$ oc get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
dv-block-blank   Bound    pvc-2941a722-1a04-4bfa-b559-e80fbd37999e   10Gi       RWX            ocs-storagecluster-ceph-rbd   157m
dv-cnv-2354      Bound    pvc-dc2324a4-6319-49cb-bd27-18af0ebfb606   5Gi        RWX            ocs-storagecluster-ceph-rbd   8h

[cnv-qe-jenkins@cnv-executor-dafrank ~]$ oc describe pvc dv-block-blank
Name:          dv-block-blank
Namespace:     default
StorageClass:  ocs-storagecluster-ceph-rbd
Status:        Bound
Volume:        pvc-2941a722-1a04-4bfa-b559-e80fbd37999e
Labels:        app=containerized-data-importer
Annotations:   cdi.kubevirt.io/storage.contentType: kubevirt
               cdi.kubevirt.io/storage.import.source: none
               cdi.kubevirt.io/storage.pod.phase: Succeeded
               cdi.kubevirt.io/storage.pod.restarts: 0
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWX
VolumeMode:    Block
Mounted By:    <none>
Events:
  Type    Reason                 Age   From                         Message
  ----    ------                 ----  ----                         -------
  Normal  ExternalProvisioning   157m  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator
  Normal  Provisioning           157m  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-669d77666d-mxm4l_f3789bed-de7d-4fd8-a9e4-6e38c8d93496  External provisioner is provisioning volume for claim "default/dv-block-blank"
  Normal  ProvisioningSucceeded  157m  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-669d77666d-mxm4l_f3789bed-de7d-4fd8-a9e4-6e38c8d93496  Successfully provisioned volume pvc-2941a722-1a04-4bfa-b559-e80fbd37999e
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:3194