Bug 1829376 - VMs with blank block volumes fail to spin up
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 2.4.0
Assignee: Alexander Wels
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On:
Blocks: 1829438
Reported: 2020-04-29 13:20 UTC by David Critch
Modified: 2020-07-28 19:10 UTC

Fixed In Version: 2.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1829438
Environment:
Last Closed: 2020-07-28 19:10:05 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID                                              Private  Priority  Status  Summary                           Last Updated
Github                  kubevirt containerized-data-importer pull 1192  0        None      closed  Fixed race for block blank disks  2020-09-25 15:51:14 UTC
Red Hat Product Errata  RHSA-2020:3194                                  0        None      None    None                              2020-07-28 19:10:21 UTC

Description David Critch 2020-04-29 13:20:12 UTC
Description of problem:
We're attempting to spin up a VM backed by a DataVolume. If that DataVolume is set with volumeMode: Block and a blank source, the PVC and DV get created, but no VM pod ever spins up.

Version-Release number of selected component (if applicable):
hyperconverged-cluster-operator:v2.2.0-12
virt-cdi-operator:v2.2.0-5


How reproducible:
Always

Steps to Reproduce:
1. Create a VM template with a DV, with volumeMode: Block and source: blank: {}
2. oc create -f <vm template>


Actual results:
PVC is created
DV is created, but with status:
  status:
    phase: PVCBound
rather than:
  status:
    phase: Succeeded
    progress: 100.0%

The VM is created and reports that it is running, but no VMI or pod exists:

# oc get vm/dc-vm-2
NAME      AGE     RUNNING   VOLUME
dc-vm-2   7m12s   true      




Expected results:
The DV reaches phase Succeeded and the VM starts

Additional info:
debugging info requested in email:
# cat dv-test.yml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: dv-block-blank
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}
# oc create -f dv-test.yml
datavolume.cdi.kubevirt.io/dv-block-blank created
# oc get dv/dv-block-blank -o yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  creationTimestamp: "2020-04-29T00:08:02Z"
  generation: 3
  name: dv-block-blank
  namespace: ocs-cnv
  resourceVersion: "26471071"
  selfLink: /apis/cdi.kubevirt.io/v1alpha1/namespaces/ocs-cnv/datavolumes/dv-block-blank
  uid: 009892d3-1cdd-4f14-a7c0-a13f0abd9a34
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}
status:
  phase: PVCBound
# oc get pvc/dv-block-blank -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/storage.contentType: kubevirt
    cdi.kubevirt.io/storage.import.source: none
    cdi.kubevirt.io/storage.pod.phase: Succeeded
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
  creationTimestamp: "2020-04-29T00:08:02Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: containerized-data-importer
    cdi-controller: dv-block-blank
  name: dv-block-blank
  namespace: ocs-cnv
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: DataVolume
    name: dv-block-blank
    uid: 009892d3-1cdd-4f14-a7c0-a13f0abd9a34
  resourceVersion: "26471082"
  selfLink: /api/v1/namespaces/ocs-cnv/persistentvolumeclaims/dv-block-blank
  uid: 096fff16-acf3-424e-9012-4bcc89a754c9
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Block
  volumeName: pvc-096fff16-acf3-424e-9012-4bcc89a754c9
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound
# oc get dv/dv-block-blank -o yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  creationTimestamp: "2020-04-29T00:22:38Z"
  generation: 3
  name: dv-block-blank
  namespace: ocs-cnv
  resourceVersion: "26479912"
  selfLink: /apis/cdi.kubevirt.io/v1alpha1/namespaces/ocs-cnv/datavolumes/dv-block-blank
  uid: 8305760e-ef32-4c7a-9b3c-82dc40990949
spec:
  pvc:
    accessModes:
    - ReadWriteOnce
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}
status:
  phase: Succeeded
  progress: 100.0%

Comment 1 Alexander Wels 2020-04-29 14:06:52 UTC
This is a bug in CDI: it does not always update the phase of the DV to Succeeded, which prevents KubeVirt from spinning up a VMI. The VM has running set to true, which would normally create the VMI, but because the DV never reaches Succeeded, the VMI is not created.
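
The gating described in this comment can be modeled in a few lines. The sketch below is illustrative only: `vmi_can_be_created` is a hypothetical name, and the logic is a simplification of the behavior described here, not KubeVirt's actual controller code.

```python
from typing import Iterable


def vmi_can_be_created(vm_running: bool, dv_phases: Iterable[str]) -> bool:
    """Hypothetical model of the gating described above (not KubeVirt code).

    A VMI is only created when the VM is requested to run AND every
    DataVolume backing it has reached the Succeeded phase.
    """
    return vm_running and all(phase == "Succeeded" for phase in dv_phases)


# State from this report: running is true, but the DV is stuck in PVCBound.
print(vmi_can_be_created(True, ["PVCBound"]))   # False
# State once CDI reports the DV as finished.
print(vmi_can_be_created(True, ["Succeeded"]))  # True
```

In this model the VM reports running: true in both cases, which matches the symptom in the description: the VM looks started while no VMI exists.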

Comment 2 Adam Litke 2020-04-29 14:21:15 UTC
Workaround: Creating a blank disk on a block volume is a no-op for CDI, since qemu can use the empty block device directly. As a workaround for this problem, you can therefore use a PVC directly as the disk in your VM spec. For example:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-blank-disk
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Block

and use it in your VM like so:

...
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: blankdisk
...
      volumes:
      - name: blankdisk
        persistentVolumeClaim:
          claimName: pvc-blank-disk
...

Comment 3 David Critch 2020-04-29 15:11:42 UTC
Confirmed, the workaround works. Thanks!

Comment 4 Alexander Wels 2020-04-30 12:12:26 UTC
Merged fix upstream

Comment 5 Kevin Alon Goldblatt 2020-06-25 18:30:29 UTC
Verified with the following versions:
------------------------------------------
oc version
Client Version: 4.5.0-rc.1
Server Version: 4.5.0-rc.1
Kubernetes Version: v1.18.3+a637491



oc get csv --all-namespaces |grep cnv
openshift-cnv                          kubevirt-hyperconverged-operator.v2.4.0      OpenShift virtualization      2.4.0                                                              Succeeded




Verified with the following scenario:
------------------------------------------
Created a DataVolume
PVC is bound
DV reaches phase Succeeded with progress 100.0%


Moving to VERIFIED!


dv.yaml
-----------------------------------------
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: dv-block-blank
spec:
  pvc:
    accessModes:
    - ReadWriteMany
    dataSource: null
    resources:
      requests:
        storage: 10Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block
  source:
    blank: {}





oc get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
dv-block-blank   Bound    pvc-2941a722-1a04-4bfa-b559-e80fbd37999e   10Gi       RWX            ocs-storagecluster-ceph-rbd   158m


oc get dv
NAME             PHASE       PROGRESS   RESTARTS   AGE
dv-block-blank   Succeeded   100.0%     0          158m
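
The same check can be scripted rather than read off the table. The following is a minimal sketch, assuming `oc` is on the PATH and logged in to the cluster; `dv_succeeded` and `fetch_dv_json` are hypothetical helper names, and the only field relied on is `status.phase`, as shown in the DV output in this report.

```python
import json
import subprocess


def dv_succeeded(dv_json: str) -> bool:
    """Return True if a DataVolume JSON document reports phase Succeeded."""
    # A DV that has not been reconciled yet may have no status at all.
    status = json.loads(dv_json).get("status") or {}
    return status.get("phase") == "Succeeded"


def fetch_dv_json(name: str, namespace: str = "default") -> str:
    """Fetch a DataVolume as JSON via `oc get dv <name> -n <ns> -o json`."""
    result = subprocess.run(
        ["oc", "get", "dv", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Example use against a live cluster (not executed here):
#   dv_succeeded(fetch_dv_json("dv-block-blank"))
```

Keeping the parsing separate from the `oc` call means the phase check can be exercised against saved output, such as a DV stuck in PVCBound.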


oc describe dv dv-block-blank
Name:         dv-block-blank
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  cdi.kubevirt.io/v1alpha1
Kind:         DataVolume
Metadata:
  Creation Timestamp:  2020-06-25T15:29:59Z
  Generation:          6
  Managed Fields:
    API Version:  cdi.kubevirt.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:pvc:
          .:
          f:accessModes:
          f:resources:
            .:
            f:requests:
              .:
              f:storage:
          f:storageClassName:
          f:volumeMode:
        f:source:
          .:
          f:blank:
    Manager:      oc
    Operation:    Update
    Time:         2020-06-25T15:29:59Z
    API Version:  cdi.kubevirt.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
        f:phase:
        f:progress:
        f:restartCount:
    Manager:         virt-cdi-controller
    Operation:       Update
    Time:            2020-06-25T15:30:12Z
  Resource Version:  4541247
  Self Link:         /apis/cdi.kubevirt.io/v1alpha1/namespaces/default/datavolumes/dv-block-blank
  UID:               cdb5083b-1cf5-423e-bc4a-f4471c59ab08
Spec:
  Pvc:
    Access Modes:
      ReadWriteMany
    Resources:
      Requests:
        Storage:         10Gi
    Storage Class Name:  ocs-storagecluster-ceph-rbd
    Volume Mode:         Block
  Source:
    Blank:
Status:
  Conditions:
    Last Heart Beat Time:  2020-06-25T15:30:12Z
    Last Transition Time:  2020-06-25T15:30:12Z
    Message:               PVC dv-block-blank Bound
    Reason:                Bound
    Status:                True
    Type:                  Bound
    Last Heart Beat Time:  2020-06-25T15:30:11Z
    Last Transition Time:  2020-06-25T15:30:11Z
    Status:                True
    Type:                  Ready
    Last Heart Beat Time:  2020-06-25T15:29:59Z
    Last Transition Time:  2020-06-25T15:29:59Z
    Status:                False
    Type:                  Running
  Phase:                   Succeeded
  Progress:                100.0%
  Restart Count:           0
Events:
  Type    Reason           Age   From                   Message
  ----    ------           ----  ----                   -------
  Normal  Pending          157m  datavolume-controller  PVC dv-block-blank Pending
  Normal  ImportSucceeded  157m  datavolume-controller  Successfully imported into PVC dv-block-blank
  Normal  Bound            156m  datavolume-controller  PVC dv-block-blank Bound



oc describe pvc dv-block-blank
Name:          dv-block-blank
Namespace:     default
StorageClass:  ocs-storagecluster-ceph-rbd
Status:        Bound
Volume:        pvc-2941a722-1a04-4bfa-b559-e80fbd37999e
Labels:        app=containerized-data-importer
Annotations:   cdi.kubevirt.io/storage.contentType: kubevirt
               cdi.kubevirt.io/storage.import.source: none
               cdi.kubevirt.io/storage.pod.phase: Succeeded
               cdi.kubevirt.io/storage.pod.restarts: 0
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWX
VolumeMode:    Block
Mounted By:    <none>
Events:
  Type    Reason                 Age   From                                                                                                                Message
  ----    ------                 ----  ----                                                                                                                -------
  Normal  ExternalProvisioning   157m  persistentvolume-controller                                                                                         waiting for a volume to be created, either by external provisioner "openshift-storage.rbd.csi.ceph.com" or manually created by system administrator
  Normal  Provisioning           157m  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-669d77666d-mxm4l_f3789bed-de7d-4fd8-a9e4-6e38c8d93496  External provisioner is provisioning volume for claim "default/dv-block-blank"
  Normal  ProvisioningSucceeded  157m  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-669d77666d-mxm4l_f3789bed-de7d-4fd8-a9e4-6e38c8d93496  Successfully provisioned volume pvc-2941a722-1a04-4bfa-b559-e80fbd37999e

Comment 8 errata-xmlrpc 2020-07-28 19:10:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3194

