Bug 2145223 - VM with missing source datasource pvc is started without any error messages
Summary: VM with missing source datasource pvc is started without any error messages
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.12.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.13.0
Assignee: Álvaro Romero
QA Contact: Yan Du
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-11-23 14:53 UTC by vsibirsk
Modified: 2023-05-18 02:56 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-18 02:55:56 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
GitHub kubevirt/kubevirt PR 9170 (Merged): Improve handling of clone DataVolumes without source PVC (last updated 2023-03-08 02:19:12 UTC)
GitHub kubevirt/kubevirt PR 9227 (Merged): [release-0.59] Improve handling of clone DataVolumes without source PVC (last updated 2023-03-08 02:19:16 UTC)
Red Hat Issue Tracker: CNV-22837 (last updated 2022-11-23 15:01:08 UTC)
Red Hat Product Errata: RHSA-2023:3205 (last updated 2023-05-18 02:56:05 UTC)

Description vsibirsk 2022-11-23 14:53:02 UTC
Description of problem:
If a VM's DataSource references a missing PVC and the VM is started, no error message is displayed on the VM object.


Version-Release number of selected component (if applicable):
4.12

How reproducible:
100%

Steps to Reproduce:
1. Create a VM from a template with a non-existing DataSource
2. Start the VM

Actual results:
The VM starts, a DV is created and gets stuck with the error "No PVC found", but no error is surfaced on the VM object.

Expected results:
An error like "Source PVC openshift-virtualization-os-images/rhel9 not found" is shown on the VM.

Additional info:
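For quick triage, the hidden error can be read straight off the DataVolume. A minimal sketch (the object name matches the yaml below; <ns> stands in for the test namespace):

$ oc get dv rhel9-vm-1669214643-5756936 -n <ns> \
    -o jsonpath='{.status.conditions[?(@.type=="Bound")].message}'
No PVC found

The VM object itself reports nothing beyond its normal status.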

VM yaml
---------------
apiVersion: v1
items:
- apiVersion: kubevirt.io/v1
  kind: VirtualMachine
  metadata:
    annotations:
      kubemacpool.io/transaction-timestamp: "2022-11-23T14:44:04.011124494Z"
      kubevirt.io/latest-observed-api-version: v1
      kubevirt.io/storage-observed-api-version: v1alpha3
      vm.kubevirt.io/validations: |
        [
          {
            "name": "minimal-required-memory",
            "path": "jsonpath::.spec.domain.resources.requests.memory",
            "rule": "integer",
            "message": "This VM requires more memory.",
            "min": 1610612736
          }
        ]
    creationTimestamp: "2022-11-23T14:44:03Z"
    generation: 2
    labels:
      app: rhel9-vm-1669214643-5756936
      vm.kubevirt.io/template: rhel9-server-tiny
      vm.kubevirt.io/template.namespace: openshift
      vm.kubevirt.io/template.revision: "1"
      vm.kubevirt.io/template.version: v0.24.1
    name: rhel9-vm-1669214643-5756936
    namespace: update-boot-source-test-ssp-common-templates-boot-sources
    resourceVersion: "2920736"
    uid: 87ce1da2-477c-43a0-83c8-55240261778a
  spec:
    dataVolumeTemplates:
    - apiVersion: cdi.kubevirt.io/v1beta1
      kind: DataVolume
      metadata:
        creationTimestamp: null
        name: rhel9-vm-1669214643-5756936
      spec:
        sourceRef:
          kind: DataSource
          name: rhel9
          namespace: openshift-virtualization-os-images
        storage:
          resources:
            requests:
              storage: 30Gi
    running: true
    template:
      metadata:
        annotations:
          vm.kubevirt.io/flavor: tiny
          vm.kubevirt.io/os: rhel9
          vm.kubevirt.io/workload: server
        creationTimestamp: null
        labels:
          kubevirt.io/domain: rhel9-vm-1669214643-5756936
          kubevirt.io/size: tiny
          kubevirt.io/vm: rhel9-vm-1669214643-5756936
      spec:
        domain:
          cpu:
            cores: 1
            sockets: 1
            threads: 1
          devices:
            disks:
            - disk:
                bus: virtio
              name: rootdisk
            - disk:
                bus: virtio
              name: cloudinitdisk
            interfaces:
            - macAddress: 02:ca:92:00:00:02
              masquerade: {}
              model: virtio
              name: default
            networkInterfaceMultiqueue: true
            rng: {}
          features:
            acpi: {}
            smm:
              enabled: true
          firmware:
            bootloader:
              efi: {}
          machine:
            type: pc-q35-rhel8.6.0
          resources:
            requests:
              memory: 1536Mi
        evictionStrategy: LiveMigrate
        networks:
        - name: default
          pod: {}
        terminationGracePeriodSeconds: 180
        volumes:
        - dataVolume:
            name: rhel9-vm-1669214643-5756936
          name: rootdisk
        - cloudInitNoCloud:
            userData: |-
              #cloud-config
              user: cloud-user
              password: password
              chpasswd: { expire: False }
              ssh_authorized_keys:
               [ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCj47ubVnxR16JU7ZfDli3N5QVBAwJBRh2xMryyjk5dtfugo5JIPGB2cyXTqEDdzuRmI+Vkb/A5duJyBRlA+9RndGGmhhMnj8and3wu5/cEb7DkF6ZJ25QV4LQx3K/i57LStUHXRTvruHOZ2nCuVXWqi7wSvz5YcvEv7O8pNF5uGmqHlShBdxQxcjurXACZ1YY0YDJDr3AJai1KF9zehVJODuSbrnOYpThVWGjFuFAnNxbtuZ8EOSougN2aYTf2qr/KFGDHtewIkzZmP6cjzKO5bN3pVbXxmb2Gces/BYHntY4MXBTUqwsmsCRC5SAz14bEP/vsLtrNhjq9vCS+BjMT root]
              runcmd: ['grep ssh-rsa /etc/crypto-policies/back-ends/opensshserver.config || sudo update-crypto-policies --set LEGACY || true', "sudo sed -i 's/^#\\?PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config", 'sudo systemctl enable sshd', 'sudo systemctl restart sshd']
          name: cloudinitdisk
  status:
    conditions:
    - lastProbeTime: "2022-11-23T14:44:04Z"
      lastTransitionTime: "2022-11-23T14:44:04Z"
      message: VMI does not exist
      reason: VMINotExists
      status: "False"
      type: Ready
    printableStatus: Stopped
    volumeSnapshotStatuses:
    - enabled: false
      name: rootdisk
      reason: PVC not found
    - enabled: false
      name: cloudinitdisk
      reason: Snapshot is not supported for this volumeSource type [cloudinitdisk]
kind: List
metadata:
  resourceVersion: ""


DV yaml
------------------
apiVersion: v1
items:
- apiVersion: cdi.kubevirt.io/v1beta1
  kind: DataVolume
  metadata:
    annotations:
      cdi.kubevirt.io/cloneType: ""
      cdi.kubevirt.io/storage.clone.token: eyJhbGciOiJQUzI1NiJ9.eyJleHAiOjE2NjkyMTQ5NDQsImlhdCI6MTY2OTIxNDY0NCwiaXNzIjoiY2RpLWFwaXNlcnZlciIsIm5hbWUiOiJyaGVsOSIsIm5hbWVzcGFjZSI6Im9wZW5zaGlmdC12aXJ0dWFsaXphdGlvbi1vcy1pbWFnZXMiLCJuYmYiOjE2NjkyMTQ2NDQsIm9wZXJ0YXRpb24iOiJDbG9uZSIsInBhcmFtcyI6eyJ0YXJnZXROYW1lIjoicmhlbDktdm0tMTY2OTIxNDY0My01NzU2OTM2IiwidGFyZ2V0TmFtZXNwYWNlIjoidXBkYXRlLWJvb3Qtc291cmNlLXRlc3Qtc3NwLWNvbW1vbi10ZW1wbGF0ZXMtYm9vdC1zb3VyY2VzIn0sInJlc291cmNlIjp7Imdyb3VwIjoiIiwicmVzb3VyY2UiOiJwZXJzaXN0ZW50dm9sdW1lY2xhaW1zIiwidmVyc2lvbiI6InYxIn19.Npnkf5a3yLoDdA27xO9FmHM2DF1Dg1h2sjTvHMDABhebSTOtUofUBoWHfPTIwfUdiT-afCXFCcFS6gB3mlY7foBz-Ft3d4wUPTuz_PtrJwhPULvtLLWOIWcTLs2ABAhRPNdqgYDItM32RU_GaEcXTL10DSEi0xpk3NJdqQiZ5oiQtHehczGrdmGOa0NFx0QcqI2rB6OGyrPMn-b-G-W0nFgA-Fes6lvKxEJfefJmLSNY-pY_bdBX3va46UiSo9qgp36IzQ8vuX_lSXBbXOAaD1rVSksXLObJEBfq_KcMOItRXodBzbtYRUZImRtKEVrJXDJ3jy3zuqDxFz5iLiK7bQ
    creationTimestamp: "2022-11-23T14:44:04Z"
    generation: 2
    labels:
      kubevirt.io/created-by: 87ce1da2-477c-43a0-83c8-55240261778a
    name: rhel9-vm-1669214643-5756936
    namespace: update-boot-source-test-ssp-common-templates-boot-sources
    ownerReferences:
    - apiVersion: kubevirt.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: VirtualMachine
      name: rhel9-vm-1669214643-5756936
      uid: 87ce1da2-477c-43a0-83c8-55240261778a
    resourceVersion: "2920735"
    uid: 5a2234a9-ffe2-465a-b196-6e205eba7b26
  spec:
    source:
      pvc:
        name: rhel9
        namespace: openshift-virtualization-os-images
    storage:
      resources:
        requests:
          storage: 30Gi
  status:
    conditions:
    - lastHeartbeatTime: "2022-11-23T14:44:04Z"
      lastTransitionTime: "2022-11-23T14:44:04Z"
      message: No PVC found
      reason: CloneWithoutSource
      status: Unknown
      type: Bound
    - lastHeartbeatTime: "2022-11-23T14:44:04Z"
      lastTransitionTime: "2022-11-23T14:44:04Z"
      reason: CloneWithoutSource
      status: "False"
      type: Ready
    - lastHeartbeatTime: "2022-11-23T14:44:04Z"
      lastTransitionTime: "2022-11-23T14:44:04Z"
      status: "False"
      type: Running
kind: List
metadata:
  resourceVersion: ""

Comment 1 vsibirsk 2022-11-23 15:00:57 UTC
Events on VM object after starting it:
(in 4.12)
LAST SEEN   TYPE     REASON                       OBJECT                                       MESSAGE
84s         Normal   SuccessfulDataVolumeCreate   virtualmachine/rhel9-vm-1669204346-7126756   Created DataVolume rhel9-vm-1669204346-7126756

(in 4.10)
LAST SEEN   TYPE      REASON                   OBJECT                                       MESSAGE
0s          Warning   FailedDataVolumeCreate   virtualmachine/rhel9-vm-1669204608-0563674   Error creating DataVolume rhel9-vm-1669204608-0563674: admission webhook "datavolume-validate.cdi.kubevirt.io" denied the request:  Source PVC openshift-virtualization-os-images/rhel9 not found
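
For reference, per-object listings like the above can be pulled with a field selector; a sketch, with <ns> standing in for the test namespace:

$ oc get events -n <ns> --field-selector involvedObject.kind=VirtualMachine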

Comment 2 sgott 2022-11-23 15:08:16 UTC
Adam, it's not immediately clear to me if the component of this BZ should be Virtualization or Storage. What do you think?

Comment 3 Adam Litke 2022-12-07 14:39:38 UTC
Alvaro, let's see if we can address this usability issue.

Comment 5 Álvaro Romero 2023-01-24 12:51:32 UTC
Hey @vsibirsk, thanks for reporting this issue.

After looking at this, I'm not sure we should go back to triggering an error in this case.

This change in DV admission was a deliberate attempt to make our DataVolumes work more declaratively (https://github.com/kubevirt/containerized-data-importer/pull/2306), following the way other k8s objects work. If a source PVC doesn't exist when trying to clone, the DataVolume should be created anyway and wait until this PVC is created. Once the source PVC is created, the clone will continue as expected. We considered that returning an error wasn't appropriate since the DataVolume would easily recover after creating the source.
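
For illustration, the recovery path described above would look roughly like this sketch: re-create the missing source PVC by importing the golden image with a CDI DataVolume (the registry URL is illustrative, not taken from this bug):

$ oc apply -f - <<'EOF'
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: rhel9                                     # creates a PVC of the same name
  namespace: openshift-virtualization-os-images
spec:
  source:
    registry:
      url: docker://registry.example.com/rhel9:latest   # illustrative source image
  storage:
    resources:
      requests:
        storage: 30Gi
EOF

Once the PVC openshift-virtualization-os-images/rhel9 exists, the pending clone DV leaves the CloneWithoutSource state and proceeds on its own.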

If needed, we can search for this specific DV status in Kubevirt and trigger a more descriptive event in the VM, but I think that may be nastier than our current behavior.

Let me know what you think.

Comment 6 vsibirsk 2023-02-01 14:45:18 UTC
Hi Alvaro,
Yes, I believe the VM should show a proper event that indicates what the problem is and why it is not starting.
At least from a user perspective, that seems more logical to me than seeing no clear indication of what is wrong and then manually going through all dependent objects to check them for errors.

Comment 7 Yan Du 2023-03-08 02:34:53 UTC
Test on CNV-v4.13.0.rhel9-1689
Created a VM from a template with a non-existing DataSource and got a SourcePVCNotAvailabe warning in the VM events:

Events:
  Type     Reason                      Age   From                       Message
  ----     ------                      ----  ----                       -------
  Warning  SourcePVCNotAvailabe        11s   virtualmachine-controller  Source PVC non-existing-pvc not available: Target PVC fedora-bktiycydg31uiy2h will remain unpopulated until source is created
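
The warning can also be filtered directly for future re-verification; a sketch, with <ns> standing in for the VM's namespace (the reason string is copied verbatim from the event above, including its upstream spelling):

$ oc get events -n <ns> --field-selector type=Warning,reason=SourcePVCNotAvailabe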

Comment 10 errata-xmlrpc 2023-05-18 02:55:56 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3205

