Bug 2152537

Summary: [4.13]Better to have a more friendly error when missing storage size in clone
Product: Container Native Virtualization (CNV) Reporter: Yan Du <yadu>
Component: Storage Assignee: Álvaro Romero <alromero>
Status: CLOSED ERRATA QA Contact: Yan Du <yadu>
Severity: high Docs Contact:
Priority: high    
Version: 4.12.0   
Target Milestone: ---   
Target Release: 4.13.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: v4.13.0.rhel9-1089 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2165594 Environment:
Last Closed: 2023-05-18 02:56:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2165594    

Description Yan Du 2022-12-12 09:30:34 UTC
Description of problem:
A friendlier error message would be better when the storage size is missing in a clone.

Version-Release number of selected component (if applicable):
CNV 4.12 

How reproducible:
Always

Steps to Reproduce:
1. Clone from a PVC without setting the storage size in DataVolume.spec.storage.resources:

---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata: 
  name: clone4
  annotations:
    cdi.kubevirt.io/cloneType: ""
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.deleteAfterCompletion: "false"
spec: 
  source: 
    pvc: 
      name: fedora-56ccabc01cbe
      namespace: openshift-virtualization-os-images
  storage:
    resources: {}



Actual results:
1. The DV stays in the CloneScheduled phase and makes no progress.
2. Describing the DV shows only the repeated event "The size detection pod is not finished yet" and a HostAssistedCloneSourceInUse warning:
Events:
  Type     Reason                        Age                     From                   Message
  ----     ------                        ----                    ----                   -------
  Normal   SizeDetectionPodCreated       4m44s                   datavolume-controller  Size-detection pod created
  Normal   CloneScheduled                4m44s                   datavolume-controller  Cloning from openshift-virtualization-os-images/fedora-56ccabc01cbe into default/clone4 scheduled
  Normal   CloneScheduled                4m44s                   datavolume-controller  No PVC found
  Normal   SizeDetectionPodNotReady      4m42s (x11 over 4m44s)  datavolume-controller  The size detection pod is not finished yet
  Warning  HostAssistedCloneSourceInUse  4m42s (x11 over 4m44s)  datavolume-controller  pod openshift-virtualization-os-images/size-detection-ae130582-c792-4677-86dd-f78c8c243ee8 using PersistentVolumeClaim fedora-56ccabc01cbe


cdi deployment pod log:
{"level":"debug","ts":1670816776.3534229,"logger":"events","msg":"Normal","object":{"kind":"DataVolume","namespace":"default","name":"clone4","uid":"349bdb29-09e6-4a0a-9ade-ee4ac4582ade","apiVersion":"cdi.kubevirt.io/v1beta1","resourceVersion":"10785711"},"reason":"SizeDetectionPodNotReady","message":"The size detection pod is not finished yet"}
{"level":"debug","ts":1670816776.3534684,"logger":"events","msg":"Warning","object":{"kind":"DataVolume","namespace":"default","name":"clone4","uid":"349bdb29-09e6-4a0a-9ade-ee4ac4582ade","apiVersion":"cdi.kubevirt.io/v1beta1","resourceVersion":"10785711"},"reason":"HostAssistedCloneSourceInUse","message":"pod openshift-virtualization-os-images/size-detection-ae130582-c792-4677-86dd-f78c8c243ee8 using PersistentVolumeClaim fedora-56ccabc01cbe"}
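When triaging a DataVolume that appears stuck, it can help to tally which event reasons the controller keeps logging for it. The following is a minimal sketch, assuming the log is available as text with one JSON object per line in the shape shown above; the function name `tally_event_reasons` is hypothetical and not part of CDI.

```python
import json
from collections import Counter


def tally_event_reasons(log_text: str, dv_name: str) -> Counter:
    """Count event reasons logged for one DataVolume (hypothetical helper)."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # ignore anything that is not a JSON log line
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        obj = entry.get("object")
        if not isinstance(obj, dict):
            continue
        # Keep only event records that refer to the requested DataVolume.
        if (
            entry.get("logger") == "events"
            and obj.get("kind") == "DataVolume"
            and obj.get("name") == dv_name
        ):
            counts[entry.get("reason", "")] += 1
    return counts
```

A tally dominated by SizeDetectionPodNotReady and HostAssistedCloneSourceInUse, with no later reasons appearing, matches the stuck state reported here.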


Expected results:
A friendlier error should be reported when the storage size is missing in a clone,
such as: missing storage size in DataVolume

Additional info:

Comment 2 Álvaro Romero 2022-12-21 09:36:38 UTC
Hi @yadu! I'm taking a look at this but everything seems to be working as expected.

We support creating clones without size: When a clone without storage size is created, we run a binary inside a 'size-detection' pod that checks the image size of the source and uses it in the target. You can check this in the events:

  Type     Reason                        Age                     From                   Message
  ----     ------                        ----                    ----                   -------
  Normal   SizeDetectionPodCreated       4m44s                   datavolume-controller  Size-detection pod created
  Normal   SizeDetectionPodNotReady      4m42s (x11 over 4m44s)  datavolume-controller  The size detection pod is not finished yet

The other events are also expected behavior, so everything implies that the clone will be successful. Is the DV frozen in CloneScheduled?
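The fallback described in this comment can be pictured as follows. This is only an illustrative sketch, not the CDI controller's code: `resolve_target_size` and the `detect_source_size` callback are hypothetical names, and the callback stands in for whatever the size-detection pod reports. The field path `spec.storage.resources.requests.storage` follows the DataVolume manifest in the description.

```python
from typing import Callable, Optional


def resolve_target_size(spec: dict, detect_source_size: Callable[[], str]) -> str:
    """Pick the storage size for a clone target (hypothetical helper).

    An explicit size in the DataVolume spec wins; otherwise the size is
    obtained from the source via the supplied detection callback.
    """
    storage = spec.get("storage") or {}
    resources = storage.get("resources") or {}
    requests = resources.get("requests") or {}
    explicit: Optional[str] = requests.get("storage")
    if explicit:
        return explicit
    # No size given (for example `resources: {}`): fall back to detection.
    return detect_source_size()
```

With `resources: {}` as in the reported manifest, the sketch takes the detection branch, which is why the events mention the size-detection pod.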

Comment 3 Yan Du 2022-12-22 06:00:48 UTC
Hi, Alvaro

Yes, the DV stays in CloneScheduled.
$ oc get dv
NAME     PHASE            PROGRESS   RESTARTS   AGE
clone4   CloneScheduled                         23m

The DV YAML and the describe output are in the attachment.

Comment 4 Yan Du 2022-12-22 06:01:25 UTC
Created attachment 1934074
dv

Comment 5 Álvaro Romero 2022-12-22 09:12:17 UTC
You are right @yadu, there seems to be a problem when using the size-detection pod in clones between different namespaces. I'll investigate it.

Comment 6 Álvaro Romero 2023-01-13 12:28:42 UTC
Hi @yadu 

I've just seen that the fix is already bundled.
The bug ended up being a pod-ownership error when cloning without size between namespaces.

Thanks for the report!

Comment 8 Yan Du 2023-02-08 11:05:52 UTC
Tested on CNV-v4.13.0.rhel9-1405

$ oc get dv
NAME     PHASE       PROGRESS   RESTARTS   AGE
clone4   Succeeded   100.0%                7m51s
$ oc get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS         AGE
clone4   Bound    pvc-a563919f-0ca8-4df5-b046-74afc01b5e2e   149Gi      RWO            hostpath-csi-basic   7m52s
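A verification like the one above could be scripted by parsing the `oc get dv` table. The sketch below is an assumption-laden illustration: `parse_table` and `clone_succeeded` are hypothetical helpers, the input is the table text without the `$ oc get dv` prompt line, and header names are assumed to be single words. Columns are split by header offsets rather than whitespace because the PROGRESS and RESTARTS cells can be empty.

```python
import re
from typing import Dict, List


def parse_table(text: str) -> List[Dict[str, str]]:
    """Parse fixed-width table text into one dict per row, keyed by header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Each column starts where its header word starts.
    cols = [(m.group(), m.start()) for m in re.finditer(r"\S+", lines[0])]
    rows = []
    for line in lines[1:]:
        row = {}
        for i, (name, start) in enumerate(cols):
            end = cols[i + 1][1] if i + 1 < len(cols) else None
            row[name] = line[start:end].strip()
        rows.append(row)
    return rows


def clone_succeeded(table_text: str, name: str) -> bool:
    """Return True if the named DataVolume is in the Succeeded phase."""
    for row in parse_table(table_text):
        if row.get("NAME") == name:
            return row.get("PHASE") == "Succeeded"
    return False
```

Applied to the two outputs in this report, the earlier table should yield a CloneScheduled phase with an empty PROGRESS cell, and the table from the fixed build should be reported as succeeded.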

Comment 10 errata-xmlrpc 2023-05-18 02:56:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3205