Bug 1705080

Summary: CDI: Unable to upload image using virtctl
Product: Container Native Virtualization (CNV)
Component: Storage
Version: 2.0
Target Release: 2.0
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: high
Keywords: Regression, TestBlocker
Reporter: Natalie Gavrielov <ngavrilo>
Assignee: Michael Henriksen <mhenriks>
QA Contact: Natalie Gavrielov <ngavrilo>
CC: alitke, cnv-qe-bugs, igoihman, ncredi, rav, ycui
Fixed In Version: 1.9.0
Type: Bug
Last Closed: 2019-10-22 12:32:46 UTC
Bug Depends On: 1707418, 1725730
Bug Blocks: 1679134

Description Natalie Gavrielov 2019-05-01 12:24:14 UTC
Description of problem:
Unable to upload an image using virtctl

Version-Release number of selected component:
      IMPORTER_IMAGE:       container-native-virtualization/virt-cdi-importer:v2.0.0
      CLONER_IMAGE:         container-native-virtualization/virt-cdi-cloner:v2.0.0
      UPLOADSERVER_IMAGE:   container-native-virtualization/virt-cdi-uploadserver:v2.0.0
      UPLOADPROXY_SERVICE:  cdi-uploadproxy

How reproducible:
100%

Steps to Reproduce:
1. Get an image to upload:
wget https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img
2. Upload the image using virtctl:
virtctl image-upload --uploadproxy-url=https://cdi-uploadproxy-cdi.apps.working.oc4m --pvc-name=upload-test --image-path=cirros-0.4.0-x86_64-disk.raw --insecure --pvc-size=5Gi --storage-class local-sc
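
Note: the downloaded cirros file is a qcow2 image, while the upload command points at a .raw file. Assuming the image was converted to raw in between (this step is not shown in the original report), the conversion would look something like:

qemu-img convert -f qcow2 -O raw cirros-0.4.0-x86_64-disk.img cirros-0.4.0-x86_64-disk.raw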

Actual results:
The upload fails:

PVC local-storage/upload-test created
Waiting for PVC upload-test upload pod to be running...
Pod now running
the server is currently unable to handle the request (post uploadtokenrequests.upload.cdi.kubevirt.io)
#

Expected results:
The upload succeeds.

Additional info:
# oc get pods -w
NAME                                      READY     STATUS              RESTARTS   AGE
cdi-upload-upload-test                    0/1       ContainerCreating   0          8s
local-disks-local-provisioner-zp6kx       1/1       Running             0          45m
local-diskslocal-diskmaker-c6pqv          1/1       Running             0          45m
local-storage-manifests-58gz5             1/1       Running             0          45m
local-storage-operator-694f46c9c9-fv5m2   1/1       Running             0          45m
cdi-upload-upload-test   0/1       ContainerCreating   0         9s
---------------------------------------------------------------------
# oc logs -f cdi-upload-upload-test
I0501 12:12:08.978176       1 uploadserver.go:70] Upload destination: /data/disk.img
I0501 12:12:08.978513       1 uploadserver.go:72] Running server on 0.0.0.0:8443
---------------------------------------------------------------------
# oc get pods -n cdi
NAME                               READY     STATUS    RESTARTS   AGE
cdi-apiserver-6cff65bb6c-hvkgj     1/1       Running   0          2d1h
cdi-deployment-5774b4b8dd-nbgk4    1/1       Running   3          2d1h
cdi-operator-7454cb5b85-sc8kv      1/1       Running   2          2d1h
cdi-uploadproxy-6d64cc5bc7-d8sb6   1/1       Running   0          2d1h
---------------------------------------------------------------------
# oc logs cdi-uploadproxy-6d64cc5bc7-d8sb6 -n cdi
I0429 10:59:50.253522       1 uploadproxy.go:54] Note: increase the -v level in the api deployment for more detailed logging, eg. -v=2 or -v=3
W0429 10:59:50.255060       1 client_config.go:548] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
---------------------------------------------------------------------
# oc get route -n cdi
NAME              HOST/PORT                              PATH      SERVICES          PORT      TERMINATION   WILDCARD
cdi-uploadproxy   cdi-uploadproxy-cdi.apps.working.oc4             cdi-uploadproxy   <all>     reencrypt     None
---------------------------------------------------------------------
# oc get pvc
NAME                  STATUS    VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
upload-test           Bound     local-pv-b58602cb   25Gi       RWO            local-sc       4m20s
upload-test-scratch   Bound     local-pv-d40a467c   25Gi       RWO            local-sc       4m20s
---------------------------------------------------------------------
# oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                               STORAGECLASS   REASON    AGE
local-pv-b58602cb   25Gi       RWO            Delete           Bound     local-storage/upload-test           local-sc                 22m
local-pv-d40a467c   25Gi       RWO            Delete           Bound     local-storage/upload-test-scratch   local-sc                 22m
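
The error "(post uploadtokenrequests.upload.cdi.kubevirt.io)" points at the aggregated upload-token API rather than at the upload pod itself, so a useful extra check would be whether that APIService is registered and Available. The exact APIService name below is an assumption based on the resource group in the error message:

# oc get apiservice | grep upload.cdi.kubevirt.io
# oc get apiservice v1alpha1.upload.cdi.kubevirt.io -o yaml
# oc logs cdi-apiserver-6cff65bb6c-hvkgj -n cdi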

Comment 2 Michael Henriksen 2019-05-02 15:35:37 UTC
This is fixed in CDI release 1.9 via this PR:  https://github.com/kubevirt/containerized-data-importer/pull/757
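
To confirm whether the build running on a given cluster already contains this fix, checking the image tags of the CDI pods should be enough, e.g.:

# oc get pods -n cdi -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'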

Comment 3 Natalie Gavrielov 2019-05-15 16:04:55 UTC
Verified using CDI v1.9.0 (created a DataVolume by importing a cirros image over HTTPS and ran a VMI using that disk).
Note: on clusters that have been up for longer than a day, the upload fails because CDI does not yet handle certificate rotation. The workaround, for now, is to delete the API server pod.
Since the certificate rotation issue is tracked in Jira card CNV-1762, I'm closing this issue as verified.
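
For reference, the certificate-rotation workaround mentioned above amounts to deleting the cdi-apiserver pod so the deployment recreates it with fresh certificates; with the pod name from the output earlier in this bug, that would be:

# oc delete pod cdi-apiserver-6cff65bb6c-hvkgj -n cdi

(The replacement pod will come up under a new name assigned by the deployment.)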

Comment 4 Adam Litke 2019-06-12 13:58:05 UTC
We determined that this fix has not been backported to release-1.9 yet.

Comment 5 Adam Litke 2019-06-12 14:04:59 UTC
Just kidding.  Irit found the backport in the current release.

Comment 7 Rajath AV 2020-11-12 21:48:37 UTC
I'm having the same issue in my cluster. How do I fix it? Any help would be very much appreciated. I'm running OpenShift 4.4.3.

Comment 8 Rajath AV 2020-11-12 21:51:41 UTC
Additional info: my cluster is just 4 hours old. Do I need to wait for 24 hours?

Comment 10 Red Hat Bugzilla 2023-09-14 05:27:50 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days