Bug 1754920 - Large images met timeout or EOF when uploading a qcow2 image
Summary: Large images met timeout or EOF when uploading a qcow2 image
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Documentation
Version: 2.1.0
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: 2.2.0
Assignee: Bob Gaydos
QA Contact: Irina Gulina
URL:
Whiteboard:
Depends On:
Blocks: 1789093
TreeView+ depends on / blocked
 
Reported: 2019-09-24 11:21 UTC by Yan Du
Modified: 2020-02-06 19:50 UTC (History)
10 users (show)

Fixed In Version: 2.2.0
Doc Type: Known Issue
Doc Text:
Cause: An idle upload connection is closed before server-side processing can complete. Consequence: An EOF error is returned to the user even though the operation continues normally on the server Workaround (if any): Check the status of the operation by examining the CDI annotations on the affected PVC Result: A successful upload is confirmed
Clone Of:
: 1789093 (view as bug list)
Environment:
Last Closed: 2020-01-30 20:02:40 UTC
Target Upstream Version:
Embargoed:
alitke: needinfo-
alitke: needinfo-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github kubevirt containerized-data-importer pull 996 0 None closed uploadproxy should use pod ready annotation 2021-01-14 10:19:17 UTC

Description Yan Du 2019-09-24 11:21:41 UTC
Description of problem:
Always met timeout or EOF when uploading an qcow2 image

Version-Release number of selected component (if applicable):
cnv2.1

How reproducible:
Always

Steps to Reproduce:
1. Upload an image by virtctl image-upload
2.
3.

Actual results:
$ virtctl image-upload  --pvc-name=winpv --pvc-size=10Gi --image-path=/mnt/cnv-netapp/WindowsImages/qe-images/WINDOWS_2019_X64/disk.qcow2 --insecure --block-volume
PVC default/winpv created
Waiting for PVC winpv upload pod to be running...
Pod now running
Uploading data to https://cdi-uploadproxy-openshift-cnv.apps.working.oc4

 5.73 GiB / 5.73 GiB [=========================================================================================================================================================] 100.00% 3m28s

Post https://cdi-uploadproxy-openshift-cnv.apps.working.oc4/v1alpha1/upload: EOF



or 

$ virtctl image-upload  --pvc-name=winpv1 --pvc-size=10Gi --image-path=/mnt/cnv-netapp/WindowsImages/qe-images/WINDOWS_2019_X64/disk.qcow2 --insecure --block-volume
PVC default/winpv1 created
timed out waiting for the condition


Expected results:
upload succeed

Additional info:

Comment 1 Natalie Gavrielov 2019-09-25 12:26:00 UTC
Could this be a timeout during conversion to raw?

Comment 2 Alexander Wels 2019-09-25 12:29:37 UTC
Can you provide the uploadserver logs?

Comment 3 Yan Du 2019-09-26 02:42:39 UTC
The EOF happened during validating the image:

$ oc logs cdi-upload-winpv  --follow
I0926 02:27:57.144845       1 uploadserver.go:61] Upload destination: /dev/cdi-block-volume
I0926 02:27:57.145301       1 uploadserver.go:63] Running server on 0.0.0.0:8443
2019/09/26 02:28:06 http: TLS handshake error from 10.129.1.172:43478: EOF
I0926 02:28:06.449258       1 uploadserver.go:244] Content type header is ""
I0926 02:28:06.449278       1 data-processor.go:252] Calculating available size
I0926 02:28:06.471530       1 data-processor.go:256] Checking out block volume size.
I0926 02:28:06.473668       1 data-processor.go:264] Request image size not empty.
I0926 02:28:06.473690       1 data-processor.go:269] Target size 10Gi.
I0926 02:28:06.473873       1 data-processor.go:182] New phase: TransferScratch
I0926 02:28:06.476689       1 util.go:143] Writing data...
I0926 02:30:31.957120       1 data-processor.go:182] New phase: Process
I0926 02:30:31.957174       1 data-processor.go:182] New phase: Convert
I0926 02:30:31.957179       1 data-processor.go:188] Validating image
error: unexpected EOF


$ oc describe pvc
Name:          winpv
Namespace:     default
StorageClass:  rook-ceph-block
Status:        Bound
Volume:        pvc-27b5186e-e005-11e9-b28e-fa163e6997df
Labels:        <none>
Annotations:   cdi.kubevirt.io/storage.pod.phase: Succeeded
               cdi.kubevirt.io/storage.pod.ready: false
               cdi.kubevirt.io/storage.upload.target: 
               pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWO
VolumeMode:    Block
Events:
  Type       Reason                Age                    From                                                                                                      Message
  ----       ------                ----                   ----                                                                                                      -------
  Normal     Provisioning          6m51s                  rook-ceph.rbd.csi.ceph.com_csi-rbdplugin-provisioner-fd98c466-28944_29df07b3-dfc2-11e9-9fe5-0a580a81000e  External provisioner is provisioning volume for claim "default/winpv"
  Normal     ExternalProvisioning  6m48s (x2 over 6m50s)  persistentvolume-controller                                                                               waiting for a volume to be created, either by external provisioner "rook-ceph.rbd.csi.ceph.com" or manually created by system administrator
Mounted By:  <none>

Comment 4 Yan Du 2019-09-26 05:57:01 UTC
The above qcow2 image info:

$ qemu-img info disk.qcow2 
image: disk.qcow2
file format: qcow2
virtual size: 6.6G (7140155392 bytes)
disk size: 5.7G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false

If I convert the same qcow2 image to raw by qemu-img, then upload succeed.

$ qemu-img info disk_raw.img
image: disk_raw.img
file format: raw
virtual size: 6.6G (7140155392 bytes)
disk size: 5.7G


$ virtctl image-upload  --pvc-name=winpv --pvc-size=10Gi --image-path=/mnt/cnv-netapp/WindowsImages/qe-images/WINDOWS_2019_X64/disk_raw.img --insecure --block-volume
PVC default/winpv created
Waiting for PVC winpv upload pod to be running...
Pod now running
Uploading data to https://cdi-uploadproxy-openshift-cnv.apps.working.oc4

 6.65 GiB / 6.65 GiB [=======================================================================================================================================================================] 100.00% 1m47s

Uploading /mnt/cnv-netapp/WindowsImages/qe-images/WINDOWS_2019_X64/disk_raw.img completed successfully

$ oc logs cdi-upload-winpv --follow
I0926 05:50:53.540103       1 uploadserver.go:61] Upload destination: /dev/cdi-block-volume
I0926 05:50:53.540435       1 uploadserver.go:63] Running server on 0.0.0.0:8443
2019/09/26 05:51:02 http: TLS handshake error from 10.129.3.227:44558: EOF
I0926 05:51:02.534025       1 uploadserver.go:244] Content type header is ""
I0926 05:51:02.534043       1 data-processor.go:252] Calculating available size
I0926 05:51:02.556587       1 data-processor.go:256] Checking out block volume size.
I0926 05:51:02.559810       1 data-processor.go:264] Request image size not empty.
I0926 05:51:02.559846       1 data-processor.go:269] Target size 10Gi.
I0926 05:51:02.560289       1 data-processor.go:182] New phase: TransferDataFile
I0926 05:51:02.564028       1 util.go:143] Writing data...
I0926 05:52:50.305765       1 data-processor.go:182] New phase: Resize
I0926 05:52:50.424873       1 data-processor.go:182] New phase: Complete
I0926 05:52:50.424953       1 uploadserver.go:263] Wrote data to /dev/cdi-block-volume
I0926 05:52:50.425208       1 uploadserver.go:151] Shutting down http server after successful upload
I0926 05:52:50.425399       1 uploadserver.go:71] UploadServer successfully exited

Comment 5 Natalie Gavrielov 2019-09-26 13:23:22 UTC
I also got it when I used curl with token:
$ cat dv-upload.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: upload-datavolume
spec:
  source:
      upload: {}
  pvc:
    volumeMode: Block
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 500Mi

$ cat token.yaml
apiVersion: upload.cdi.kubevirt.io/v1alpha1
kind: UploadTokenRequest
metadata:
   name: upload-datavolume-token
   namespace: src
spec:
   pvcName: upload-datavolume


1. oc create -f dv-upload.yaml
2. TOKEN=$(kubectl apply -f token.yaml -o="jsonpath={.status.token}")
3. curl -v --insecure -H "Authorization: Bearer $TOKEN" --data-binary @cirros-0.4.0-x86_64-disk.qcow2  https://cdi-uploadproxy-openshift-cnv.apps.working.oc4/v1alpha1/upload


$ oc get pods
NAME                           READY   STATUS    RESTARTS   AGE
cdi-upload-upload-datavolume   0/1     Pending   0          11m

$ oc logs -f cdi-uploadproxy-5f7c458c79-jlmkh -n openshift-cnv
I0923 11:28:13.801613       1 uploadproxy.go:46] Note: increase the -v level in the api deployment for more detailed logging, eg. -v=2 or -v=3
W0923 11:28:13.802283       1 client_config.go:549] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2019/09/26 03:01:25 http: TLS handshake error from 10.128.0.1:56988: EOF
I0926 12:32:12.003289       1 uploadproxy.go:179] Received valid token: pvc: upload-datavolume, namespace: src
E0926 12:32:22.114270       1 uploadproxy.go:211] Error connecting to https://cdi-upload-upload-datavolume.src.svc/v1alpha1/upload: timed out 5 times connecting to https://cdi-upload-upload-datavolume.src.svc/v1alpha1/upload
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).testConnect
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:274
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).proxyUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:210
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:188
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest-fm
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:140
net/http.HandlerFunc.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:1964
net/http.(*ServeMux).ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2361
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).ServeHTTP
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:144
net/http.serverHandler.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2741
net/http.(*conn).serve
	/usr/lib/golang/src/net/http/server.go:1847
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1333
I0926 12:33:54.625050       1 uploadproxy.go:179] Received valid token: pvc: upload-datavolume-file, namespace: src
E0926 12:34:03.695972       1 uploadproxy.go:211] Error connecting to https://cdi-upload-upload-datavolume-file.src.svc/v1alpha1/upload: timed out 5 times connecting to https://cdi-upload-upload-datavolume-file.src.svc/v1alpha1/upload
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).testConnect
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:274
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).proxyUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:210
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:188
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest-fm
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:140
net/http.HandlerFunc.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:1964
net/http.(*ServeMux).ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2361
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).ServeHTTP
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:144
net/http.serverHandler.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2741
net/http.(*conn).serve
	/usr/lib/golang/src/net/http/server.go:1847
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1333
I0926 13:09:49.192674       1 uploadproxy.go:179] Received valid token: pvc: upload-datavolume-file, namespace: src
E0926 13:09:58.248112       1 uploadproxy.go:211] Error connecting to https://cdi-upload-upload-datavolume-file.src.svc/v1alpha1/upload: timed out 5 times connecting to https://cdi-upload-upload-datavolume-file.src.svc/v1alpha1/upload
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).testConnect
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:274
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).proxyUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:210
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:188
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest-fm
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:140
net/http.HandlerFunc.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:1964
net/http.(*ServeMux).ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2361
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).ServeHTTP
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:144
net/http.serverHandler.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2741
net/http.(*conn).serve
	/usr/lib/golang/src/net/http/server.go:1847
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1333
I0926 13:11:57.457918       1 uploadproxy.go:179] Received valid token: pvc: upload-datavolume, namespace: src
E0926 13:12:06.504110       1 uploadproxy.go:211] Error connecting to https://cdi-upload-upload-datavolume.src.svc/v1alpha1/upload: timed out 5 times connecting to https://cdi-upload-upload-datavolume.src.svc/v1alpha1/upload
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).testConnect
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:274
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).proxyUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:210
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:188
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).handleUploadRequest-fm
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:140
net/http.HandlerFunc.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:1964
net/http.(*ServeMux).ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2361
kubevirt.io/containerized-data-importer/pkg/uploadproxy.(*uploadProxyApp).ServeHTTP
	/go/src/kubevirt.io/containerized-data-importer/pkg/uploadproxy/uploadproxy.go:144
net/http.serverHandler.ServeHTTP
	/usr/lib/golang/src/net/http/server.go:2741
net/http.(*conn).serve
	/usr/lib/golang/src/net/http/server.go:1847
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1333
error: unexpected EOF

Comment 10 Natalie Gavrielov 2019-10-30 14:15:44 UTC
Try to reproduce it with a BM

Comment 11 Natalie Gavrielov 2019-11-28 15:57:27 UTC
1. I wasn't able to get a real BM, so I got a libvirt env, the upload there failed with: "timed out waiting for the condition", upload pod was in ContainerCreating (I suspect rook isn't stable on this env, will try to investigate).

2. I tried testing again on PSI, and when using small qcow2 files (hundreds of MB) - the upload works fine (no errors)

3. When I used a larger image: 6.24 GiB, it uploaded 100% of the file and then issued: Post https://cdi-uploadproxy-openshift-cnv.apps.working.oc4/v1alpha1/upload: EOF.
   In the pod logs it seems to happen when "Validating image", after waiting patiently for another 10 minutes the upload completed and the scratch pvc was removed:

$ virtctl image-upload  --pvc-name=test-pvc --pvc-size=36Gi --image-path=win_16.qcow2 --insecure --block-volume
PVC default/test-pvc created
Waiting for PVC test-pvc upload pod to be running...
Pod now running
Uploading data to https://cdi-uploadproxy-openshift-cnv.apps.working.oc4

 6.24 GiB / 6.24 GiB [=============================================================================================================================================] 100.00% 12m18s

Post https://cdi-uploadproxy-openshift-cnv.apps.working.oc4/v1alpha1/upload: EOF
upload pod log:
$ oc logs -f cdi-upload-test-pvc
I1128 15:13:35.995401       1 uploadserver.go:61] Upload destination: /dev/cdi-block-volume
I1128 15:13:35.996194       1 uploadserver.go:63] Running server on 0.0.0.0:8443
2019/11/28 15:13:43 http: TLS handshake error from 10.131.0.32:54908: EOF
I1128 15:13:43.304401       1 uploadserver.go:244] Content type header is ""
I1128 15:13:43.304451       1 data-processor.go:252] Calculating available size
I1128 15:13:43.324150       1 data-processor.go:256] Checking out block volume size.
I1128 15:13:43.327674       1 data-processor.go:264] Request image size not empty.
I1128 15:13:43.327718       1 data-processor.go:269] Target size 36Gi.
I1128 15:13:43.327962       1 util.go:37] deleting file: /scratch/lost+found
I1128 15:13:43.330062       1 data-processor.go:182] New phase: TransferScratch
I1128 15:13:43.333064       1 util.go:169] Writing data...
I1128 15:25:30.028058       1 data-processor.go:182] New phase: Process
I1128 15:25:30.034149       1 data-processor.go:182] New phase: Convert
I1128 15:25:30.034166       1 data-processor.go:188] Validating image
I1128 15:35:08.057641       1 data-processor.go:182] New phase: Resize
I1128 15:35:08.117081       1 data-processor.go:182] New phase: Complete
I1128 15:35:08.197618       1 util.go:37] deleting file: /scratch/tmpimage
I1128 15:35:08.481357       1 uploadserver.go:263] Wrote data to /dev/cdi-block-volume
I1128 15:35:08.481851       1 uploadserver.go:151] Shutting down http server after successful upload
I1128 15:35:08.982421       1 uploadserver.go:71] UploadServer successfully exited 


@Michael, is it possible for virtctl image-upload to wait until the UploadServer exits?

Comment 12 Michael Henriksen 2019-12-02 14:25:24 UTC
In normal cases (no EOF), virtctl image-upload will complete right before the UploadServer pod exits.  Although possible, I don't think it would be wise to ignore errors such as this and instead wait for the UploadServer pod to exit.  Apparently, this error was benign, but most probably are not.  Then the program will be waiting forever.

Comment 13 Michael Henriksen 2019-12-02 14:26:25 UTC
Approximately how long was "virtctl image-upload" running before the EOF?

Comment 14 Natalie Gavrielov 2019-12-04 11:38:03 UTC
Michael, below the progress bar, it shows 12m18s - so let's add a few more seconds for creating the pvc, in total it will be around 12.5-13 minutes.

Comment 15 Adam Litke 2019-12-10 20:56:09 UTC
@michael, any updates?

Comment 16 Michael Henriksen 2019-12-17 02:28:32 UTC
The issue from https://bugzilla.redhat.com/show_bug.cgi?id=1754920#c5 was addressed in the following PR:

https://github.com/kubevirt/containerized-data-importer/pull/996

What's described in https://bugzilla.redhat.com/show_bug.cgi?id=1754920#c11 is different and I'm not sure what is going on.  Should we make a separate issue for it?

Comment 17 Adam Litke 2019-12-18 14:09:50 UTC
Adam: In response to c16, let's split the issue reported in c11 into a new bug and resolve this one according to the status of PR#996.

Comment 18 Natalie Gavrielov 2020-01-01 18:09:04 UTC
I used the latest CDI 2.2 build, and tried to upload a large image (10G).
The first try resulted in:

$ virtctl image-upload  --pvc-name=test-pvc --pvc-size=36Gi --image-path=win_16.qcow2 --insecure --block-volume
PVC default/test-pvc created
Waiting for PVC test-pvc upload pod to be running...
timed out waiting for the condition

$ oc get pvc
NAME               STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
test-pvc           Bound    pvc-51daa9e8-29d1-4657-b261-307692667513   36Gi       RWO            rook-ceph-block   27m
test-pvc-scratch   Bound    pvc-773b3a70-74d9-4b00-9cf9-45d746a67fe3   36Gi       RWO            rook-ceph-block   27m


The second attempt (didn't change a thing!) resulted with:
$ virtctl image-upload  --pvc-name=test-pvc --pvc-size=36Gi --image-path=win_16.qcow2 --insecure --block-volume
PVC default/test-pvc created
Waiting for PVC test-pvc upload pod to be running...
Pod now running
Uploading data to https://cdi-uploadproxy-openshift-cnv.apps.working.oc4

10.33 GiB [=======================================================================================================================================================================================================================================] 100.00% 57m6s

Post https://cdi-uploadproxy-openshift-cnv.apps.working.oc4/v1alpha1/upload: EOF

So both issues still relevant

Comment 19 Michael Henriksen 2020-01-06 14:46:17 UTC
Natalie are you sure the first issue is still relevant?  Did you see the same stacktrace in uploadproxy log?

The output in comment 18 looks like the second issue.

Comment 21 Adam Litke 2020-01-08 14:00:23 UTC
There are a few different issues making this bug confusing so I am going to summarize our plan to resolve this:

There were two distinct cases mentioned in the original bug description:

* Timed out waiting for the condition:
- Natalie will split this into its own bug and it will be handled with lower priority in 2.3 because it does not reproduce regularly.

* EOF during conversion to RAW
- The upload eventually succeeds despite the confusing user experience (EOF error).
- I'm happy to hear that adjusting haproxy configuration works around the problem but I think it's too heavy-handed to ask CNV users to update their k8s configuration to work around this.
- For 2.2 we will address this with a release note
- For 2.3 we will change the upload state machine so that the conversion happens asynchronously after the client finishes uploading data

Comment 22 Michael Henriksen 2020-01-08 14:09:13 UTC
re: "I think it's too heavy-handed to ask CNV users to update their k8s configuration to work around this"

This is not a k8s configuration, it is a CNV QE test environment configuration.

Comment 23 Adam Litke 2020-01-08 18:00:58 UTC
(In reply to Michael Henriksen from comment #22)
> re: "I think it's too heavy-handed to ask CNV users to update their k8s
> configuration to work around this"
> 
> This is not a k8s configuration, it is a CNV QE test environment
> configuration.

How would a regular user avoid hitting this issue with very large images in their default OCP installation?

Comment 24 Natalie Gavrielov 2020-01-09 08:26:44 UTC
Opened bug #1789259 that addresses the "timed out waiting for the condition" issue

Comment 25 Adam Litke 2020-01-09 19:04:02 UTC
Docs team: Could we add a release note to 2.2 with text similar to the following:

```
When using 'virtctl image-upload' to upload certain large qcow2 VM disk images an EOF error may be reported after all data has been transmitted even though the upload is progressing normally or has completed.  The following command can be used to check the status of the upload on a given PVC:

oc describe pvc <pvc-name> | grep cdi.kubevirt.io/storage.pod.phase
```


The underlying issue will be fixed in 2.3 with an enhancement to the upload process.

Comment 29 Bob Gaydos 2020-01-24 18:22:39 UTC
Moving this to ON_QA.

See https://github.com/openshift/openshift-docs/pull/18680 for links to current test builds.

Comment 30 Bob Gaydos 2020-01-27 16:47:45 UTC
Removing completely from 2.2 Release Notes. It wasn't reported in 2.1 RN so there is no mention of the problem now in 2.2. Correct me if that's not right. 

Thanks,

Bob

Comment 31 Bob Gaydos 2020-01-27 17:31:59 UTC
If QE agrees with comment 30, can QE change state to VERIFIED? Thanks.

Bob

Comment 33 Bob Gaydos 2020-01-27 17:48:21 UTC
@nelly I think this is me confusing the "fixed in 2.2" as the real bug and not this bug.

Sorry, adding this content back for 2.2:

* When running `virtctl image-upload` to upload large VM disk images in `qcow2`
format, an end-of-file (EOF) error may be reported after the data is
transmitted, even though the upload is either progressing normally or completed.
(link:https://bugzilla.redhat.com/show_bug.cgi?id=1754920[*BZ#1754920*])
+
Run the following command to check the status of an upload on a given PVC:
+
----
$ oc describe pvc <pvc-name> | grep cdi.kubevirt.io/storage.pod.phase
----

Comment 34 Natalie Gavrielov 2020-01-30 09:41:33 UTC
Verified the section referring to this issue in https://github.com/openshift/openshift-docs/pull/18680/files

Comment 35 Dan Kenigsberg 2020-01-30 20:02:40 UTC
closing this bug manually; it should have been included in

https://access.redhat.com/errata/RHEA-2020:0307


Note You need to log in before you can comment on or make changes to this bug.