Description of problem:
When importing a disk image in qcow2 format larger than 16 GB, the session is closed.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Import a qcow2 disk image larger than 16 GB; the session is closed.

Actual results:

Expected results:

Additional info:
Hi Vladimir, you are referring to disk upload, I assume? Can you please also attach the engine and VDSM logs?
Created attachment 1407138 [details] Upload1 Process upload
Created attachment 1407139 [details] Upload2 Process upload
Created attachment 1407140 [details] Upload3 process upload
Created attachment 1407141 [details] Upload4 Process upload
Created attachment 1407142 [details] Log
Software Version: 4.2.1.7-1.el7.centos

I am importing a disk image in qcow2 format which was converted from vmdk format (VirtualBox). As the attached images show, the upload reached 14 GB and then stopped.
Hi Vladimir,

According to the log [1], there was a network error while uploading. Is it consistently failing for this specific file? Can you please also attach image-proxy.log, daemon.log, ui.log, and, if available, the console log from the browser.

You can also try reducing the upload chunk size to accommodate a slower upload bandwidth, using:

# engine-config -s UploadImageChunkSizeKB=new_value

[1] 2018-03-12 14:16:45,779+04 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-45) [6bf2cb6a-5445-4b73-86a4-9d6729df6f9e] EVENT_ID: UPLOAD_IMAGE_NETWORK_ERROR(1,038), Unable to upload image to disk ef2b417e-fe1d-4618-9c36-a91ae428fb2a due to a network error. Make sure ovirt-imageio-proxy service is installed and configured, and ovirt-engine's certificate is registered as a valid CA in the browser
Created attachment 1407928 [details] OVIRT_LOG.rar
Hi.
Decreased 'UploadImageChunkSizeKB' to 1024. The upload reached 14400 MB and stopped (Paused by System). Log in the attachment (rar archive 'OVIRT_LOG.rar').
(In reply to Vladimir from comment #10)
> Hi.
> Decreased the 'UploadImageChunkSizeKB' to 1024.
> The download reached 14400 MB and stopped. (Paused by System)
> Log in the attachment.(rar archive 'OVIRT_LOG.rar')

It seems that the ticket validity expired after an hour [1]. The default value of ImageTransferClientTicketValidityInSeconds is 36000 seconds; try increasing its value (and restart the engine afterwards), e.g.:

# engine-config -s ImageTransferClientTicketValidityInSeconds=360000

[1]
(Thread-765) INFO 2018-03-14 14:09:51,102 web:102:web:(log_finish) FINISH [10.84.209.158] PUT /tickets/: [200] 0 (0.03s)
(Thread-766) INFO 2018-03-14 14:09:52,610 web:95:web:(log_start) START [10.84.209.198] OPTIONS /images/1018d76e-7a51-4462-80d6-b6648ce6fba2
...
(Thread-923) INFO 2018-03-14 15:10:14,885 web:95:web:(log_start) START [10.84.209.198] PUT /images/1018d76e-7a51-4462-80d6-b6648ce6fba2
(Thread-923) WARNING 2018-03-14 15:10:14,885 web:112:web:(log_error) ERROR [10.84.209.198] PUT /images/1018d76e-7a51-4462-80d6-b6648ce6fba2: [401] Not authorized (expired ticket) (0.00s)
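As back-of-the-envelope arithmetic (illustrative only, not oVirt code): if the ticket is never extended, an upload must finish within the validity window, so the question is simply whether size divided by sustained rate fits inside it. The rate value below is an assumption for illustration:

```python
# Will an upload of `size_gib` GiB finish within `validity_s` seconds
# at a sustained rate of `rate_mib_s` MiB/s, assuming the ticket is
# never extended during the transfer?
def fits_in_ticket(size_gib, rate_mib_s, validity_s):
    seconds_needed = size_gib * 1024 / rate_mib_s
    return seconds_needed <= validity_s

# A 70 GiB image at an assumed ~5 MiB/s needs about 4 hours: within
# the 36000-second (10-hour) default, but not within a 1-hour window.
print(fits_in_ticket(70, 5.0, 36000))  # True
print(fits_in_ticket(70, 5.0, 3600))   # False
```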
Hi,
The parameter "ImageTransferClientTicketValidityInSeconds" is not available in version 4.2.1.7-1. All the available parameters are in the attached file 'engine-config -a -g.txt'.

Even if the 'UploadImageXhrTimeoutInSeconds=360000' parameter is changed, the file is not uploaded to the end. File size is approx. 70 GB.
Created attachment 1408357 [details] engine-config -a -g.txt
Created attachment 1408358 [details] Log debug google chrome
(In reply to Vladimir from comment #12)
> Hi
> parameter "ImageTransferClientTicketValidityInSeconds" is not available in
> version 4.2.1.7-1

Right, it was exposed to engine-config in build 4.2.2. Can you please try to update? Alternatively, you can change it manually in the vdc_options table.

> All the parameters in the file 'engine-config -a -g.txt'
>
> If the 'UploadImageXhrTimeoutInSeconds = 360000' parameter is changed, the
> file is not downloaded until the end.
> file size approx. 70 GB
After updating to version "4.2.2.2-1.el7" the upload was successful. Thanks for the help.
(In reply to Vladimir from comment #16)
> After updating to the version "4.2.2.2-1.el7" the upload was successful
> Thanks for the help.

The current (version 4.2) value of ImageTransferClientTicketValidityInSeconds is 10 hours, which should be enough as a default. Changing this bug to an RFE for supporting ticket extension in imageio-proxy.
*** Bug 1540111 has been marked as a duplicate of this bug. ***
Target release should be set once a package build is known to fix an issue. Since this bug is not in MODIFIED, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.
This bug has not been marked as blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.
This bug didn't get any attention for a while; we didn't have the capacity to make any progress. If you care deeply about it or want to work on it, please assign/target accordingly.
OK, closing. Please reopen if it is still relevant or you want to work on it.
This is implemented in ovirt-imageio 2.0: we now use the daemon, which already supports ticket extension.
Nir, looking at 'ImageTransferClientTicketValidityInSeconds' on engine release rhv-4.4.1-2, I can see that it's set to only 300 instead of 10 hours as described here:

[root@storage-ge-05 ~]# engine-config -a | grep ImageTransferClientTicketValidityInSeconds
ImageTransferClientTicketValidityInSeconds: 300 version: general

Please relate to this. Also, I would like to know whether we should upload a large disk to see if it times out before the value of 'ImageTransferClientTicketValidityInSeconds' is reached.
The proxy did not have a mechanism for extending ticket lifetime, so we used a very large timeout (10 hours). The daemon extends tickets automatically by 300 seconds on every request, so a transfer should never time out as long as the client is active. The engine also extends a ticket based on the transfer inactivity_timeout; if a client does not send any request for inactivity_timeout seconds, the ticket will expire.

You can test this by uploading a large preallocated image.

Create a raw image:

$ dd if=/dev/zero bs=1M count=$((1024*32)) of=big.raw

Create a qcow2 image:

$ qemu-img convert -p -f raw -O qcow2 big.raw big.qcow2

Upload the disks using upload_disk.py or using the UI:

$ python3 upload_disk.py ... big.raw
$ python3 upload_disk.py ... big.qcow2

If uploads take less than 300 seconds, use bigger images.
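The extension behavior described above can be modeled in a few lines. This is my own sketch of the idea, not the actual imageio code: every request pushes the expiry another 300 seconds out, so an active client never expires, while an idle one eventually gets the 401 seen earlier in the thread.

```python
# Toy model of daemon-side ticket extension (an illustration, not the
# real ovirt-imageio implementation). Times are plain seconds.
EXTEND_S = 300

class Ticket:
    def __init__(self, now):
        # A fresh ticket is valid for one extension window.
        self.expires = now + EXTEND_S

    def on_request(self, now):
        # An expired ticket is rejected, like the "401 Not authorized
        # (expired ticket)" in the proxy log.
        if now >= self.expires:
            raise PermissionError("expired ticket")
        # Any successful request extends the ticket by 300 seconds.
        self.expires = now + EXTEND_S

t = Ticket(now=0)
t.on_request(now=250)  # active client: expiry moves to 550
t.on_request(now=500)  # still active: expiry moves to 800
```

An upload client that sends at least one request every few minutes therefore never hits the timeout, no matter how long the transfer takes.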
Nir, I followed your steps:
- Created a 30G file with dd.
- Uploaded it. It took more than 300 seconds; the upload was successful.
- Next, I converted the file to qcow2 with qemu-img just as you suggested. But when I tried to upload it to a file / block domain (I tried both), I got this exception:

Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None) is incompatible with the storage domain type

[root@storage-ge5-vdsm1 ~]# python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py big.qcow2 --sd-name=iscsi_0 --engine-url https://storage-ge-05xxx --username xxx -c /root/ca.pem
Checking image...
Image format: qcow2
Disk format: cow
Disk content type: data
Disk provisioned size: 32212254720
Disk initial size: 877658112
Disk name: big.qcow2
Connecting...
Password:
Creating disk...
Traceback (most recent call last):
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py", line 233, in <module>
    name=args.sd_name
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py", line 7697, in add
    return self._internal_add(disk, headers, query, wait)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 232, in _internal_add
    return future.wait() if wait else future
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 229, in callback
    self._check_fault(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None) is incompatible with the storage domain type.]". HTTP response code is 409
> "[Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None)

Indeed, looking here [1], the combination "COW Preallocated backup-None" is not supported on either block or file SDs. It is allowed only when incremental backup is enabled on the disk.

Nir, is there another way we can test this? Enabling incremental backup on the disk, perhaps?

[1] https://www.ovirt.org/documentation/incremental-backup-guide/incremental-backup-guide.html
(In reply to Avihai from comment #30)
> > "[Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None)
> Nir, is there another way we can test this? Enabling incremental backup on
> the disk, perhaps?

Yes, upload as sparse. In the next build you will be able to upload as cow-preallocated by specifying backup="incremental".
> Yes, upload as sparse.
>
> In the next build you will be able to upload as cow-preallocated by specifying
> backup="incremental".

python3 upload_disk.py ... big.raw
Uploaded it. It took more than 300 seconds; the upload was successful.

python3 upload_disk.py ... big.qcow2 --disk-sparse
I couldn't create a qcow2 disk big enough to take more than 300 seconds to upload. Even a disk converted from a 60GB raw disk takes only ~2 minutes to upload.

If you think this is not sufficient for verifying this, please tell me, and I can try to find a 4.4 host located in the TLV lab in order to get much lower upload speeds. Then it will surely take more than 5 minutes.
(In reply to Ilan Zuckerman from comment #32)
> python3 upload_disk.py ... big.qcow2 --disk-sparse
> I couldn't create a big enough qcow2 disk which would take more than 300 sec to
> upload. Even a converted disk from a 60GB raw disk takes only ~2 minutes to
> upload

The issue is that qemu does automatic zero detection now:

$ dd if=/dev/zero bs=1M count=$((6*1024)) of=/var/tmp/zeroes-6g.raw
$ qemu-img convert -p -f raw -O qcow2 /var/tmp/zeroes-6g.raw /var/tmp/zeroes-6g.qcow2
$ ls -lhs /var/tmp/zeroes-6g.*
129M -rw-r--r--. 1 nsoffer nsoffer 129M Jun  7 16:02 /var/tmp/zeroes-6g.qcow2
6.1G -rw-rw-r--. 1 nsoffer nsoffer 6.0G Jun  7 16:02 /var/tmp/zeroes-6g.raw

I'm not sure why qemu does not create a completely empty image if it does detect zeros; maybe this is a bug in qemu. Anyway, we upload only the data, so this is not a good test for a long upload.

We need to test again like this:

$ dd if=/dev/zero bs=1M count=$((6*1024)) | tr "\0" "\1" > /var/tmp/ones-6g.raw
$ qemu-img convert -p -f raw -O qcow2 /var/tmp/ones-6g.raw /var/tmp/ones-6g.qcow2
$ ls -lhs /var/tmp/ones-6g.*
6.1G -rw-r--r--. 1 nsoffer nsoffer 6.1G Jun  7 16:09 /var/tmp/ones-6g.qcow2
6.1G -rw-rw-r--. 1 nsoffer nsoffer 6.0G Jun  7 16:08 /var/tmp/ones-6g.raw

Now uploading the qcow2 image will be as slow as the raw image.
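The zero-detection idea can be sketched in a few lines. This is my own illustration, not qemu's actual algorithm: a converter that checks each block and keeps only non-zero ones writes almost nothing for an image made with dd from /dev/zero, which is why the all-zero qcow2 shrank while the tr '\0' '\1' variant keeps its full size.

```python
# Illustration of block-level zero detection (not qemu's real code):
# yield only the blocks that actually contain data.
def data_blocks(buf, block_size):
    """Yield (offset, block) for each non-zero block of `buf`."""
    zero = bytes(block_size)
    for off in range(0, len(buf), block_size):
        block = buf[off:off + block_size]
        # Compare against a zero block of the same length, so a short
        # final block is handled correctly.
        if block != zero[:len(block)]:
            yield off, block

# A 12 KiB "image": zeros, then 4 KiB of 0x01 bytes, then zeros.
image = bytes(4096) + b"\x01" * 4096 + bytes(4096)
kept = list(data_blocks(image, 4096))
print(len(kept), kept[0][0])  # 1 4096
```

Only one of the three blocks survives, so a zero-detecting converter stores (and a data-only upload transfers) a third of the image.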
(In reply to Ilan Zuckerman from comment #29)
> [root@storage-ge5-vdsm1 ~]# python3
> /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py big.qcow2
> --sd-name=iscsi_0 --engine-url https://storage-ge-05xxx --username xxx -c
> /root/ca.pem

This uploads directly to the host, but this bug is about the proxy. You need to use --use-proxy with upload_disk.py, or upload via the UI.
This bugzilla is included in oVirt 4.4.1 release, published on July 8th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.