Bug 1554226 - [RFE] support ticket extension using imageio-proxy
Summary: [RFE] support ticket extension using imageio-proxy
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-imageio
Classification: oVirt
Component: Proxy
Version: 1.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.1
Target Release: ---
Assignee: Nir Soffer
QA Contact: Ilan Zuckerman
URL:
Whiteboard:
Duplicates: 1540111
Depends On: 1559472
Blocks:
Reported: 2018-03-12 06:32 UTC by Vladimir
Modified: 2020-07-08 08:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-08 08:26:41 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.3?
izuckerm: testing_plan_complete+
ylavi: planning_ack+
rule-engine: devel_ack?
aefrat: testing_ack+


Attachments
Upload1 (173.01 KB, image/png), 2018-03-12 10:23 UTC, Vladimir
Upload2 (166.49 KB, image/png), 2018-03-12 10:24 UTC, Vladimir
Upload3 (171.60 KB, image/png), 2018-03-12 10:24 UTC, Vladimir
Upload4 (169.84 KB, image/png), 2018-03-12 10:25 UTC, Vladimir
Log (13.66 KB, text/plain), 2018-03-12 10:26 UTC, Vladimir
OVIRT_LOG.rar (2.18 MB, application/x-rar), 2018-03-14 11:22 UTC, Vladimir
engine-config -a -g.txt (24.59 KB, text/plain), 2018-03-15 07:52 UTC, Vladimir
Log debug google chrome (3.73 KB, text/plain), 2018-03-15 07:53 UTC, Vladimir


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 89400 0 master ABANDONED engine: increase ImageTransferClientTicketValidityInSeconds value 2020-06-15 09:15:02 UTC

Description Vladimir 2018-03-12 06:32:05 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. When importing a qcow2 disk image larger than 16 GB, the session is closed.


Actual results:


Expected results:


Additional info:

Comment 1 Tal Nisan 2018-03-12 09:08:27 UTC
Hi Vladimir,
I assume you are referring to disk upload?
Also, can you please attach the Engine and VDSM logs?

Comment 2 Vladimir 2018-03-12 10:23:24 UTC
Created attachment 1407138 [details]
Upload1

Process upload

Comment 3 Vladimir 2018-03-12 10:24:22 UTC
Created attachment 1407139 [details]
Upload2

Process upload

Comment 4 Vladimir 2018-03-12 10:24:56 UTC
Created attachment 1407140 [details]
Upload3

process upload

Comment 5 Vladimir 2018-03-12 10:25:21 UTC
Created attachment 1407141 [details]
Upload4

Process upload

Comment 6 Vladimir 2018-03-12 10:26:17 UTC
Created attachment 1407142 [details]
Log

Comment 7 Vladimir 2018-03-12 10:29:42 UTC
Software Version:4.2.1.7-1.el7.centos

I'm importing a disk image in qcow2 format that was converted from vmdk format (VirtualBox).

In the attached images you can see that the upload reached 14 GB and then stopped.

Comment 8 Daniel Erez 2018-03-14 10:01:20 UTC
Hi Vladimir,

According to the log [1], there was a network error while uploading. Does it fail consistently for this specific file? Can you please also attach image-proxy.log, daemon.log, ui.log, and, if available, the console log from the browser? You can also try reducing the upload chunk size to accommodate a slower upload bandwidth, using: 'engine-config -s UploadImageChunkSizeKB=new_value'

[1]
2018-03-12 14:16:45,779+04 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-45) [6bf2cb6a-5445-4b73-86a4-9d6729df6f9e] EVENT_ID: UPLOAD_IMAGE_NETWORK_ERROR(1,038), Unable to upload image to disk ef2b417e-fe1d-4618-9c36-a91ae428fb2a due to a network error. Make sure ovirt-imageio-proxy service is installed and configured, and ovirt-engine's certificate is registered as a valid CA in the browser

Comment 9 Vladimir 2018-03-14 11:22:19 UTC
Created attachment 1407928 [details]
OVIRT_LOG.rar

Comment 10 Vladimir 2018-03-14 11:23:46 UTC
Hi.
Decreased the 'UploadImageChunkSizeKB' to 1024.
The download reached 14400 MB and stopped. (Paused by System)
Log in the attachment.(rar archive 'OVIRT_LOG.rar')

Comment 11 Daniel Erez 2018-03-14 13:54:10 UTC
(In reply to Vladimir from comment #10)
> Hi.
> Decreased the 'UploadImageChunkSizeKB' to 1024.
> The download reached 14400 MB and stopped. (Paused by System)
> Log in the attachment.(rar archive 'OVIRT_LOG.rar')

It seems that the ticket expired after an hour [1], since the default value of ImageTransferClientTicketValidityInSeconds is 3600 seconds.

Try increasing its value (and restart the engine afterwards), e.g.:
# engine-config -s ImageTransferClientTicketValidityInSeconds=360000


[1]
(Thread-765) INFO 2018-03-14 14:09:51,102 web:102:web:(log_finish) FINISH [10.84.209.158] PUT /tickets/: [200] 0 (0.03s)
(Thread-766) INFO 2018-03-14 14:09:52,610 web:95:web:(log_start) START [10.84.209.198] OPTIONS /images/1018d76e-7a51-4462-80d6-b6648ce6fba2
...
(Thread-923) INFO 2018-03-14 15:10:14,885 web:95:web:(log_start) START [10.84.209.198] PUT /images/1018d76e-7a51-4462-80d6-b6648ce6fba2
(Thread-923) WARNING 2018-03-14 15:10:14,885 web:112:web:(log_error) ERROR [10.84.209.198] PUT /images/1018d76e-7a51-4462-80d6-b6648ce6fba2: [401] Not authorized (expired ticket) (0.00s)
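
The gap between the two log entries can be checked directly. The following is only a sketch in Python: it assumes the ticket was installed by the PUT /tickets/ request shown above, and the one-hour validity is inferred from the timing in the log, not read from the engine configuration.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
ticket_added = datetime.strptime("2018-03-14 14:09:51,102", LOG_FORMAT)
request_rejected = datetime.strptime("2018-03-14 15:10:14,885", LOG_FORMAT)

# Seconds between installing the ticket and the first 401 response.
elapsed = (request_rejected - ticket_added).total_seconds()

# Assumption: one hour of validity, inferred from the log timing.
ASSUMED_VALIDITY_SECONDS = 3600

print(f"elapsed: {elapsed:.0f} seconds")
print("expired" if elapsed > ASSUMED_VALIDITY_SECONDS else "still valid")
```

The rejected request arrives a little over an hour after the ticket was added, which is consistent with a ticket that was never extended.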

Comment 12 Vladimir 2018-03-15 07:51:30 UTC
Hi
parameter "ImageTransferClientTicketValidityInSeconds" is not available in version 4.2.1.7-1

All the parameters in the file 'engine-config -a -g.txt'

If the 'UploadImageXhrTimeoutInSeconds = 360000' parameter is changed, the file is not downloaded until the end.
file size approx. 70 GB

Comment 13 Vladimir 2018-03-15 07:52:53 UTC
Created attachment 1408357 [details]
engine-config -a -g.txt

Comment 14 Vladimir 2018-03-15 07:53:25 UTC
Created attachment 1408358 [details]
Log debug google chrome

Comment 15 Daniel Erez 2018-03-15 07:59:23 UTC
(In reply to Vladimir from comment #12)
> Hi
> parameter "ImageTransferClientTicketValidityInSeconds" is not available in
> version 4.2.1.7-1

Right, it was exposed to engine-config in build 4.2.2.
Can you please try to update? Or, you can also change it manually in vdc_options table.

> 
> All the parameters in the file 'engine-config -a -g.txt'
> 
> If the 'UploadImageXhrTimeoutInSeconds = 360000' parameter is changed, the
> file is not downloaded until the end.
> file size approx. 70 GB

Comment 16 Vladimir 2018-03-23 07:27:49 UTC
After updating to the version "4.2.2.2-1.el7" the download was successful
Thanks for the help.

Comment 17 Daniel Erez 2018-03-25 08:40:50 UTC
(In reply to Vladimir from comment #16)
> After updating to the version "4.2.2.2-1.el7" the download was successful
> Thanks for the help.

The current (version 4.2) value of ImageTransferClientTicketValidityInSeconds is 10 hours, which should be enough as a default. Changing this bug to an RFE for supporting ticket extension in imageio-proxy.

Comment 18 Doron Fediuck 2018-05-29 13:51:20 UTC
*** Bug 1540111 has been marked as a duplicate of this bug. ***

Comment 19 Red Hat Bugzilla Rules Engine 2018-05-29 14:35:24 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 20 Red Hat Bugzilla Rules Engine 2018-05-29 14:35:24 UTC
This request has been proposed for two releases. This is invalid flag usage. The ovirt-future release flag has been cleared. If you wish to change the release flag, you must clear one release flag and then set the other release flag to ?.

Comment 21 Sandro Bonazzola 2019-01-28 09:43:42 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 22 Michal Skrivanek 2020-03-18 15:47:10 UTC
This bug didn't get any attention for a while, we didn't have the capacity to make any progress. If you deeply care about it or want to work on it please assign/target accordingly

Comment 23 Michal Skrivanek 2020-03-18 15:51:56 UTC
This bug didn't get any attention for a while, we didn't have the capacity to make any progress. If you deeply care about it or want to work on it please assign/target accordingly

Comment 24 Michal Skrivanek 2020-04-01 14:48:10 UTC
ok, closing. Please reopen if still relevant/you want to work on it.

Comment 25 Michal Skrivanek 2020-04-01 14:51:25 UTC
ok, closing. Please reopen if still relevant/you want to work on it.

Comment 26 Nir Soffer 2020-04-26 00:52:25 UTC
This is implemented in ovirt-imageio 2.0, since we use the daemon now, and it already
supported ticket extension.

Comment 27 Ilan Zuckerman 2020-06-01 07:56:12 UTC
Nir, looking at 'ImageTransferClientTicketValidityInSeconds' on engine release rhv-4.4.1-2,
I can see that it's set to only 300 instead of 10 hours as described here:

[root@storage-ge-05 ~]# engine-config -a | grep ImageTransferClientTicketValidityInSeconds
ImageTransferClientTicketValidityInSeconds: 300 version: general

Please comment on this. I would also like to know whether we should upload a large disk to see if the transfer times out before the value of 'ImageTransferClientTicketValidityInSeconds' is reached?

Comment 28 Nir Soffer 2020-06-03 12:30:13 UTC
The proxy did not have a mechanism for extending ticket lifetime, so we used
a very large timeout (10 hours). The daemon extends tickets automatically by
300 seconds on every request, so transfer should never time out if the client
is active.

Engine is also extending a ticket based on transfer inactivity_timeout. If
a client does not send any request after inactivity_timeout seconds the
ticket will expire.
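
The behavior described above can be illustrated with a toy model. This is only a sketch, not the ovirt-imageio implementation: the `Ticket` class and `simulate` function are hypothetical names, and it assumes that each accepted request resets the expiry to 300 seconds after that request.

```python
class Ticket:
    """Toy model of a transfer ticket; not the ovirt-imageio code."""

    def __init__(self, timeout, now):
        self.timeout = timeout
        self.expires = now + timeout

    def expired(self, now):
        return now >= self.expires

    def touch(self, now):
        """Handle a client request: reject if expired, otherwise extend."""
        if self.expired(now):
            raise PermissionError("expired ticket")
        # Assumption: extending means "valid for timeout seconds from now".
        self.expires = now + self.timeout


def simulate(request_times, timeout=300):
    """Return the time of the first rejected request, or None if all pass."""
    ticket = Ticket(timeout, now=0)
    for t in request_times:
        try:
            ticket.touch(t)
        except PermissionError:
            return t
    return None


# An active client sending a request every 60 seconds for 10 hours.
active = range(60, 36000 + 1, 60)
print(simulate(active))

# A client that sends requests for 10 minutes, then goes silent for 400 seconds.
stalled = [60 * i for i in range(1, 11)] + [1000, 1060]
print(simulate(stalled))
```

In this model the active client is never rejected, however long the transfer runs, while the stalled client is rejected on its first request after the silent period.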

You can test this by uploading large preallocated image:

    $ dd if=/dev/zero bs=1M count=$((1024*32)) of=big.raw

Create a qcow2 image:

    $ qemu-img convert -p -f raw -O qcow2 big.raw big.qcow2

Upload the disks using upload_disk.py or using the UI:

    $ python3 upload_disk.py ... big.raw
    $ python3 upload_disk.py ... big.qcow2

If the uploads take less than 300 seconds, use bigger images.

Comment 29 Ilan Zuckerman 2020-06-04 06:08:13 UTC
Nir, I followed your steps.

- Created a 30G file with dd.
- Uploaded it. It took more than 300 seconds. Upload successful.
- Next, I converted the file to qcow2 with qemu-img, just as you suggested.
  But when I tried to upload it to a file / block domain (I tried both), I got this exception:

Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None) is incompatible with the storage domain type

[root@storage-ge5-vdsm1 ~]#  python3 /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py big.qcow2 --sd-name=iscsi_0 --engine-url https://storage-ge-05xxx --username xxx -c /root/ca.pem
Checking image...
Image format: qcow2
Disk format: cow
Disk content type: data
Disk provisioned size: 32212254720
Disk initial size: 877658112
Disk name: big.qcow2
Connecting...
Password: 
Creating disk...
Traceback (most recent call last):
  File "/usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py", line 233, in <module>
    name=args.sd_name
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py", line 7697, in add
    return self._internal_add(disk, headers, query, wait)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 232, in _internal_add
    return future.wait() if wait else future
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 229, in callback
    self._check_fault(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None) is incompatible with the storage domain type.]". HTTP response code is 409

Comment 30 Avihai 2020-06-04 07:20:34 UTC
> "[Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None)

Indeed, according to [1], the combination "COW Preallocated backup-None" is supported on neither block nor file SDs.
This combination is allowed only when incremental backup is enabled on the disk.

Nir, is there another way we can test this? Enabling incremental backup on the disk, perhaps?


[1] https://www.ovirt.org/documentation/incremental-backup-guide/incremental-backup-guide.html

Comment 31 Nir Soffer 2020-06-04 12:47:00 UTC
(In reply to Avihai from comment #30)
> > "[Cannot add Virtual Disk. Disk configuration (COW Preallocated backup-None)
> Nir, is there another way we can test this? Enabling incremental backup on
> the disk, perhaps?

Yes, upload as sparse.

In the next build you will be able to upload as cow-preallocated by specifying
backup="incremental".

Comment 32 Ilan Zuckerman 2020-06-07 07:37:09 UTC
 
> Yes, upload as sparse.
> 
> In the next build you will be able to upload as cow-preallocated by specifying
> backup="incremental".

python3 upload_disk.py ... big.raw
Uploaded it. It took more than 300 seconds. Upload successful.

python3 upload_disk.py ... big.qcow2 --disk-sparse
I couldn't create a qcow2 disk big enough to take more than 300 seconds to upload. Even a disk converted from a 60 GB raw disk takes only ~2 minutes to upload.

If you think this is not sufficient for verification, please tell me, and I can try to find a 4.4 host in the TLV lab to get much lower upload speeds. Then it will surely take more than 5 minutes...

Comment 33 Nir Soffer 2020-06-07 13:18:36 UTC
(In reply to Ilan Zuckerman from comment #32)
> python3 upload_disk.py ... big.qcow2 --disk-sparse
> I couldn't create a qcow2 disk big enough to take more than 300 seconds to
> upload. Even a disk converted from a 60 GB raw disk takes only ~2 minutes to
> upload.

The issue is qemu is doing automatic zero detection now:

$ dd if=/dev/zero bs=1M count=$((6*1024)) of=/var/tmp/zeroes-6g.raw

$ qemu-img convert -p -f raw -O qcow2 /var/tmp/zeroes-6g.raw /var/tmp/zeroes-6g.qcow2

$ ls -lhs /var/tmp/zeroes-6g.*
129M -rw-r--r--. 1 nsoffer nsoffer 129M Jun  7 16:02 /var/tmp/zeroes-6g.qcow2
6.1G -rw-rw-r--. 1 nsoffer nsoffer 6.0G Jun  7 16:02 /var/tmp/zeroes-6g.raw

I'm not sure why qemu does not create a completely empty image if it
does detect zeros, maybe this is a bug in qemu. Anyway we upload only
the data so this is not a good test for long upload.

We need to test again like this:

$ dd if=/dev/zero bs=1M count=$((6*1024)) | tr "\0" "\1" > /var/tmp/ones-6g.raw

$ qemu-img convert -p -f raw -O qcow2 /var/tmp/ones-6g.raw /var/tmp/ones-6g.qcow2

$ ls -lhs /var/tmp/ones-6g.*
6.1G -rw-r--r--. 1 nsoffer nsoffer 6.1G Jun  7 16:09 /var/tmp/ones-6g.qcow2
6.1G -rw-rw-r--. 1 nsoffer nsoffer 6.0G Jun  7 16:08 /var/tmp/ones-6g.raw

Now uploading the qcow2 image will be as slow as uploading the raw image.
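
Where the dd and tr pipeline is not convenient, the same kind of non-zero image can be written from Python. This is only a sketch: `write_ones` is a hypothetical helper, and the demo uses a small size and a temporary path so that it runs quickly. Pass 6 * 1024**3 as the size to match the commands above.

```python
import os
import tempfile


def write_ones(path, size, chunk_size=1024 * 1024):
    """Write size bytes of 0x01 to path, one chunk at a time."""
    chunk = b"\x01" * chunk_size
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(chunk[:n])
            remaining -= n


# Small demo size; every byte is non-zero, so zero detection cannot skip it.
demo_path = os.path.join(tempfile.gettempdir(), "ones-demo.raw")
write_ones(demo_path, 4 * 1024 * 1024)

with open(demo_path, "rb") as f:
    data = f.read()

print(len(data), set(data))
os.remove(demo_path)
```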

Comment 34 Nir Soffer 2020-06-07 13:37:05 UTC
(In reply to Ilan Zuckerman from comment #29)
> [root@storage-ge5-vdsm1 ~]#  python3
> /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py big.qcow2
> --sd-name=iscsi_0 --engine-url https://storage-ge-05xxx --username xxx -c
> /root/ca.pem

This uploads directly to host, but this bug is about the proxy.

You need to use --use-proxy in upload_disk.py, or upload via the UI.

Comment 37 Sandro Bonazzola 2020-07-08 08:26:41 UTC
This bugzilla is included in oVirt 4.4.1 release, published on July 8th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

