+++ This bug was initially created as a clone of Bug #1527050 +++

In the current release (<=1.3.0), reads and writes are not concurrent.

Here is an example upload:

Client
------
$ ./upload_disk.py --direct /dev/shm/10G.raw
Uploaded 10.00g in 37.32 seconds (274.40m/s)

Server
------
Operation stats: <Clock(total=37.32, read=20.55, write=14.41, sync=0.00)>
write: 710 MiB/s
read:  498 MiB/s
total: 274 MiB/s

Upload throughput is limited by the naive read-write loop - when we read from
the socket, we don't write to storage, and when we write to storage, we don't
read data from the socket.

We have patches in review showing good improvement:
https://gerrit.ovirt.org/#/q/topic:cio+project:ovirt-imageio+is:open

Client
------
$ ./upload_disk.py --direct /dev/shm/10G.raw
Uploaded 10.00g in 23.29 seconds (439.73m/s)

Server
------
Operation stats: <Clock(total=23.28, read=23.21, write=14.56, sync=0.00)>
write: 703 MiB/s
read:  441 MiB/s
total: 439 MiB/s

Upload throughput is now limited by reading from the SSL socket. If we could
read faster from the SSL socket, we could upload at the same rate we can write
to storage.

We probably have the same problem when downloading images. We probably have
the same issue in the proxy.

This affects importing VMs (virt-v2v), backup and restore, and users uploading
and downloading images for other reasons.
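For illustration, the serial loop described above looks roughly like this (a
minimal sketch, not the actual imageio code; names are made up): while one
chunk is being written to storage, nothing is read from the socket, and vice
versa.

import os

CHUNK_SIZE = 8 * 1024 * 1024  # illustrative buffer size

def serial_receive(sock_file, dst_fd):
    # Copy data from the client socket to storage, one chunk at a time.
    while True:
        chunk = sock_file.read(CHUNK_SIZE)   # storage is idle during this read
        if not chunk:
            break
        os.write(dst_fd, chunk)              # socket is idle during this write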
We have patches for improving upload throughput which show 50-60% improvement
in the upload_disk.py example, however when using lots of small requests like
virt-v2v does, these changes do not help. These changes are also not compatible
yet with fast zero, which does give a great improvement to virt-v2v, so we will
need more work to support this.

To support concurrent I/O that will be useful to all use cases, we need to do
something like this (a rough sketch follows below):

1. Add the concept of a session - we need to be able to detect the start of an
   upload or download. Currently we have add ticket and remove ticket events,
   but since a ticket supports multiple clients, we cannot use it for sessions.
2. When a session starts, open the underlying file and start a worker thread
   for this session.
3. When the user sends a PUT or PATCH/zero request, submit the request to the
   worker thread queue.
4. If the request does not require flushing, or does not need to return data
   from the image, return an ack to the user so it can send the next request -
   while the worker is processing the request.
5. When the worker queue is full, the http thread should wait until there is
   room for handling new requests. This way we bound the concurrency at the
   maximum we want.

This is not an easy change, and a lot of work, so I think we should defer this
to 4.3. Supporting fast zero and NBD is much more important and is needed also
for incremental backup.
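The rough sketch mentioned above - a minimal illustration of the session/worker
idea, not the actual imageio code; all names are made up. The http thread
queues requests and can ack them immediately, and the bounded queue gives the
back-pressure described in step 5:

import queue
import threading

class Session:
    """Minimal sketch of a per-transfer session with one worker thread."""

    def __init__(self, path, max_queued=4):
        self.file = open(path, "r+b")
        # A full queue blocks the http thread (back-pressure, step 5).
        self.queue = queue.Queue(maxsize=max_queued)
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def submit(self, request):
        # Called from the http thread (steps 3-4); the caller can ack the
        # client as soon as the request is queued if no flush is needed.
        self.queue.put(request)

    def close(self):
        self.queue.put(None)
        self.worker.join()
        self.file.close()

    def _run(self):
        while True:
            request = self.queue.get()
            if request is None:  # shutdown marker
                break
            request.execute(self.file)  # write or zero the requested range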
Please provide validation instructions (e.g., do we need to run v2v, or would some upload command be enough?)
There is nothing to test yet, so no instructions. We will update the bug if and when we work on this.
Nir, the bug is in POST, meaning the implementation is finished. Should it be moved back to ASSIGNED?
POST means that some patches were submitted; it does not mean that development is finished. Since we need to rework the patches and we are not working on this now, moving to NEW.
This bug has not been marked as blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.
We have pending patches adding support for multiple connections:
https://gerrit.ovirt.org/c/105949/

I think this is the best (and only feasible) solution. virt-v2v should switch to use imageio client.upload(), which supports multiple connections.
I updated the old patches. I think this can be ready for 4.4.2.
The problem with older imageio is that it uses synchronous I/O and a single
thread per transfer. This limits the possible throughput when transferring a
single disk. When transferring multiple disks we get good overall throughput
even with the older imageio server.

We considered supporting pipelining in the server, which makes the I/O kind of
asynchronous. This is tricky to get right, typically not supported by http
clients (e.g. python http.client used in virt-v2v), and would not give enough
concurrency (see comment 1).

Instead we added support for multiple connections on the server side, and we
are using multiple connections on the client side to improve upload and
download throughput when transferring a single disk or a few disks.

With this change, the imageio client uses 4 connections per transfer, with 4
threads on both the client and server side to transfer image data (like the
"qemu-img convert -W" option). This makes a single transfer fast. (A rough
illustration of this idea appears after the test notes below.)

I tested this change on f17-h31-000-1029p.rdu2.scalelab - both single upload
and multiple uploads.

transfers   size        time           rate
-----------------------------------------------------
 1          100 GiB      89.7 seconds  1.18 GiB/s
10         1000 GiB     895.0 seconds  1.11 GiB/s

So with this change we upload and download a single image efficiently, and we
reach the maximum possible throughput with a single transfer.

Unfortunately virt-v2v does not use the imageio client for upload, and it
cannot use the client without a major rewrite of the rhv plugin, so this
change in imageio is not expected to improve anything in current virt-v2v.

However we made other changes in the imageio server during the 4.4 development
cycle, so virt-v2v imports should be faster with 4.4 even with the current
virt-v2v code. The most interesting change is fixing the way we manage
buffers; we have bug 1836858 for testing it. There were also some changes in
nbdkit that can improve throughput, like:
https://github.com/libguestfs/nbdkit/commit/39701410487c7e58c2829aeb44556f7dc9eacba9

I did a quick test of virt-v2v import from a local 100g disk with 48 GiB of
data:

for n in $(seq -w 10); do virt-v2v ... > v2v-10/v2v-$n.log 2>&1 & done

$ grep Finishing v2v-10/*.log
v2v-10/v2v.01.log:[ 895.3] Finishing off
v2v-10/v2v.02.log:[ 893.2] Finishing off
v2v-10/v2v.03.log:[ 892.4] Finishing off
v2v-10/v2v.04.log:[ 880.2] Finishing off
v2v-10/v2v.05.log:[ 882.4] Finishing off
v2v-10/v2v.06.log:[ 906.6] Finishing off
v2v-10/v2v.07.log:[ 927.1] Finishing off
v2v-10/v2v.08.log:[ 881.0] Finishing off
v2v-10/v2v.09.log:[ 896.4] Finishing off
v2v-10/v2v.10.log:[ 897.2] Finishing off

In this test we imported 1000 GiB of data in 927 seconds (1.07 GiB/s). This is
the total time, including creating the disk and vm in RHV and doing the
initial conversion.

If we look at a single virt-v2v log:

[ 0.7] Opening the source -i disk ./fedora-31-100g-50p.raw
[ 0.8] Creating an overlay to protect the source from being modified
[ 0.8] Opening the overlay
[ 4.2] Inspecting the overlay
[ 8.5] Checking for sufficient free disk space in the guest
[ 8.5] Estimating space required on target for each disk
[ 8.5] Converting Fedora 31 (Thirty One) to run on KVM
virt-v2v: warning: could not determine a way to update the configuration of Grub2
virt-v2v: warning: /files/boot/grub2/device.map/hd0 references unknown device "vda". You may have to fix this entry manually after conversion.
virt-v2v: This guest has virtio drivers installed.
[ 59.8] Mapping filesystem data to avoid copying unused and blank areas
[ 60.6] Closing the overlay
[ 60.7] Assigning disks to buses
[ 60.7] Checking if the guest needs BIOS or UEFI to boot
[ 60.7] Initializing the target -o rhv-upload -oa preallocated -oc https://rhev-red-03.rdu2.scalelab.redhat.com/ovirt-engine/api -op password -os L0_Group_0_SD
[ 62.0] Copying disk 1/1 to qemu URI json:{ "file.driver": "nbd", "file.path": "/var/tmp/rhvupload.4cWCCs/nbdkit0.sock", "file.export": "/" } (raw)
 (0.00/100%) (1.00/100%) ... (100.00/100%)
[ 904.8] Creating output metadata
[ 906.6] Finishing off

We can see that the upload started at 62.0 and ended at 904.8, so the upload
time was 842.8 seconds (121.49 MiB/s). This time includes the time to create a
disk. The actual transfer time can be extracted from the imageio connection
stats, but I have not checked this.

When I tested the same flow a few years ago I saw much lower total throughput
when using an image with 33% utilization:
https://bugzilla.redhat.com/show_bug.cgi?id=1615144#c3

Since we have bug 1836858 for testing virt-v2v, I think this bug should focus
on testing upload and download using the upload_disk.py and download_disk.py
scripts from ovirt-engine-sdk. These tests are mainly relevant to backup
vendors, which will use the same APIs.

I think we should test:

- images with 30%, 50%, 70% utilization, to test how image utilization affects
  throughput. If we want to limit the testing, 50% utilization looks like a
  good choice for standard tests.
- download single disk to qcow2 format
- upload single qcow2 image to qcow2 disk
- 10 concurrent downloads
- 10 concurrent uploads
- iSCSI and FC storage
- local and remote transfer
- proxy transfer (add --use-proxy in download/upload commands)

Notes:

- remote transfers use the management network, so we need to use a fast 10g
  network.
- testing downloads should use a fast NFS server over a 10g network. Testing
  download to a slow local disk does not test imageio but the local disk.
- the imageio client supports transfer from raw/qcow2 to raw/qcow2, so we have
  4 possible combinations (raw to raw, qcow2 to qcow2, raw to qcow2, qcow2 to
  raw). Since upload and download are important in the backup and restore
  context, and in this context qcow2 is the most useful format, I think we
  should focus only on the qcow2 format.
- downloading raw disks on block storage is not efficient since we don't have
  any information on image sparseness. We can test this as the worst case
  scenario.
- concurrent uploads are less important to test in the backup context, since
  we don't expect mass restore from backup. We do expect mass backup
  operations daily, so we should focus on download tests. Concurrent uploads
  are relevant to virt-v2v.
- we should test both iSCSI and FC storage, but if we want to limit testing,
  we can focus on FC storage since our most important users use FC and we want
  to make sure we have good performance on the best storage.
- proxy transfer is not efficient and not recommended, but it is possible that
  some users will have to use the proxy for backup. I think we need to test at
  least a single transfer with the proxy.
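The rough illustration referenced above - a minimal sketch of the
multi-connection idea, not the actual imageio client code. It assumes the
transfer URL accepts PUT with a Content-Range header (as imageio's random I/O
API does) and reads each range fully into memory, while the real client
streams data in smaller chunks:

import os
import ssl
import http.client
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

def upload_range(transfer_url, cafile, path, offset, length):
    # Send one range of the image over its own HTTPS connection.
    url = urlparse(transfer_url)
    context = ssl.create_default_context(cafile=cafile)
    conn = http.client.HTTPSConnection(url.netloc, context=context)
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    conn.request("PUT", url.path, body=data, headers={
        "Content-Range": f"bytes {offset}-{offset + len(data) - 1}/*",
    })
    status = conn.getresponse().status
    conn.close()
    return status

def upload(transfer_url, cafile, path, connections=4):
    # Split the image into one contiguous range per connection and upload
    # the ranges concurrently, like "qemu-img convert -W".
    size = os.path.getsize(path)
    step = max(1, (size + connections - 1) // connections)
    ranges = [(off, min(step, size - off)) for off in range(0, size, step)]
    with ThreadPoolExecutor(max_workers=connections) as pool:
        return list(pool.map(
            lambda r: upload_range(transfer_url, cafile, path, *r), ranges))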
How to test download:

$ /usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py \
    --engine-url https://my.engine/ \
    --username admin@internal \
    --password-file password \
    --cafile /etc/pki/vdsm/certs/cacert.pem \
    --format qcow2 \
    {disk-uuid} \
    /mnt/backup-media/{disk-uuid}.qcow2

How to test uploads:

$ /usr/share/doc/python3-ovirt-engine-sdk4/examples/upload_disk.py \
    --engine-url https://my.engine/ \
    --username admin@internal \
    --password-file password \
    --cafile /etc/pki/vdsm/certs/cacert.pem \
    --disk-format qcow2 \
    --disk-sparse \
    --sd-name {my.storage} \
    /mnt/backup-media/{disk-uuid}.qcow2
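One illustrative way to drive the "10 concurrent downloads" case is to run the
download_disk.py example once per disk from a small Python wrapper; the disk
UUIDs, credentials, and paths below are placeholders matching the commands
above:

import subprocess
from concurrent.futures import ThreadPoolExecutor

SCRIPT = "/usr/share/doc/python3-ovirt-engine-sdk4/examples/download_disk.py"
DISKS = ["disk-uuid-01", "disk-uuid-02"]  # placeholder UUIDs, one per disk

def download(disk_id):
    # Run one download_disk.py process per disk, as in the command above.
    return subprocess.run([
        SCRIPT,
        "--engine-url", "https://my.engine/",
        "--username", "admin@internal",
        "--password-file", "password",
        "--cafile", "/etc/pki/vdsm/certs/cacert.pem",
        "--format", "qcow2",
        disk_id,
        f"/mnt/backup-media/{disk_id}.qcow2",
    ], check=True)

# Up to 10 downloads run concurrently, matching the concurrent test case.
with ThreadPoolExecutor(max_workers=10) as executor:
    list(executor.map(download, DISKS))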
"imageio-client" is https://github.com/danielerez/imageio-client ? It seems as if it only supports uploads from a local file although it's a bit hard to tell from the source. What we would really like would be an NBD endpoint.
(In reply to Richard W.M. Jones from comment #13)
> "imageio-client" is https://github.com/danielerez/imageio-client ?

No, this is an early version of the java client used by engine to control the
ovirt-imageio service on the engine host.

> It seems as if it only supports uploads from a local file although
> it's a bit hard to tell from the source.

The client is here:
https://github.com/oVirt/ovirt-imageio/blob/master/daemon/ovirt_imageio/client/_api.py#L31

The upload function transfers the image, supporting any image format, using
multiple connections, and a unix socket if possible, using this pipeline:

    local image -> qemu-nbd -> imageio client -> imageio server

On the server side we have a similar pipeline (since oVirt 4.3):

    imageio-server -> qemu-nbd -> shared storage

This is only the transfer; for creating a disk you can use the upload_disk.py
module from ovirt-engine-sdk:
https://github.com/oVirt/ovirt-engine-sdk/blob/master/sdk/examples/upload_disk.py

Some stuff implemented in rhv-upload-plugin, like selecting a host, is still
missing, but we have patches to add that:
https://gerrit.ovirt.org/c/109609

We also have features that rhv-upload-plugin does not have, like transferring
via the proxy if the transfer_url is not available:
https://gerrit.ovirt.org/c/109305

It is currently significantly faster than "qemu-img convert" with block
storage because of bug 1847192 - see test results here:
https://github.com/oVirt/ovirt-imageio/commit/97f2e277458db579023ba54a4a4bd122b36f543e

To use this we need to rewrite the rhv-upload flow like this:

1. pre checks
2. run virt-v2v with --no-copy to only do the conversion step
3. run upload_disk.py with the overlay (see the sketch after this comment)
4. create the vm with the uploaded disks

But note that in my tests, while uploading a single disk is much slower with
current virt-v2v, uploading 10 disks concurrently is slightly faster. I think
this is because virt-v2v uses the imageio file backend, which is slightly
faster.

We will package upload_disk.py as a proper command line tool, see bug 1626262.

> What we would really like would be an NBD endpoint.

I agree, this should be available in 4.4.z. But it will require the same
rewrite of the plugin - separating the transfer part from the management part.
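A hypothetical sketch of step 3 of the proposed rhv-upload rewrite above,
assuming a disk was created and an image transfer was started via the engine
SDK first (as upload_disk.py does). "transfer" stands for the ImageTransfer
object returned by the engine API, the paths are placeholders, and the exact
upload() arguments may differ between imageio versions (see _api.py linked
above):

from ovirt_imageio import client

client.upload(
    "/var/tmp/v2v/overlay-sda.qcow2",  # placeholder: overlay from virt-v2v --no-copy
    transfer.transfer_url,             # placeholder: imageio URL from the engine API
    "/etc/pki/vdsm/certs/cacert.pem",  # CA cert used to verify the TLS connection
)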
Moving to 4.4.2 based on comment 15.
Mordechai please give your qa_ack, it's a scale bug. Thanks.
Since this was moved to 4.4.2 and we started testing it, I don't think the current needinfo requests are needed.
(In reply to Nir Soffer from comment #21)
> Since this was moved to 4.4.2 and we started testing it, I don't think the
> current needinfo requests are needed.

We had a request to push this forward for 4.4.1. We currently have a few cases remaining and will then update with results.
Tested on version:
vdsm-4.40.22-1.el8ev.x86_64
rhv-release-4.4.1-10-001.noarch

On rhev-red01 with 2 hosts, using a single iSCSI SD for uploading disks, and a
local NVMe device of 900GB for the download method.

Disk size 100GB, 66% disk utilization as used in v2v.
Two Supermicro 1029P hosts with 256GB RAM and 32 cores.

Network setup on the host was the following:
1 network on 1 GiB  - display and default route
1 network on 10 GiB - management
1 network on 10 GiB - migration and vm

The full report can be found here:
https://docs.google.com/spreadsheets/d/1QTHnuq5nxRdFLeBD-QN5JRh3s0BrpAZ2L-he4OrrIWE/edit?usp=sharing
BZ 1591439 summary test results

Environment & info:
Engine: rhev-red01
vdsm-4.40.22-1.el8ev.x86_64
rhv-release-4.4.1-10-001.noarch
ovirt-imageio-2.0.9-1
2 Supermicro model 1029P with 256GB RAM and 32 cores
Single iSCSI SD for uploading disks
NVMe device of 900GB for downloading
Disk size 100GB, 66% disk utilization (70GB)

Network setup on the host was the following:
1 network on 1 GiB  - display and default route
1 network on 10 GiB - management
1 network on 10 GiB - migration and vm

Main cases were local & remote upload and download for a single disk and for
10 disks in parallel, measuring throughput in real time using ibmonitor & the
API, and the total time duration for each test case.

Results:

Download local:
10 disks concurrent - (Duration 52m24.617s)  (Throughput: ibmonitor 258.08 MiB/s - API 3127.51 MiB/s)
1 disk              - (Duration 3m33.361s)   (Throughput: ibmonitor 377.05 MiB/s - API 200.36 MiB/s)

Download remote:
10 disks concurrent - (Duration 111m55.285s) (Throughput: ibmonitor ens2f3 (H-26) 117.58 MiB/s, ovirtmgmt (H-29) 136.42 MiB/s - API 6700.42 seconds, 15.28 MiB/s)
1 disk              - (Duration 10m23.070s)  (Throughput: ibmonitor ens2f3 (H-26) 120.83 MiB/s - API ovirtmgmt (H-29) 116.42 MiB/s)

Upload local:
10 disks concurrent - (Duration 16m30.925s)  (Throughput: ibmonitor 877.32 MiB/s - API 108.54 MiB/s)
1 disk              - (Duration 2m11.530s)   (Throughput: ibmonitor 757.297 MiB/s - API 976.64 MiB/s)

Upload remote:
1 disk              - (Duration 10m23.070s)  (Throughput: ibmonitor ens2f3 (H-26) 120.83 MiB/s, ovirtmgmt (H-29) 116.42 MiB/s - API 168.64 MiB/s)

Conclusions:
The local upload results were as expected. The local download tests performed
slower than expected and can likely be improved by running management and
default route both on a 10g interface.
(In reply to Tzahi Ashkenazi from comment #24) Tzahi, I want to inspect the imageio logs from these tests, but I cannot find the logs in the full report.
Hi Nir,
The full logs and files from the task can be found here:
https://drive.google.com/drive/folders/1qqyn7tYEvuHYra4_961_mPJEtvKfKjgP?usp=sharing

If you need any other logs please tell me and I can search for them on the
servers. The link above also contains the daemon logs that you requested in
our last sync.
Thanks Tzahi, I need access to this folder.
This bugzilla is included in oVirt 4.4.1 release, published on July 8th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.