Description of problem: Nir has added NBD support to oVirt/imageio. We should support this for -o rhv-upload as soon as possible, as it is likely to be better than the current HTTP path (and easier to maintain). For full details, see this email: https://www.redhat.com/archives/libguestfs/2018-December/msg00111.html
Martin, can we consider this for 4.3.0? 4.3.z?
Most of the work is already done on the RHV side as part of the incremental backup work. The missing parts are:

- Provide a protocol parameter to image transfer.
- Update the transfer flow to support the NBD-only protocol, mainly skipping steps that are not needed when using NBD, like installing tickets and monitoring progress.
This change requires:

- A small change in the oVirt API to enable NBD transfer. In this mode the engine transfer_url will be an NBD URL (e.g. "nbd:unix:/path/to/socket").
- Adapting engine monitoring logic for NBD transfer, where we don't have any information on progress. Basically this means that the engine will not pause the session when the client is not active, since client activity is not visible in this mode.
- A rewrite of the virt-v2v rhv-upload-plugin internals without changing the user-visible behavior.

So from the QE point of view this requires retesting of the virt-v2v flow to make sure there are no regressions. If QE has negative-flow tests for client inactivity, these tests may break when using the NBD protocol, since in this mode we have no visibility into client activity.
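For reference, the qemu-style transfer URL mentioned above has a simple shape; a minimal sketch of parsing it (a hypothetical helper, not part of the actual plugin or vdsm):

```python
# Minimal sketch of parsing the qemu-style NBD unix URL used as the
# engine transfer_url, e.g. "nbd:unix:/path/to/socket" or
# "nbd:unix:/path/to/socket:exportname=disk".
# Hypothetical helper; assumes the socket path contains no colon.

def parse_qemu_nbd_unix_url(url):
    prefix = "nbd:unix:"
    if not url.startswith(prefix):
        raise ValueError("not a qemu-style NBD unix URL: %r" % url)
    rest = url[len(prefix):]
    # An optional ":exportname=NAME" suffix follows the socket path.
    sock, sep, name = rest.partition(":exportname=")
    return sock, (name if sep else "")
```

Usage: parse_qemu_nbd_unix_url("nbd:unix:/path/to/socket:exportname=disk") returns ("/path/to/socket", "disk").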
(In reply to Nir Soffer from comment #6)
> This change requires:
>
> - Small change in oVirt API to enable NBD transfer. In this mode engine
>   transfer_url will be an NBD URL (e.g. "nbd:unix:/path/to/socket")

There's now an actual specification for NBD URIs, so it is best to use that rather than making up a new one or using the deprecated QEMU format:

https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md

> - Adapt engine monitoring logic for NBD transfer, where we don't have
>   any information on progress. Basically this means that engine will not
>   pause the session when client is not active since client activity is
>   not visible in this mode.

Can't progress be approximated by something like the highest byte offset written (not zeroed) by the client? It depends on whether you need only rough progress or something that is always perfectly accurate.

Activity in general is just whether or not the NBD client has sent anything recently.

(Of course, if the problem is that qemu-nbd isn't providing this information, then use nbdkit + the log filter, or a custom filter.)

http://libguestfs.org/nbdkit-log-filter.1.html
http://libguestfs.org/nbdkit-filter.3.html
(In reply to Richard W.M. Jones from comment #7)
> There's now an actual specification for NBD URIs, so best to use it
> rather than making up one or using the deprecated QEMU format:
>
> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md

The URL we use is supported by qemu-nbd, so it is valid even if it is non-standard; the qemu syntax predates the standard. Currently we can use only the qemu syntax, since imageio does not support the new syntax. I filed bug 1849091 for imageio, and bug 1849097 for vdsm, which generates these URLs.

I'm not sure when the new syntax will be supported. Vdsm can generate the new URL only when all hosts in the DC which may be selected for image transfer support the new syntax, meaning they are running an imageio version that understands it. Since we allow mixing different versions of vdsm and imageio on different hosts, I'm not sure we have a good way to switch to the new syntax before we introduce cluster compatibility version 4.5. Using the cluster compatibility version, we can ensure that all hosts are running a compatible vdsm version, which implies a compatible imageio version.

So most likely, if we introduce the NBD transport in 4.4.z, virt-v2v will have to use the existing nbd:unix:/socket:exportname=disk syntax.

> Can't progress be approximated by something like the highest byte offset
> written (not zeroed) by the client? It depends if you need only
> rough progress or something that is always perfectly accurate.
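For what it's worth, mapping the qemu syntax to the standardized NBD URI form is mechanical; a minimal sketch (hypothetical helper, assuming the URL shapes discussed above; neither vdsm nor imageio provides this today):

```python
# Sketch: translate the qemu-style URL to the standardized NBD URI form
# from the NBD URI spec, i.e. "nbd+unix:///EXPORT?socket=PATH".
# Hypothetical helper; assumes the socket path contains no colon.
from urllib.parse import quote

def qemu_to_nbd_uri(url):
    prefix = "nbd:unix:"
    if not url.startswith(prefix):
        raise ValueError("unsupported URL: %r" % url)
    rest = url[len(prefix):]
    # Split off the optional ":exportname=NAME" suffix.
    sock, sep, name = rest.partition(":exportname=")
    export = name if sep else ""
    # quote() keeps "/" unescaped by default, which is what we want
    # for the socket path.
    return "nbd+unix:///%s?socket=%s" % (quote(export), quote(sock))
```

For example, qemu_to_nbd_uri("nbd:unix:/path/to/socket:exportname=disk") returns "nbd+unix:///disk?socket=/path/to/socket".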
> Activity in general is just whether or not the NBD client has sent
> anything recently.
>
> (Of course if the problem is that qemu-nbd isn't providing this information,
> then use nbdkit + log filter, or a custom filter).
> http://libguestfs.org/nbdkit-log-filter.1.html
> http://libguestfs.org/nbdkit-filter.3.html

Progress is nice to have but not required. More important is being able to detect client activity, which means extending the ticket in imageio on every client request. If a client stops sending requests, the engine will pause the transfer after the inactivity timeout defined in the transfer. Without visibility into client activity, we will have to disable the inactivity timeout, or use a very high value, since we are not in the loop when transferring via NBD.

We cannot use nbdkit for two reasons:

- It does not support qcow2.
- It does not support direct I/O.

Not using direct I/O is very bad in oVirt, and can lead to sanlock renewal timeouts, which may lead to expiring leases and killing vdsm or VMs holding leases. For example, see:

https://bugzilla.redhat.com/show_bug.cgi?id=1247135#c30

This is the reason all oVirt disk operations use direct I/O. When we are converting images with qemu-img convert we always use "-t none -T none", we run qemu-nbd with "--cache=none", and the imageio file backend always uses direct I/O. Direct I/O + fsync() is also usually faster than buffered I/O and gives more predictable behavior.
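To illustrate, the direct I/O conventions described above correspond to command lines like the following sketch (image paths and socket path are placeholders):

```python
# Sketch of the direct I/O conventions described above; paths are
# placeholders. "-t none -T none" selects cache mode "none" (direct I/O)
# for the destination and source of qemu-img convert, and "--cache=none"
# does the same for qemu-nbd. The commands are only built here, not run.

src, dst, sock = "src.qcow2", "dst.qcow2", "/run/nbd.sock"

convert_cmd = [
    "qemu-img", "convert",
    "-t", "none",   # destination cache mode: none (direct I/O)
    "-T", "none",   # source cache mode: none (direct I/O)
    "-O", "qcow2",  # output format
    src, dst,
]

nbd_cmd = [
    "qemu-nbd",
    "--cache=none",        # direct I/O
    "--socket=" + sock,    # serve over a unix socket
    "--format=qcow2",      # do not probe the image format
    dst,
]
```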