Bug 1901325 - libvirt initial support for QSD (QEMU Storage Daemon) - TechPreview
Summary: libvirt initial support for QSD (QEMU Storage Daemon) - TechPreview
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.3
Assignee: Virtualization Maintenance
QA Contact: Meina Li
URL:
Whiteboard:
Depends On: 1901323
Blocks:
 
Reported: 2020-11-24 20:51 UTC by Ademar Reis
Modified: 2021-01-12 12:59 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-12 12:57:38 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:



Description Ademar Reis 2020-11-24 20:51:55 UTC
This is the BZ to track libvirt initial support for QSD.

== Description ==

QSD (QEMU Storage Daemon) splits QEMU's storage features out of QEMU into a standalone component. This component can provide features that are normally only available to VMs (through QEMU) to any other consumer, through interfaces such as filesystem mounts (NBD, FUSE), vhost-user-blk, etc.

With this, one can use qcow2, snapshots, bitmaps/incremental backup and all the other QEMU block features without a full QEMU process and without a running VM.

=== Acceptance Criteria ===

This is the first TechPreview of QSD. Users and developers should be able to enable it for experimentation and benchmarking, but it is to be considered experimental.

Required:
 * Running QSD alongside QEMU (start, stop - in sync is OK)
   * Requires routing QMP commands to the storage daemon
 * Consume QSD storage through vhost-user-blk in QEMU

*Not* required for this first tech preview:
 * Definitive APIs/XML or documentation
 * Handling local mounting of storage from QSD (nbd and/or FUSE)
 * Running QSD as a standalone process

Comment 1 Peter Krempa 2020-11-25 09:03:54 UTC
(In reply to Ademar Reis from comment #0)
> This is the BZ to track libvirt initial support for QSD.

Since we've already discussed a few use-cases and there are already feature requests, I want to clarify the extent of this RFE:

case 1: qemu-storage-daemon bound to the lifecycle of the VM

> Required:
>  * Running QSD alongside QEMU (start, stop - in sync is OK)
>    * Requires routing QMP commands to the storage daemon
>  * Consume QSD storage through vhost-user-blk in QEMU

So from this description I understand that this asks for a libvirt-managed instance of qemu-storage-daemon which would be fully integrated with the lifecycle and operations of the VM and thus opaque to the user (no "supported" communication with the qemu-storage-daemon itself).

case 1a: One storage-daemon per (configured) disk

User can decide that they want to use the qemu-storage-daemon for a specific disk via the <driver name=''> attribute. Storage for the selected disk will be handled via a separate instance of the qemu-storage-daemon.

    <disk type='file' device='disk'>
      <driver name='qemu-storage-daemon' type='qcow2'/>
      <source file='/path/to/img.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>

This is straightforward, but limited in configurability. Possibly hard to extend in the future. Large overhead. No need for new APIs.

case 1b: One storage-daemon per group of disks

User can declare a set of qemu-storage-daemons per VM (similarly to how we declare iothreads) and then assign them to individual disks. Declaring the set ahead of time allows e.g. pinning the CPUs of the qemu-storage-daemons and other handling.
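
Just as an illustrative sketch (none of these element names exist in libvirt; they simply mirror the <iothreads> pattern), the per-VM declaration could look something like:

    <storagedaemons>
      <!-- hypothetical element: declares two libvirt-managed
           qemu-storage-daemon instances for this VM; the id is what a
           disk's daemon-id attribute would refer to -->
      <storagedaemon id='1'/>
      <storagedaemon id='2'/>
    </storagedaemons>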

The configuration then assigns each disk to the qemu-storage-daemon that should handle it:

    <disk type='file' device='disk'>
      <driver name='qemu-storage-daemon' daemon-id="1" type='qcow2'/>
      <source file='/path/to/img.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>

This has greater flexibility but also complexity. Lifecycle of the qemu-storage-daemon is still tied to the lifecycle of the VM.

case 1c: One storage-daemon per VM

SSIA. This option is mentioned just for completeness. Otherwise it's basically just the monolithic qemu with extra steps which can break horribly.

Of all the case 1 subcases, 1b is the most useful and favourable one, but it also has the greatest complexity.


Now the other possible cases, which might impact this one and for which we already have RFEs:

case 2: qemu-storage-daemon managed separately as internals of a libvirt storage pool object

libvirt's storage driver can be extended to use the qemu-storage-daemon internally to transport storage to qemu. This is an evolution of case 1b above which additionally allows the qemu-storage-daemon instance to be shared between multiple VMs, for resource conservation or similar.

This is requested as:

RFE: qemu-storage-daemon volume XML
https://bugzilla.redhat.com/show_bug.cgi?id=1884667

Declaring a storage volume in the pool backed by the qemu-storage-daemon adds it to the exports, and we can then use an internal lookup mechanism to refer to it. New storage driver APIs will be needed for operations such as snapshot creation, merging, backups etc. Users will be able to use these APIs even if the VM is down, as the storage pool is a separate object here.
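
Purely as a strawman (no such volume schema exists yet; the <export> element below is invented here for illustration), a volume exported by the pool's qemu-storage-daemon could be described as:

    <volume>
      <name>data.qcow2</name>
      <capacity unit='G'>20</capacity>
      <target>
        <format type='qcow2'/>
      </target>
      <!-- hypothetical: ask the pool's qemu-storage-daemon instance to
           export this volume, e.g. over vhost-user-blk -->
      <export type='vhost-user-blk'/>
    </volume>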

A very strongly related use case for this is:

[RFE] Manage shared NVMe PCI adapters using qemu-storage-daemon
https://bugzilla.redhat.com/show_bug.cgi?id=1829865

This is a special case of the above where you declare volumes from an NVMe device and then share them with multiple VMs, which is otherwise impossible due to VFIO limitations.
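
For reference, the existing VFIO-backed NVMe disk configuration (PCI address made up for the example) looks roughly like this, and that exclusive assignment is exactly what prevents sharing:

    <disk type='nvme' device='disk'>
      <driver name='qemu' type='raw'/>
      <source type='pci' managed='yes' namespace='1'>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <target dev='vdb' bus='virtio'/>
    </disk>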

Conclusion: Case 2 is very universal but also involves a lot of work, comparable to or more than case 1b depending on the amount of technical debt.

> *Not* required for this first tech preview:
>  * Definitive APIs/XML or documentation
>  * Handling local mounting of storage from QSD (nbd and/or FUSE)

The above maps best to 'case 2'. Local mounting in any subcase of case 1 doesn't make sense, as the VM is actually using the block device.

>  * Running QSD as a standalone process

case 3:

Tracked separately as:

RFE: vhost-user-blk-pci device support
https://bugzilla.redhat.com/show_bug.cgi?id=1884659

Here libvirt can't provide any storage features. They need to go through the mechanism that manages the qemu storage daemon itself.
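
Just for illustration (this is a sketch, not a settled design; the socket path is made up), consuming an externally managed vhost-user-blk export could look along these lines:

    <disk type='vhostuser' device='disk'>
      <driver name='qemu' type='raw'/>
      <source type='unix' path='/tmp/vhost-user-blk.sock'>
        <reconnect enabled='yes' timeout='10'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>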

If external snapshots of a VM including its memory state are to be supported, we will require special sync points which give the management mechanism of the qemu-storage-daemon the chance to deal with the storage snapshot before continuing the execution of the VM, to ensure consistency of the snapshot. This is tracked as:

RFE: provide API which allows to take memory snapshot in sync with storage when storage is outsourced (e.g. using vhost-user-blk)
https://bugzilla.redhat.com/show_bug.cgi?id=1866400

Comment 2 Peter Krempa 2020-11-25 09:07:09 UTC
Ademar,
from your description it seems that you are asking for case 1, but we need to establish to what extent. Additionally, if we want to combine cases 1 and 2, which should be possible, it will require some thought as to how to interconnect them.

Case 3 is obviously separate and should be done regardless as exposing the vhost-user-blk interface is beneficial even without the qemu-storage-daemon.

