Bug 1715608

Summary: [downstream clone - 4.3.5] [RFE] Create a VDSM hook to pass host disks as disk devices (today we support passing them as SCSI-generic devices)
Product: Red Hat Enterprise Virtualization Manager
Reporter: RHV bug bot <rhv-bugzilla-bot>
Component: vdsm
Assignee: Milan Zamazal <mzamazal>
Status: CLOSED NEXTRELEASE
QA Contact: meital avital <mavital>
Severity: high
Docs Contact:
Priority: unspecified
Version: unspecified
CC: acarter, agk, bkunal, chlong, fromani, lsurette, mavital, michal.skrivanek, mkalinin, mzamazal, nsoffer, pbonzini, psuriset, rbarry, srevivo, ycui
Target Milestone: ovirt-4.3.5
Keywords: FutureFeature, Reopened, ZStream
Target Release: 4.3.5
Flags: lsvaty: testing_plan_complete-
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: vdsm-4.30.19
Doc Type: Enhancement
Doc Text:
The hostdev_scsi VDSM hook has been added to transform some SCSI host devices for better performance. See https://github.com/oVirt/vdsm/blob/master/vdsm_hooks/hostdev_scsi/README for more details.
Story Points: ---
Clone Of: 1470775
Environment:
Last Closed: 2019-06-11 09:57:11 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1470775    
Bug Blocks:    

Description RHV bug bot 2019-05-30 18:59:33 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1470775 +++
======================================================================

Description of problem:
(Forgive me for the unclear description, this is an ongoing work item, with some copy-paste from Paolo's email)

1. "The problem is due to RHV using /dev/sgN and the scsi-generic
QEMU device, instead of /dev/sdX and the scsi-block QEMU device." - can we have a hook that uses /dev/sdX ?

2. Ensure it is using 'aio=native'

3. Ensure it is using iothreads (this could be done via the UI with what we have today)


Chris - can you provide us with a log collector report attached to this bug, so we'll know what we have there?
Paolo - can you provide 'before' and 'after' XMLs (or the qemu command lines)?

(Originally by Yaniv Kaul)

Comment 1 RHV bug bot 2019-05-30 18:59:36 UTC
To start, we need:
- the engine configuration, to understand how the disks are configured at the engine
- the VM XML, to understand what libvirt is getting
- the qemu command line from /var/log/libvirt/qemu/vmname.log

(Originally by Nir Soffer)

Comment 2 RHV bug bot 2019-05-30 18:59:38 UTC
Current XML produced by vdsm uses scsi-generic:

    <hostdev mode='subsystem' type='scsi' managed='no' rawio='yes'>
      <source>
        <adapter name='scsi_host0'/>
        <address bus='0' target='6' unit='0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </hostdev>


1) Desired XML for scsi-block (faster/fixed SCSI passthrough):

    <disk type='block' device='lun' rawio='yes'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/sdd'/>
      <alias name='hostdev0'/>
      <target bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>


2) Desired XML for scsi-hd (SCSI, not passthrough): same as above, except 

    <disk type='block' device='lun' rawio='yes'>

becomes

    <disk type='block' device='disk'>



In a hook, to get the new source, you can pick the host source address from

      <source>
        <adapter name='scsi_host0'/>
        <address bus='0' target='6' unit='0'/>
      </source>

combine the host:bus:target:unit into 0:0:6:0, then do:

    $ ls /sys/bus/scsi/devices/0:0:6:0/block
    sdd

I guess a shell script would use something like "(cd /sys/bus/scsi/devices/0:0:6:0/block && echo sd*)".


This can be done through either a hook or RHV-M UI.
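
For illustration only, a minimal before_vm_start hook along these lines might look like the sketch below. It assumes VDSM's standard hooking helper module (read_domxml/write_domxml) and resolves the block node through the sysfs path shown above; it is not the hostdev_scsi hook that was eventually merged, and the element and attribute names simply mirror the XML snippets in this comment.

    #!/usr/bin/python
    # Illustrative before_vm_start hook sketch (not the shipped hostdev_scsi hook):
    # rewrite SCSI <hostdev> elements as <disk device='lun'> backed by /dev/sdX.
    import os

    import hooking  # VDSM hook helper: read_domxml()/write_domxml()


    def block_node(host, bus, target, unit):
        # /sys/bus/scsi/devices/H:B:T:U/block holds a single entry, e.g. "sdd"
        path = '/sys/bus/scsi/devices/%s:%s:%s:%s/block' % (host, bus, target, unit)
        return '/dev/' + os.listdir(path)[0]


    def main():
        domxml = hooking.read_domxml()
        for hostdev in domxml.getElementsByTagName('hostdev'):
            if hostdev.getAttribute('type') != 'scsi':
                continue
            source = hostdev.getElementsByTagName('source')[0]
            adapter = source.getElementsByTagName('adapter')[0]
            addr = source.getElementsByTagName('address')[0]
            host = adapter.getAttribute('name').replace('scsi_host', '')
            dev = block_node(host, addr.getAttribute('bus'),
                             addr.getAttribute('target'), addr.getAttribute('unit'))

            disk = domxml.createElement('disk')
            disk.setAttribute('type', 'block')
            disk.setAttribute('device', 'lun')   # use 'disk' for the scsi-hd variant
            disk.setAttribute('rawio', 'yes')    # drop rawio for the scsi-hd variant

            driver = domxml.createElement('driver')
            for key, value in (('name', 'qemu'), ('type', 'raw'),
                               ('cache', 'none'), ('io', 'native')):
                driver.setAttribute(key, value)
            disk.appendChild(driver)

            src = domxml.createElement('source')
            src.setAttribute('dev', dev)
            disk.appendChild(src)

            tgt = domxml.createElement('target')
            tgt.setAttribute('bus', 'scsi')
            disk.appendChild(tgt)

            # keep the original guest-side drive address
            for address in hostdev.getElementsByTagName('address'):
                if address.getAttribute('type') == 'drive':
                    disk.appendChild(address.cloneNode(True))

            hostdev.parentNode.replaceChild(disk, hostdev)
        hooking.write_domxml(domxml)


    if __name__ == '__main__':
        main()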

More food for thought and for separate BZs:

1) in the UI this could be made configurable among scsi-block (passthrough), scsi-hd (emulated), virtio-blk-pci, since each of them has its place.  Many users are going to use SCSI host devices for passthrough (e.g. persistent reservations), others only need pinning.

2) likewise, when passing through an NVMe PCI device we could offer the choice between PCI passthrough, scsi-hd (emulated SCSI device pinned to the host device), virtio-blk-pci (also pinned to the host device).

(Originally by Paolo Bonzini)

Comment 7 RHV bug bot 2019-05-30 18:59:47 UTC
> > - Disks defined as "host devices" should not use scsi-generic at all
>
> Are we sure there's no use case for it? TAPE devices or what not?

Tapes and media changers, yes.

> We do not plan (ATM) to further integrate it into the
> UI - as it's the first time we were asked about it.

Even though it's the first time that you heard about it, I would be surprised if no customer has ever wanted it.  Both AWS and GCE have been offering local SSDs (they are actually transient in their case) for years, and (even though that use case is closer to OpenStack) I expect that database users will want the same even for "pet" VMs.

> I'm wondering if our default should not be changed, though
> (from scsi-generic to something else). 

I think the default should be scsi-block, yes.

(Originally by Paolo Bonzini)

Comment 8 RHV bug bot 2019-05-30 18:59:48 UTC
It seems the GUI configuration for FC generates a scsi-block configuration, so hopefully no hook is needed for scsi-block.
scsi-hd would need a change, but I'm not sure it's worth it; perhaps we just want to hide/remove the scsi-generic hostdev option.

(Originally by michal.skrivanek)

Comment 9 RHV bug bot 2019-05-30 18:59:50 UTC
> the GUI configuration for FC is generating a scsi-block configuration, so
> hopefully no hook needed for scsi-block.

The FC GUI does not guarantee pinning to the host, does it?

The use case for storage hostdev should be pinning to the host.  The actual device shown to the guest can be configured on top; even if the default is passthrough for backwards compatibility reasons, it makes sense to allow emulated devices (scsi-hd, virtio-blk) for SCSI direct access or PCI NVMe hostdevs.

(Originally by Paolo Bonzini)

Comment 11 RHV bug bot 2019-05-30 18:59:54 UTC
Re-targeting to 4.3.1 since it is missing a patch, an acked blocker flag, or both

(Originally by Ryan Barry)

Comment 13 RHV bug bot 2019-05-30 18:59:57 UTC
This has been open for over a year without a clear use case. Closing. Please re-open if this changes.

(Originally by Ryan Barry)

Comment 14 Michal Skrivanek 2019-06-11 09:57:11 UTC
The hook itself works as designed; however, it doesn't work together with the rest of the system and as such is not really usable for the intended use case.
Closing the bug, as the hook made it in and will be in 4.3.5.
A complete solution will need one of: bug 1718852, bug 1718851, bug 1718818

Comment 15 Sandro Bonazzola 2019-06-11 15:35:56 UTC
(In reply to Michal Skrivanek from comment #14)
> The hook itself works as designed, however it doesn't work together with the
> rest of the system and as such is not really usable for the intended use
> case.
> Closing the bug, as the hook made it in and will be in 4.3.5. 
> A complete solution will need one of: bug 1718852, bug 1718851, bug 1718818

No need for QE?

Comment 16 Michal Skrivanek 2019-06-12 14:46:38 UTC
No reason to test a (currently) pointless hook. Once we get it working by solving one (or all) of those bugs above, the testing of this functionality should happen there. Especially since bug 1718818 means "no hook", there's not much point in doing QE right now...