Bug 1404962 - [RFE] Add SCSI target passthrough to libvirt for NPIV adapter passthrough
Summary: [RFE] Add SCSI target passthrough to libvirt for NPIV adapter passthrough
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: John Ferlan
QA Contact: yisun
URL:
Whiteboard:
Depends On: 1404964
Blocks: 1349117 1404963
 
Reported: 2016-12-15 09:12 UTC by Martin Tessun
Modified: 2018-12-06 17:18 UTC
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-06 17:18:44 UTC
Target Upstream Version:
Embargoed:



Description Martin Tessun 2016-12-15 09:12:56 UTC
As described in https://bugzilla.redhat.com/show_bug.cgi?id=1349115#c10, target passthrough is needed to implement a virtual Fibre Channel adapter in QEMU.

* SCSI target passthrough.  This is libvirt only, and requires libvirt to convert nodedev devices + udev hot-add/hot-remove events to QMP commands.  This is independent of NPIV and a requirement for NPIV target passthrough.
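
For reference, the vHBA nodedev device mentioned above is typically created through libvirt's node device API; a minimal sketch, assuming the parent HBA is scsi_host4 (libvirt generates a WWNN/WWPN for the new vHBA if none are specified):

# cat vhba.xml
<device>
  <parent>scsi_host4</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
    </capability>
  </capability>
</device>

# virsh nodedev-create vhba.xml
# virsh nodedev-dumpxml scsi_hostN    (N being whatever host number the new vHBA was given)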

Comment 1 Paolo Bonzini 2016-12-16 14:26:22 UTC
This could also be split between cold-plug and hot-plug. Cold-plug only needs access to the nodedev database at the time of VM creation, while hot-plug is more complex and maybe we don't need it.

Comment 2 Martin Tessun 2016-12-16 14:34:37 UTC
(In reply to Paolo Bonzini from comment #1)
> This could also be split between cold-plug and hot-plug. Cold-plug only
> needs access to the nodedev database at the time of VM creation, while
> hot-plug is more complex and maybe we don't need it.

Hot-plug is one of the main features here: when new devices get attached to the vPort, they should immediately show up in the Guest OS (VM) as well.
So I am sure we need the hot-plug part, but if we can't get this done in the RHEL 7.4 timeframe, I think starting with cold-plug first and adding hot-plug in the next release is a better approach than skipping this completely.

Comment 3 John Ferlan 2016-12-16 18:51:29 UTC
It's not clear from the description what is being passed through. This perhaps comes down to a terminology thing. Is the "target" a vHBA LUN or the vHBA itself?

To decompose things a bit: "...requires libvirt to convert nodedev devices + udev hot-add/hot-remove events to QMP commands."... The udev events are essentially what creates the vHBA nodedev device via the storage pool startup processing, which "ties" the storage pool to "some" vHBA scsi_hostX nodedev device. Once the scsi_hostX is created, the storage pool code will search for vHBA "targets" in the provided storage pool target path. If a new LUN is presented on a vHBA, it is not automagically added to the storage pool - a storage pool refresh would be required. The generation of QMP commands is how vHBA LUNs are added via domain <disk> XML (either cold-plug or hot-plug).
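
As a small illustration of that refresh requirement (a sketch, assuming the vHBA-backed pool is the vhba4_pool shown further below): a newly presented LUN only appears as an additional unit:B:T:U volume after the pool is refreshed.

# virsh pool-refresh vhba4_pool
# virsh vol-list vhba4_pool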

Where things are less clear though is what exactly is meant by "...NPIV target passthrough."  This seems, in a way, to mean taking one of those LUNs and generating <hostdev> XML using the 'scsi_hostX' and the LUN, but I'm not quite sure. If it is, then I'm not convinced that's a good way to go.

As an aside, if a new vHBA LUN were added, the nodedev list would be updated with a new "scsi_target_*" entry corresponding to the existing scsi_hostX (e.g. scsi_target23_0_4), so having that event somehow morph into adding the new 'scsi_host23' LUN to a domain would seem feasible.

Currently for vHBA/NPIV, libvirt provides a vHBA LUN as a 'disk' or 'lun' to the guest using domain <disk> XML that provides the source pool and volume of the vHBA LUN. It seems that what is wanted is a <hostdev> for a vHBA LUN, but I want to be sure I'm reading this right.

Perhaps it's easier 'visually'... In libvirt terminology, the desire is for something like this:

<devices>
...
  <hostdev mode='subsystem' type='scsi' [sgio='filtered'] [rawio='yes']>
    <source>
      <adapter name='scsi_hostX'/>
      <address bus='B' target='T' unit='U'/>
    </source>
  </hostdev>
...
</devices>

where 'scsi_hostX' is the vHBA/NPIV adapter created from nodedev-create/storage pool startup running the vport_create command on the HBA. I'm assuming that the 'sgio' and 'rawio' properties still apply...
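
(For reference, outside of libvirt that vport_create step boils down to a sysfs write on the parent HBA; a sketch assuming host4 as the parent, placeholder values, and the usual WWPN:WWNN ordering:)

# echo '2101001b32a9da5e:2001001b32a9da5e' > /sys/class/fc_host/host4/vport_create
# echo '2101001b32a9da5e:2001001b32a9da5e' > /sys/class/fc_host/host4/vport_delete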

What's not clear from the description is whether this is passing through a specific LUN via <hostdev>, much like is done for iSCSI, or passing through a controller (or initiator); that is, what to use for the bus, target, and unit values.

From my system:

# virsh pool-dumpxml vhba4_pool
<pool type='scsi'>
  <name>vhba4_pool</name>
  <uuid>189c4056-d378-43da-8647-0a0b92d9536c</uuid>
  <capacity unit='bytes'>0</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>0</available>
  <source>
    <adapter type='fc_host' parent='scsi_host4' managed='yes' wwnn='2001001b32a9da5e' wwpn='2101001b32a9da5e'/>
  </source>
...

# virsh vol-list vhba4_pool
 Name                 Path                                    
------------------------------------------------------------------------------
 unit:0:4:0           /dev/disk/by-path/pci-0000:10:00.1-fc-0x5006016944602198-lun-0
 unit:0:5:0           /dev/disk/by-path/pci-0000:10:00.1-fc-0x5006016144602198-lun-0

where vhba4_pool is essentially created via steps from:

http://wiki.libvirt.org/page/NPIV_in_libvirt

FWIW: the <disk> examples on the wiki are close, but wrong... The RHEL docs are better:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-NPIV_storage.html
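
Roughly, such a pool is put in place with the following (file name illustrative; the pool XML is the one dumped above). Optionally 'virsh pool-autostart vhba4_pool' makes libvirt recreate the vHBA automatically on host/libvirtd restart.

# virsh pool-define vhba4_pool.xml
# virsh pool-start vhba4_pool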

Starting the pool creates 'scsi_host23' as the vHBA, which lsscsi shows as:

# lsscsi -lg 23
[23:0:0:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg85
  state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0 timeout=0
[23:0:1:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg86
  state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0 timeout=0
[23:0:2:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg87
  state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0 timeout=0
[23:0:3:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg88
  state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0 timeout=0
[23:0:4:0]   disk    DGC      LUNZ             0429  /dev/sdad  /dev/sg89
  state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0 timeout=30
[23:0:5:0]   disk    DGC      LUNZ             0429  /dev/sdae  /dev/sg90
  state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0 timeout=30


From the above the <disk> XML would be configured using either [unit:0:4:0] or [unit:0:5:0] (from the vol-list output which corresponds to 23:0:4:0 or 23:0:5:0 in lsscsi output).

If the goal is to add <hostdev> support, then the result would be:

  <hostdev mode='subsystem' type='scsi' [sgio='filtered'] [rawio='yes']>
    <source>
      <adapter name='scsi_host23'/>
      <address bus='0' target='4' unit='0'/>
    </source>
  </hostdev>

However, 'scsi_host23' can change between reboots, nodedev-create/nodedev-destroy, or storage pool stop/start cycles. So I guess I have some concern over doing this, as more recent vHBA/NPIV requests have been geared towards having a more consistent way to ensure that the pool startup can always find the "same" parent HBA (see bz 1349696 for some details).

Yes, obviously having hotplug support allays that concern, but it makes cold configuration quite tricky for the customer, since they'd have to change their domain XML whenever the vHBA scsi_hostX changes.

In any case, I would think a customer would use hotplug <disk> instead because in that model they can guarantee that the <source pool='$NAME' volume='unit:B:T:U'/> remains consistent.
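
A sketch of that model, reusing the vhba4_pool / unit:0:4:0 example from above (domain name and file name are placeholders); the pool/volume reference is the part that stays consistent:

# cat vhba-lun.xml
   <disk type='volume' device='lun'>
     <driver name='qemu' type='raw'/>
     <source pool='vhba4_pool' volume='unit:0:4:0'/>
     <target dev='sda' bus='scsi'/>
   </disk>

# virsh attach-device guestvm vhba-lun.xml --live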

While providing cold definition or hotplug addition of a vHBA LUN is possible, I'm not sure I understand the "value" or "usefulness". Nor is it clear how that leads to being able to provide (what I assume is) the initiator passthrough for bz 1340117. Passing through an initiator to me would be passing through the 'scsi_hostX' and allowing the guest to manage the LUNs.

Alternatively, is the desire to have some sort of domain <hostdev> XML that mimics what the storage pool is doing? That is, taking as input some parent scsi_host and creating the scsi_hostX "on the fly" in the same manner as the storage pool creates it, except for use by the domain instead. This scsi_hostX would be available for the lifetime of the domain/guest. This would mean domain hostdev XML that can handle the same source <adapter> XML as the storage pool can (see http://libvirt.org/formatstorage.html and search on fc_host for an example). What gets tricky here is the handling/addition of LUNs into the guest, as the nodedev events are not domain based, they're host based.

This would end up with XML such as:

  <hostdev mode='subsystem' type='scsi' sgio='filtered' rawio='yes'>
    <source>
      <adapter type='fc_host' parent='scsi_host4' wwnn='20000000c9831b4b' wwpn='10000000c9831b4b'/>
    </source>
...

where issues raised and XML added in bz 1349696 would also be relevant here.

What's not clear (yet) would be how to deal with the storage pool <target> elements in this case. That is of course the key - how to add those target elements to the guest. I think I've rambled on long enough already though.

Comment 4 Martin Tessun 2016-12-20 16:58:54 UTC
Hi John,

trying to explain this a bit from my PoV.
Paolo: Please add anything that needs to be added from your side. Also feel free to correct me, in case I am wrong.

(In reply to John Ferlan from comment #3)
> It's not clear from the description what is being passed through. This
> perhaps comes down to a terminology thing. Is the "target" a vHBA LUN or the
> vHBA itself?

So the idea with this directive is to tell libvirt to add all LUNs that are presented to the vHBA.

[snip to shorten things]
> Where things are less clear though is what exactly is meant by "...NPIV
> target passthrough."  This in a way seems to mean taking one of those LUN's
> and generating a <hostdev> XML using the 'scsi_hostX' and the LUN, but I'm
> not quite sure. If it is though, then I'm not convinced that's a good way to
> go.

The idea is to have a libvirt representation of the vHBA. A VM that has this vHBA attached should be presented with all the LUNs attached to the vHBA at startup.
Once the VM is running and a new device (maybe any SCSI device) is mapped to the vHBA, this device should also show up in the running VM, just as it would with a physical FC adapter.
The expected outcome is that the event "adding or removing a LUN from a FC adapter" is transparent to the VM, so the VM always gets presented the devices the vHBA (NPIV) currently "sees".

[snip]
> Currently for vHBA/NPIV, libvirt provides a vHBA LUN as a 'disk' or 'lun' to
> the guest using domain <disk> XML that provides the source pool and volume
> of the vHBA LUN. It seems that what is wanted is a <hostdev> for a vHBA LUN,
> but I want to be sure I'm reading this right.

Not exactly, if I understood your statement correctly. We want a "hostdev" directive that omits the unit in the address and instead forwards all the units (BTW: always unfiltered for this use case) to the VM.

> Perhaps it's easier 'visually'... In libvirt terminology, the desire is for
> something like this:
> 
> <devices>
> ...
>   <hostdev mode='subsystem' type='scsi' [sgio='filtered'] [rawio='yes']>
>     <source>
>       <adapter name='scsi_hostX'/>
>       <address bus='B' target='T' unit='U'/>

should be:
       <address bus='B' target='T' unit='U'/>
or even just identified as WWPN/WWNN from the NPIV adapter, up to you.

>     </source>
>   </hostdev>
> ...
> </devices>
> 
> where 'scsi_hostX' is the vHBA/NPIV adapter created from
> nodedev-create/storage pool startup running the vport_create command on the
> HBA. I'm assuming that the 'sgio' and 'rawio' properties still apply...

sgio should be unfiltered in this usecase, but having the choice is always good ;)

> What's not clear from the description is whether this is passing through a
> specific LUN via <hostdev> much like is done for iSCSI or is this
> passthrough a controller (or initiator)?  That is what to use for bus,
> target, unit values.

Did my above statements clear this up?

> From my system:
[snip]
>  Name                 Path                                    
> -----------------------------------------------------------------------------
> -
>  unit:0:4:0          
> /dev/disk/by-path/pci-0000:10:00.1-fc-0x5006016944602198-lun-0
>  unit:0:5:0          
> /dev/disk/by-path/pci-0000:10:00.1-fc-0x5006016144602198-lun-0
> 

So in terms of the current "pool" definition, the complete pool should be made visible to the VM, and in case there are device changes these need to be propagated to the VM as well, so that the VM has the "impression" that this vHBA is directly "attached" to it.

[snip]
> It creates 'scsi_host23' as the vHBA, which from lsscsi has:
> 
> # lsscsi -lg 23
> [23:0:0:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg85
>   state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0
> timeout=0
> [23:0:1:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg86
>   state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0
> timeout=0
> [23:0:2:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg87
>   state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0
> timeout=0
> [23:0:3:0]   enclosu HP       MSA2012fc        J202  -          /dev/sg88
>   state=running queue_depth=30 scsi_level=6 type=13 device_blocked=0
> timeout=0
> [23:0:4:0]   disk    DGC      LUNZ             0429  /dev/sdad  /dev/sg89
>   state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0
> timeout=30
> [23:0:5:0]   disk    DGC      LUNZ             0429  /dev/sdae  /dev/sg90
>   state=running queue_depth=30 scsi_level=5 type=0 device_blocked=0
> timeout=30
> 
> 
> From the above the <disk> XML would be configured using either [unit:0:4:0]
> or [unit:0:5:0] (from the vol-list output which corresponds to 23:0:4:0 or
> 23:0:5:0 in lsscsi output).
> 
> If the goal is to add <hostdev> support, then the result would be:
> 
>   <hostdev mode='subsystem' type='scsi' [sgio='filtered'] [rawio='yes']>
>     <source>
>       <adapter name='scsi_host23'/>
>       <address bus='0' target='4' unit='0'/>
>     </source>
>   </hostdev>

That would only be one device and not all of the devices presented to the vHBA. So ideally we would just leave the address specification out here, and libvirt takes care of "presenting" all the devices attached to that scsi_host23 to the VM (whereas, I think for consistency, we should reference scsi_host23 by WWNN/WWPN).

> However, 'scsi_host23' can change between reboots,

That's why I would prefer WWNN/WWPN naming for that.
So my "example" would look like the following:
   <hostdev mode='subsystem' type='scsi' [sgio='filtered'] [rawio='yes']>
     <source>
       <adapter wwnn='2001001b32a9da5e' wwpn='2101001b32a9da5e' />
     </source>
   </hostdev>

Maybe using a different tag than hostdev might be sensible as well.

> nodedev-create/nodedev-destroy, or storage pool stop/start cycles. So, I
> guess I have some concern over doing this as more recent vHBA/NPIV requests
> have been geared towards a desire to have more consistent way to ensure that
> the pool startup can always find the "same" parent HBA (see bz 1349696 for
> some details).

Indeed. Therefore I suggest using the WWNN/WWPN instead. Ideally this directive can also create the vHBA if it is not present; in that case we probably need to add a parent device section there as well.

> 
> Yes, obviously having hotplug support allays that concern, but it makes cold
> or configuration quite tricky for the customer since they'd have to change
> their domain XML when the vHBA scsi_hostX changes.

That is exactly what I want to avoid, thus not using the scsi_host as the identifier.

> 
> In any case, I would think a customer would use hotplug <disk> instead
> because in that model they can guarantee that the <source pool='$NAME'
> volume='unit:B:T:U'/> remains consistent.

The thing is that the user should not need to do the hotplug action, as it is taken care of by libvirt: the complete target (all devices attached to the vHBA) is presented to the VM.

[snip]
> Alternatively is the desire to have some sort of domain <hostdev> XML that
> mimics what the storage pool is doing. That is, taking as input some parent
> scsi_host and creating "on the fly" the scsi_hostX in the same manner as the
> storage pool creates it, except for usage by the domain instead. This
> scsi_hostX would be available for the lifetime of the domain/guest. This
> would mean domain hostdev XML that can handle the same source <adapter> XML
> as the storage pool can (see http://libvirt.org/formatstorage.html and
> search on fc_host for an example). What gets tricky here is the
> handling/addition of LUN's into the guest as the nodedev events are not
> domain based, they're host based.

Well, I'm not sure I understand you, but can you present a complete storage pool, including on-the-fly changes, to a VM directly? I think not.
I hope I made the requirement clearer now. I don't want to go too much into implementation details, as long as the requirements are met.

> 
> This would end up with XML such as:
> 
>   <hostdev mode='subsystem' type='scsi' sgio='filtered' rawio='yes'>
>     <source>
>       <adapter type='fc_host' parent='scsi_host4' wwnn='20000000c9831b4b'
> wwpn='10000000c9831b4b'/>
>     </source>

That looks pretty much like the one I proposed earlier, doesn't it?

> ...
> 
> where issues raised and XML added in bz 1349696 would also be relevant here.
> 
> What's not clear (yet) would be how to deal with the storage pool <target>
> elements in this case. That is of course the key - how to add those target
> elements to the guest. I think I've rambled on long enough already though.

Exactly. This is the key. We need to have libvirt present all the targets on the vHBA to the VM (and also the changes to these devices, in case some are added/removed).

Comment 5 Paolo Bonzini 2016-12-21 13:02:41 UTC
Again I agree with Martin here (though I'm not sure what the <address bus='B' target='T' unit='U'/> would be for).  The XML looks (not coincidentally) like the one in bug 1404963.

I'm still not sure on <hostdev> vs. <controller>, but I notice that vhost-scsi is using <hostdev>, so maybe the choice has already been made.
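
For comparison, the vhost-scsi <hostdev> mentioned here looks roughly like this (the naa. WWPN is a placeholder for a vhost-scsi target configured on the host):

  <hostdev mode='subsystem' type='scsi_host'>
    <source protocol='vhost' wwpn='naa.5123456789abcde0'/>
  </hostdev>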

Comment 6 John Ferlan 2016-12-21 15:26:13 UTC
Are things clearer? Not entirely, but I'll work through it. As they say, the devil is in the details.

Using <hostdev> would seem to imply (at least to me) that the vHBA scsi_hostX is being passed through, even though that's not what would happen... It gets an <address> element that wouldn't correspond to anything on the guest. Using <controller> doesn't feel right either, but it's still an important detail to work out early on.

If this used the vhost-scsi approach, perhaps the following would work:

  <hostdev mode='subsystem' type='scsi_host'>
    <source>
      <adapter protocol='vhba'
               {parent|parent wwnn/wwpn|parent_fabric_wwn}='...'
               [wwnn='%s' wwpn='%s']/>
      ...
    <address ...>
  </hostdev>

But I'm not sure what <address> would look like... I assume it'd be <address type='drive' controller='#'.../>, where controller='#' is some virtio-scsi controller and the bus, target, unit values mean nothing. If not provided, then the wwnn/wwpn would be generated similarly to how nodedev creation of a vHBA works.

Alternatively some new XML element such as <vhba>:

  <vhba model='virtio' [controller='#']>
    <source>
      <adapter .../>
    ...
  </vhba>

where there is no <address> associated. Whether model/controller would be necessary - not clear yet in my mind. Other new device types seem to use it and it ensures we don't have some future issue when/if the model default needs to change. If controller were supplied then a specific virtio-scsi controller would be used; otherwise, one would be created. Not clear (yet) whether/how other hypervisors could/would use this...

At least this way it's far clearer, both in the documentation and in the code, what the "rules" are for this specific type of device.

FWIW: The "<address bus='B' target='T' unit='U'/>" comes from the existing <hostdev> example for a scsi_host LUN passthru to the domain where:

    <hostdev mode='subsystem' type='scsi' managed='yes'>
      <source>
        <adapter name='scsi_host0'/>
        <address bus='0' target='0' unit='0'/>
      </source>
      <address type='drive' controller='0' bus='0' target='4' unit='8'/>
    </hostdev>

creates qemu command line:

-drive file=/dev/sg0,if=none,id=drive-hostdev0 
-device scsi-generic,bus=scsi0.0,channel=0,scsi-id=4,lun=8,drive=drive-hostdev0,id=hostdev0 

The vHBA equivalent would be having some sort of a storage pool vHBA and then allowing one to choose which LUN would be used. Considering the <disk> syntax:

   <disk type='volume' device='lun'>
     <driver name='qemu' type='raw'/>
     <source pool='poolvhba0' volume='unit:0:4:0'/>
     <target dev='sda' bus='scsi'/>
   </disk>

with the "equivalent" <hostdev> syntax of: 

  <hostdev mode='subsystem' type='scsi' managed='yes'>
    <source>
      <adapter name='scsi_host23'/>
      <address bus='0' target='4' unit='0'/>
    </source>
    <address type='drive' controller='0' bus='0' target='4' unit='8'/>
  </hostdev>

where scsi_host23 is the vHBA. I hadn't gone down the path of what the qemu command line would be (whether a /dev/sgN or /dev/sdN would be used).

Comment 7 Paolo Bonzini 2016-12-22 08:50:43 UTC
> The vHBA equivalent would be having some sort of a storage pool vHBA and then 
> allowing one to choose which LUN would be used.

It would not be one LUN.  *All* LUNs would be passed through; this is why I initially used <controller>.

Note that vhost-scsi also passes through many LUNs, not just one.  I doubt vhost-scsi is used much in the wild; perhaps the <hostdev> syntax for vhost-scsi should be deprecated?!?

Comment 8 John Ferlan 2016-12-22 11:24:45 UTC
Understood it's all LUNs, but they still get added one at a time. Even nodedev gets each scsi_host and scsi_target event one at a time.

Unless you're implying that if a vHBA was presented as the controller then qemu would do the LUN search ~/~

As for vhost-scsi - someone from IBM recently just added that to libvirt. 

I'm still not sure which to use, but perhaps it'll come to me once I'm working through the code. Each of the options has advantages and disadvantages.

Comment 9 Paolo Bonzini 2016-12-22 16:38:56 UTC
> Understood it's all LUNs, but they still get added one at a time.

Yes, but the point of this bug is removing the need to specify all of them in the domain XML.  Of course you may want to specify them in the internal XML for a running domain, but as far as the higher levels are concerned the vHBA element (whatever it is) should not have a SCSI <address>.

I'm not sure I understand why we would need a new <vhba> element.  It wouldn't add anything compared to <controller mode='vhba'> or <hostdev type='scsi_host'>, so one of those two would do.

Comment 10 John Ferlan 2016-12-22 17:09:51 UTC
The "provided" domain XML wouldn't have to supply the LUNs - it was an example of what <hostdev> of a specific LUN would be if it was supported. It's a "response" of sorts to a the comment you made "(though I'm not sure what the <address bus='B' target='T' unit='U'/>)".

Using <hostdev> would then define an <address> which really isn't used. Not that it matters in the long run, other than that whatever address is created doesn't show up on the guest (whether or not that matters I haven't ascertained).

Extending <controller> is possible, but I think we'd also have to make sure that the <controller> isn't used by any other device. That is, whatever index is used to create the controller wouldn't be usable by any other <disk> or <hostdev> device. Existing controllers can have <driver>, <model>, or <target> subelements in addition to the <address> subelement.

I see the vHBA as more like a <hostdev> that needs a <controller> than as a pure <controller>. I see a <controller> as something that plugs into buses, whereas the vHBA is a software concept with a very specific use and the very specific feature of being able to provide/remove LUNs to/from the guest based on some outside hardware/software influence. Part of me wonders if there will be any similarities with vGPU (but I'm almost afraid to think of that possibility).

Comment 11 Xuesong Zhang 2017-02-28 09:56:41 UTC
Adding the proposed patches here for reference; please correct me if anything is wrong.

https://www.redhat.com/archives/libvir-list/2017-February/msg00963.html

Comment 12 John Ferlan 2017-02-28 11:07:54 UTC
re: comment 11...

Of those patches, only patch 8 of the series comes close to providing the basics for what's needed. The other patches already posted/pushed, and this series itself, are mostly setup to alter the code so that adding domain support is a lot easier. From the cover letter of that series:

"The eigth patch is the reason for all this stirring of the pot. Alter
the domain <controller> XML in order to allow definition of a vHBA which
more or less sits between a "scsi_hostX" host device and a controller. This
is in preparation for https://bugzilla.redhat.com/show_bug.cgi?id=1404962
which can take that created vHBA and automagically add the LUNs from the
vHBA to the domain although that requires a bit more magic for which there
are already onlist patches to let qemu driver know when a node device has
been added/removed. Once all that's in place - the next step will be to
converge the two sets of patches. It's a chicken/egg type problem - one
has to exist before the other can truly work."

Comment 13 John Ferlan 2017-03-07 13:08:55 UTC
Being able to automatically do things depends on the event model being supported. The patches that were the basis of that model were not accepted. That leaves this in limbo with respect to viability. That is, there would be no need for a vHBA in the domain if there's no way to have qemu receive node device add/remove "events" directly. That device isn't being "passed through" to the guest - it was only being used as a means to have one device (the vHBA) map to many devices (LUNs), a concept that at the architectural level was deemed not a good idea.

Comment 15 John Ferlan 2017-10-04 11:36:10 UTC
The kernel and QEMU BZs have been deferred to RHEL-7.6 as the design is still
under discussion.

Comment 17 John Ferlan 2018-12-06 17:18:44 UTC
Based on email discussion, closing this as WONTFIX since the storage/HBA vendors did not want to invest more in the NPIV/vHBA infrastructure. 

A new methodology is "under discussion".

