Bug 1420740 - new scsi pool with existing HBA's wwn autostarted when restart libvirtd
Summary: new scsi pool with existing HBA's wwn autostarted when restart libvirtd
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 7.4
Assignee: John Ferlan
QA Contact: yisun
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-02-09 12:39 UTC by Yanqiu Zhang
Modified: 2017-08-02 00:01 UTC (History)
11 users

Fixed In Version: libvirt-3.2.0-7.el7
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 17:21:45 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2017:1846 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2017-08-01 18:02:50 UTC

Description Yanqiu Zhang 2017-02-09 12:39:45 UTC
Description of problem:
A new SCSI pool defined with an existing HBA's wwnn/wwpn cannot be started with pool-start (the start fails as invalid), but it is autostarted when libvirtd is restarted.


Version-Release number of selected component (if applicable):
libvirt-3.0.0-2.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Define a new SCSI pool with an existing HBA's wwnn/wwpn
# ls /sys/class/fc_host
host1 host4
# virsh nodedev-dumpxml scsi_host1|grep ww
      <wwnn>20000024ff370144</wwnn>
      <wwpn>21000024ff370144</wwpn>
...
# cat 18591-pool.xml
<pool type='scsi'>
<name>18591-pool</name>
<source>
<adapter type='fc_host' wwnn='20000024ff370144' wwpn='21000024ff370144'/>
</source>
<target>
<path>/dev/disk/by-path/</path>
<permissions>
<mode>0755</mode>
<owner>-1</owner>
<group>-1</group>
</permissions>
</target>
</pool>

# virsh pool-define 18591-pool.xml
Pool 18591-pool defined from 18591-pool.xml

2. Try to start it
# virsh pool-start 18591-pool
error: Failed to start pool 18591-pool
error: An error occurred, but the cause is unknown
# virsh pool-list --all
 Name                 State      Autostart
-------------------------------------------
 18591-pool           inactive   no 
 boot-scratch         active     yes      
 default              active     yes  

3. restart libvirtd
# systemctl restart libvirtd

4. Check the pool status
# virsh pool-list
 Name                 State      Autostart
-------------------------------------------
 18591-pool           active     no        
 boot-scratch         active     yes      
 default              active     yes   

Actual results:
As shown in step 4, the new SCSI pool with the existing HBA's wwnn/wwpn is autostarted when libvirtd is restarted.

Expected results:
18591-pool uses the existing wwnn/wwpn of HBA host1; it should not be started, even when libvirtd is restarted.

Additional info:

Comment 2 John Ferlan 2017-04-04 18:57:29 UTC
I reproduced this with a checked out version of 3.0.0, but it is fixed as of 3.2.0.

The following commit "broke" the code that would allow creating a pool using the HBA of a scsi_host device that is also an fc_host device:

79ab093518ffd07724eb0f8950a07b81d878e9b3

Specifically, in the code:

     if ((name = virGetFCHostNameByWWN(NULL, adapter->data.fchost.wwnn,
                                       adapter->data.fchost.wwpn))) {
-        int retval = 0;
-
         /* If a parent was provided, let's make sure the 'name' we've
          * retrieved has the same parent
          */
         if (adapter->data.fchost.parent &&
-            !checkVhbaSCSIHostParent(conn, name, adapter->data.fchost.parent))
-            retval = -1;
+            checkVhbaSCSIHostParent(conn, name, adapter->data.fchost.parent))
+            ret = 0;
 
-        VIR_FREE(name);
-        return retval;
+        goto cleanup;

This code "decides" whether the wwnn/wwpn already exists (which it would for an HBA-backed 'fc_host' pool). However, since no 'fchost.parent' would be provided, we'd jump to cleanup and return 'ret', which is -1 at this point.

The 'virStorageBackendSCSICheckPool' code is far less stringent, so the pool could be started.

If the above check was changed to:

    if (!adapter->data.fchost.parent || (adapter->data.fchost.parent &&
        !checkVhbaSCSIHostParent(conn, name, adapter->data.fchost.parent)))

then the pool would have started normally.


Long story short: by 3.2, after a lot of "manipulation" and source code motion to combine the nodedevice and storage pool vHBA logic, this issue is "fixed" by commit 08c0ea16fcd3a4bf119ba3675431acfe75f29849, which is now part of the virNodeDeviceCreateVport API and returns the 'name' of the scsi_host device:

    if ((name = virVHBAGetHostByWWN(NULL, fchost->wwnn, fchost->wwpn))) {
        /* If a parent was provided, let's make sure the 'name' we've
         * retrieved has the same parent. If not this will cause failure. */
        if (fchost->parent && checkParent(conn, name, fchost->parent))
            VIR_FREE(name);

        return name;
    }

So that's the good news... Unfortunately, even though start was fixed, in 3.2 destroy is broken...


$ virsh pool-destroy host4_hba_pool
error: Failed to destroy pool host4_hba_pool
error: internal error: Invalid adapter name 'pci_0000_10_00_1' for SCSI pool

$

I've sent the following patch upstream to resolve this issue:

https://www.redhat.com/archives/libvir-list/2017-April/msg00158.html

Since no one complained upstream about the start issue, I'll assume no one noticed. Since RHEL 7.4 will be using a libvirt-3.2.0 based release, once this gets accepted/pushed upstream I'll use this bz for the downstream patch, unless another one magically appears on my doorstep...

Comment 3 John Ferlan 2017-04-13 14:34:46 UTC
Patch now pushed upstream:

commit 84f178bdc7ab38011cc1f76759b0a41335285a4f
Author: John Ferlan <jferlan>
Date:   Tue Apr 4 14:06:42 2017 -0400

    conf: Add check for non scsi_host parent during vport delete
    
...
    
    If the parent is not a scsi_host, then we can just happily return since
    we won't be removing a vport.
    
    Fixes a bug with the following output:
    
    $ virsh pool-destroy host4_hba_pool
    error: Failed to destroy pool host4_hba_pool
    error: internal error: Invalid adapter name 'pci_0000_10_00_1' for SCSI pool
    
    $

$ git describe 84f178bdc7ab38011cc1f76759b0a41335285a4f
v3.2.0-159-g84f178b
$

Comment 6 yisun 2017-04-27 08:16:47 UTC
Currently, the behavior is:
If the SCSI pool's XML contains a wwnn/wwpn equal to an existing HBA's wwnn/wwpn, the pool can be started. No vHBA is actually created (no vport_create sent out?); the vols of the pool are the same as the LUNs connected to the physical HBA.
With your fix, you avoid sending a vport_delete when the parent is not a scsi_host (actually a PCI device in this situation), to avoid the pool-destroy failure.

However, the fix solves only part of the problem, since we can definitely still indicate a **parent** in the SCSI pool's XML, and this will also trigger the pool-destroy failure.

Two examples follow:
====================================================
Explicitly indicate **parent='scsi_hostX'**, and pool.xml has wwpn/wwnn equals to scsi_hostX's wwpn/wwnn
====================================================
Steps:
1. scsi_host7 is physical HBA, with xml as follow:
## virsh nodedev-dumpxml scsi_host7
<device>
  <name>scsi_host7</name>
  <path>/sys/devices/pci0000:00/0000:00:01.0/0000:20:00.1/host7</path>
  <parent>pci_0000_20_00_1</parent>
  <capability type='scsi_host'>
    <host>7</host>
    <unique_id>1</unique_id>
    <capability type='fc_host'>
      <wwnn>20000000c99e2b81</wwnn>
      <wwpn>10000000c99e2b81</wwpn>
      <fabric_wwn>2001547feeb71cc1</fabric_wwn>
    </capability>
    <capability type='vport_ops'>
      <max_vports>255</max_vports>
      <vports>0</vports>
    </capability>
  </capability>
</device>

2. prepare a pool.xml with **parent='scsi_host7' wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'**
## cat pool.xml
## cat npiv2.pool 
<pool type='scsi'>
  <name>npiv2</name>
  <source>
    <adapter type='fc_host' parent='scsi_host7' managed='yes' wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>

3. try to define & start the pool
## virsh pool-define pool.xml
Pool npiv2 defined from pool.xml

## virsh pool-start npiv2
Pool npiv2 started
<== the vols in the pool are same as luns connected by scsi_host7

4. destroy it
## virsh pool-destroy npiv2
error: Failed to destroy pool npiv2
error: Write of '10000000c99e2b81:20000000c99e2b81' to '/sys/class/fc_host/host7/vport_delete' during vport create/delete failed: No such device
<=== because no vHBA with 10000000c99e2b81:20000000c99e2b81 was actually created, vport_delete returns an error and the pool cannot be destroyed


====================================================
Explicitly indicate **parent='scsi_hostX'**, where scsi_hostX is not even an HBA
====================================================
1. find a scsi_host which is not an HBA device
## virsh nodedev-list --cap=fc_host
scsi_host0
scsi_host7

## virsh nodedev-list --cap=scsi_host
scsi_host0
scsi_host1 <<==== let's use this one, actually it's a SATA hard disk
scsi_host2
scsi_host3
scsi_host4
scsi_host5
scsi_host6
scsi_host7

2. prepare the pool.xml, with **parent='scsi_host1'**
## cat pool.xml
<pool type='scsi'>
  <name>npiv2</name>
  <source>
    <adapter type='fc_host' parent='scsi_host1' managed='yes' wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>

3. define and start the pool
## virsh pool-define pool.xml
Pool npiv2 defined from pool.xml

## virsh pool-start npiv2
Pool npiv2 started
<== the vols in the pool are same as luns connected by scsi_host7

4. destroy the pool
## virsh pool-destroy npiv2
error: Failed to destroy pool npiv2
error: Requested operation is not valid: vport operation 'vport_delete' is not supported for host1
<=== well, it's not an fc_host, so it doesn't even have a vport_delete operation...

==============
Maybe we have two possible ways to deal with it:
1. Just send a vport_create to the **parent**; if the parent cannot vport_create the device, an error is returned and the pool simply cannot be started. This may re-expose the problem from comment 0.

2. If the pool XML's wwnn/wwpn is the same as an existing HBA's wwnn/wwpn, automatically remove the pool XML's **parent** part during pool-define/create/create-as.

Please double check, thanks. I'll move it to ASSIGNED for now.

Comment 7 John Ferlan 2017-04-29 15:36:39 UTC
Well you just keep finding more strange and bizarre ways to configure things.

So the creation avoids the "vport_create" logic because of the following logic:

virNodeDeviceCreateVport:

    /* If we find an existing HBA/vHBA within the fc_host sysfs
     * using the wwnn/wwpn, then a nodedev is already created for
     * this pool and we don't have to create the vHBA
     */
    if ((name = virVHBAGetHostByWWN(NULL, fchost->wwnn, fchost->wwpn))) {
        /* If a parent was provided, let's make sure the 'name' we've
         * retrieved has the same parent. If not this will cause failure. */
        if (fchost->parent && checkParent(conn, name, fchost->parent))
            VIR_FREE(name);

        return name;
    }

But because you supplied a "parent" on the input, it avoids the logic added to resolve the bug from the original bz:

virNodeDeviceDeleteVport:

    /* Find our vHBA by searching the fc_host sysfs tree for our wwnn/wwpn */
    if (!(name = virVHBAGetHostByWWN(NULL, fchost->wwnn, fchost->wwpn))) {
        virReportError(VIR_ERR_INTERNAL_ERROR,
                       _("Failed to find fc_host for wwnn='%s' and wwpn='%s'"),
                       fchost->wwnn, fchost->wwpn);
        goto cleanup;
    }

   /* If at startup time we provided a parent, then use that to
     * get the parent_host value; otherwise, we have to determine
     * the parent scsi_host which we did not save at startup time
     */
    if (fchost->parent) {
        if (virSCSIHostGetNumber(fchost->parent, &parent_host) < 0)
            goto cleanup;
    } else {
...
        /* If the parent is not a scsi_host, then this is a pool backed
         * directly to an HBA and there's no vHBA to remove - so we're done */
        if (!STRPREFIX(vhba_parent, "scsi_host")) {
            ret = 0;
            goto cleanup;
        }
...
    }

Still, the above scenario is, I think, a new and inventive way to work through the code. Whether someone would actually (mis)configure things in that manner, who knows. I honestly don't think they would, but I suppose it is possible.

As for the second scenario, again, whether someone would actually do something like this is questionable. Also you note:

> 3. define and start the pool
> ## virsh pool-define pool.xml
> Pool npiv2 defined from pool.xml
>
> ## virsh pool-start npiv2
> Pool npiv2 started
> <== the vols in the pool are same as luns connected by scsi_host7

I hope you mean scsi_host1 and not scsi_host7; otherwise, something is seriously wrong.


So this "issue" is slightly different than the one presented in the original problem statement, but it does represent a code coverage type issue/problem.

The fix would be to detect at shutdown, when a parent name was provided, whether it is the same name as the one found by searching on the wwnn/wwpn, and if so just return, e.g.

         goto cleanup;
     }
 
+    if (virAsprintf(&scsi_host_name, "scsi_%s", name) < 0)
+        goto cleanup;
+
     /* If at startup time we provided a parent, then use that to
      * get the parent_host value; otherwise, we have to determine
      * the parent scsi_host which we did not save at startup time
      */
     if (fchost->parent) {
+        /* Someone provided a parent string at startup time that
+         * was the same as the scsi_host - meaning we have a pool
+         * backed to an HBA, so there won't be a vHBA to delete */
+        if (STREQ(scsi_host_name, fchost->parent)) {
+            ret = 0;
+            goto cleanup;
+        }
+
         if (virSCSIHostGetNumber(fchost->parent, &parent_host) < 0)
             goto cleanup;
     } else {
-        if (virAsprintf(&scsi_host_name, "scsi_%s", name) < 0)
-            goto cleanup;
-
         if (!(vhba_parent = virNodeDeviceGetParentName(conn, scsi_host_name)))
             goto cleanup;


I have posted a patch with this fix upstream:

https://www.redhat.com/archives/libvir-list/2017-April/msg01498.html

Comment 8 John Ferlan 2017-05-24 14:17:41 UTC
The patch has been pushed upstream:

commit 2c8e30ee7e287d6490f643ccd2d7653a834e75e5
Author: John Ferlan <jferlan>
Date:   Sat Apr 29 11:29:27 2017 -0400

    conf: Resolve corner case on fc_host deletion
    
...
    
    Testing found an inventive way to cause an error at shutdown by providing the
    parent name for the fc host creation using the "same name" as the HBA. Since
    the code thus assumed the parent host name provided was the parent HBA and
    just extracted out the host number and sent that along to the vport_destroy
    this avoided checks made for equality.
    
    So just add the equality check to that path to resolve.

Comment 10 yisun 2017-06-02 09:58:11 UTC
(In reply to John Ferlan from comment #7)
> Well you just keep finding more strange and bizarre ways to configure things.

> > 3. define and start the pool
> > ## virsh pool-define pool.xml
> > Pool npiv2 defined from pool.xml
> >
> > ## virsh pool-start npiv2
> > Pool npiv2 started
> > <== the vols in the pool are same as luns connected by scsi_host7
> 
> I hope you mean scsi_host1 and not scsi_host7; otherwise, something is
> seriously wrong.
> 
Actually, I did mean scsi_host7. I used the following pool xml:
<pool type='scsi'>
  <name>npiv2</name>
  <source>
    <adapter type='fc_host' parent='scsi_host1' managed='yes' wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>

And in the xml, the wwnn/wwpn is actually from scsi_host7, but I explicitly set parent='scsi_host1' (host1 is a SATA disk, not an HBA card at all).

Comment 11 yisun 2017-06-02 10:07:00 UTC
And now there are two problems, please help to confirm:
1. Let's go back to what the reporter found:
when a SCSI pool had autostart=no and status=inactive, she restarted libvirtd and the pool was autostarted.

So I think one of her expectations is:
if a SCSI pool is inactive and autostart=no, restarting libvirtd should not activate the pool.

=== and current test result is: ===
## virsh pool-dumpxml bugtest
<pool type='scsi'>
  <name>bugtest</name>
  <uuid>5954cb40-f3a2-4447-9c0d-339789205544</uuid>
  <capacity unit='bytes'>85899345920</capacity>
  <allocation unit='bytes'>85899345920</allocation>
  <available unit='bytes'>0</available>
  <source>
    <adapter type='fc_host' managed='yes' wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0755</mode>
    </permissions>
  </target>
</pool>

## virsh pool-list --all
 Name                 State      Autostart 
-------------------------------------------
 bugtest              active     no        

(destroy the pool to make it state=inactive and autostart=no)
## virsh pool-destroy bugtest; virsh pool-list --all
Pool bugtest destroyed
 Name                 State      Autostart 
-------------------------------------------
 bugtest             *** inactive ***  no        

## service libvirtd restart
Redirecting to /bin/systemctl restart  libvirtd.service

## virsh pool-list --all
 Name                 State      Autostart 
-------------------------------------------
 bugtest           ***  active ***     no        
<==== It's autostarted.. not as expected

Comment 12 yisun 2017-06-02 10:11:34 UTC
2. As comment 10 mentioned, this kind of pool still cannot be destroyed.
# virsh pool-dumpxml bugtest
<pool type='scsi'>
  <name>bugtest</name>
  <uuid>b10a124b-e2d8-4f0e-ad2a-c029e69566ad</uuid>
  <capacity unit='bytes'>85899345920</capacity>
  <allocation unit='bytes'>85899345920</allocation>
  <available unit='bytes'>0</available>
  <source>
    <adapter type='fc_host' parent='scsi_host1' managed='yes' wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0755</mode>
    </permissions>
  </target>
</pool>
*** scsi_host1 is a sata disk ***
*** wwpn/wwnn is from scsi_host7 which is a hba card ***

## virsh pool-destroy bugtest
error: Failed to destroy pool bugtest
error: Requested operation is not valid: vport operation 'vport_delete' is not supported for host1
*** it still fails, since libvirt tried to send a vport_delete to a non-fc host.

Comment 13 John Ferlan 2017-06-04 13:21:46 UTC
(In reply to yisun from comment #11)
> And now there are two problems pls help to confirm
> 1. let's back to what reporter found -
> when scsi_pool_autostart=no and pool_status=inactive, she restarted libvirtd
> and the pool is autostarted. 
> 
> So I think one of her expectation is that:
> If scsi pool is inactive and autostart=no, restarting libvirtd should not
> activate the pool. 
> 

I didn't read the original post as having anything to do with explicitly setting or unsetting autostart. It's not something I'd be considering when it comes to vHBA/NPIV.

A vHBA/NPIV pool, by its sheer existence, is a bit of an oddity w/r/t normal autostart rules and functionality. As alluded to in comment 2, when libvirtd restarts it runs through the @checkPool code via storagePoolUpdateState, which is a different path than @startPool.

> === and current test result is: ===
> ## virsh pool-dumpxml bugtest
> <pool type='scsi'>
>   <name>bugtest</name>
>   <uuid>5954cb40-f3a2-4447-9c0d-339789205544</uuid>
>   <capacity unit='bytes'>85899345920</capacity>
>   <allocation unit='bytes'>85899345920</allocation>
>   <available unit='bytes'>0</available>
>   <source>
>     <adapter type='fc_host' managed='yes' wwnn='20000000c99e2b81'
> wwpn='10000000c99e2b81'/>
>   </source>
>   <target>
>     <path>/dev/disk/by-path</path>
>     <permissions>
>       <mode>0755</mode>
>     </permissions>
>   </target>
> </pool>
> 
> ## virsh pool-list --all
>  Name                 State      Autostart 
> -------------------------------------------
>  bugtest              active     no        
> 
> (destroy the pool to make it state=inactive and autostart=no)
> ## virsh pool-destroy bugtest; virsh pool-list --all
> Pool bugtest destroyed
>  Name                 State      Autostart 
> -------------------------------------------
>  bugtest             *** inactive ***  no        
> 
> ## service libvirtd restart
> Redirecting to /bin/systemctl restart  libvirtd.service
> 
> ## virsh pool-list --all
>  Name                 State      Autostart 
> -------------------------------------------
>  bugtest           ***  active ***     no        
> <==== It's autostarted.. not as expected

Because of the above explanation and the code in virStorageBackendSCSICheckPool, which checks for the existence of the adapter by name (in this case, by the sheer existence of a wwnn/wwpn; see getAdapterName and the call to virVHBAGetHostByWWN), the checkPool call returns/sets @active=true.

AFAIK this is no change from the previous RHEL release, but I would also state that it's working as designed.

IOW: autostart functionality is a non sequitur here, since the sheer existence of the scsi_host that the pool describes means that at start or restart it's considered active.

Comment 14 John Ferlan 2017-06-04 14:31:08 UTC
(In reply to yisun from comment #12)
> 2. as comment 10 mentioned, this kind of pool still cannot be destroyed. 
> # virsh pool-dumpxml bugtest
> <pool type='scsi'>
>   <name>bugtest</name>
>   <uuid>b10a124b-e2d8-4f0e-ad2a-c029e69566ad</uuid>
>   <capacity unit='bytes'>85899345920</capacity>
>   <allocation unit='bytes'>85899345920</allocation>
>   <available unit='bytes'>0</available>
>   <source>
>     <adapter type='fc_host' parent='scsi_host1' managed='yes'
> wwnn='20000000c99e2b81' wwpn='10000000c99e2b81'/>
>   </source>
>   <target>
>     <path>/dev/disk/by-path</path>
>     <permissions>
>       <mode>0755</mode>
>     </permissions>
>   </target>
> </pool>
> *** scsi_host1 is a sata disk ***
> *** wwpn/wwnn is from scsi_host7 which is a hba card ***
> 
> ## virsh pool-destroy bugtest
> error: Failed to destroy pool bugtest
> error: Requested operation is not valid: vport operation 'vport_delete' is
> not supported for host1
> *** now it's still failed since libvirt tried to send a vport_delete to a
> non-fc host.

I don't think this is the same issue/problem as the original post, though it's hard to tell for sure. In the original post scsi_host1 was an HBA; in your followup comment 6, scsi_host1 is a SATA disk and some other scsi_host is the HBA. It's all quite confusing, and perhaps you can understand why these types of things require separate bugs.

My assumption about scsi_host7 was wrong, and I can essentially reproduce what's described here, but it'll require a separate bug for fixing, and that won't happen until the "next" release.

I added some debug code in the creation path and came up with the following:

2017-06-04 14:21:18.990+0000: 29085: warning : createVport:227 : conn=0x7fffd00009a0, configFile='<null>' parent='scsi_host1', wwnn='20000000c9848140' wwpn='10000000c9848140'
2017-06-04 14:21:18.990+0000: 29085: warning : virNodeDeviceCreateVport:2328 : conn=0x7fffd00009a0, parent='scsi_host1', wwnn='20000000c9848140' wwpn='10000000c9848140'
2017-06-04 14:21:18.990+0000: 29085: warning : virNodeDeviceCreateVport:2335 : name=host3 parent=scsi_host1
2017-06-04 14:21:18.990+0000: 29085: warning : checkParent:2275 : conn=0x7fffd00009a0, name=host3, parent_name=scsi_host1
2017-06-04 14:21:18.990+0000: 29085: warning : checkParent:2286 : vhba_parent=pci_0000_10_00_0 parent_name=scsi_host1
2017-06-04 14:21:18.990+0000: 29085: error : checkParent:2292 : XML error: Parent attribute 'scsi_host1' does not match parent 'pci_0000_10_00_0' determined for the 'host3' wwnn/wwpn lookup.
2017-06-04 14:21:18.990+0000: 29085: warning : createVport:245 : create vport returns name=host3


That means the following code:

virNodeDeviceCreateVport:

    if ((name = virVHBAGetHostByWWN(NULL, fchost->wwnn, fchost->wwpn))) {
        /* If a parent was provided, let's make sure the 'name' we've
         * retrieved has the same parent. If not this will cause failure. */
        if (fchost->parent && checkParent(conn, name, fchost->parent))
            VIR_FREE(name);

        return name;
    }

is returning @name even though checkParent has failed. Perhaps it should have been !checkParent, but IIRC this path is used for the case where the wwnn/wwpn is for the HBA (IOW: don't create a vHBA because the pool is looking at the HBA). I need to do a bit more digging on it, but since it's not related to the base problem here, let's not bog this bz down with a different type of problem.

Comment 15 yisun 2017-06-05 09:47:14 UTC
According to the above comments, moving this bug to VERIFIED; a new bz 1458708 has been created.

Versions:
libvirt-libs-3.2.0-7.el7.x86_64

Steps:
# virsh nodedev-dumpxml scsi_host4
<device>
  <name>scsi_host4</name>
  <path>/sys/devices/pci0000:00/0000:00:09.0/0000:18:00.1/host4</path>
  <parent>pci_0000_18_00_1</parent>
  <capability type='scsi_host'>
    <host>4</host>
    <unique_id>4</unique_id>
    <capability type='fc_host'>
      <wwnn>20000024ff370145</wwnn>
      <wwpn>21000024ff370145</wwpn>
      <fabric_wwn>2001000dec9877c1</fabric_wwn>
    </capability>
    <capability type='vport_ops'>
      <max_vports>254</max_vports>
      <vports>0</vports>
    </capability>
  </capability>
</device>

# cat pool.xml
<pool type='scsi'>
  <name>hba</name>
  <source>
    <adapter type='fc_host' managed='yes' wwnn='20000024ff370145' wwpn='21000024ff370145'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>

# virsh pool-define pool.xml
Pool hba defined from pool.xml

# virsh pool-start hba
Pool hba started

# virsh vol-list hba
 Name                 Path                                    
------------------------------------------------------------------------------
 unit:0:0:0           /dev/disk/by-path/pci-0000:18:00.1-fc-0x203500a0b85b0acc-lun-0
 unit:0:0:1           /dev/disk/by-path/pci-0000:18:00.1-fc-0x203500a0b85b0acc-lun-1

# virsh pool-destroy hba
Pool hba destroyed

# virsh pool-undefine hba
Pool hba has been undefined

Comment 16 errata-xmlrpc 2017-08-01 17:21:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

Comment 17 errata-xmlrpc 2017-08-02 00:01:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846

