Bug 1171984

Summary: libvirt should also check whether a hostname and an IP address refer to the same host when defining two pools that have the same source
Product: Red Hat Enterprise Linux 7 Reporter: Luyao Huang <lhuang>
Component: libvirt Assignee: John Ferlan <jferlan>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.1 CC: dyuan, jferlan, mzhan, rbalakri, shyu, xuzhang, yanyang, yisun
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-1.2.16-1.el7 Doc Type: Bug Fix
Last Closed: 2015-11-19 05:57:44 UTC Type: Bug

Description Luyao Huang 2014-12-09 07:11:52 UTC
Description of problem:
libvirt should also check whether a hostname and an IP address refer to the same host when we try to define two pools that have the same source.

Version-Release number of selected component (if applicable):
libvirt-1.2.8-10.el7.x86_64

How reproducible:
100%

Steps to Reproduce:

1. Define two iSCSI pools, one using a hostname and the other using an IP address:
# virsh pool-dumpxml test-iscsi3
<pool type='iscsi'>
  <name>test-iscsi3</name>
  <uuid>4856026d-f1aa-407c-b17a-b51d5ba5671e</uuid>
  <capacity unit='bytes'>524288000</capacity>
  <allocation unit='bytes'>524288000</allocation>
  <available unit='bytes'>0</available>
  <source>
    <host name='10.66.6.12' port='3260'/>
    <device path='iqn.2014-11.com.lhuang:tgt1'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0755</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>
# virsh pool-dumpxml test-iscsi2
<pool type='iscsi'>
  <name>test-iscsi2</name>
  <uuid>2085e8d9-8bee-458d-ae7c-d0f8bae1edf5</uuid>
  <capacity unit='bytes'>524288000</capacity>
  <allocation unit='bytes'>524288000</allocation>
  <available unit='bytes'>0</available>
  <source>
    <host name='test1' port='3260'/>
    <device path='iqn.2014-11.com.lhuang:tgt1'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0755</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

2. Check /etc/hosts:
10.66.6.12 test1

3. Start both of them:
# virsh pool-start test-iscsi2
Pool test-iscsi2 started
# virsh pool-start test-iscsi3
Pool test-iscsi3 started

4. Once two pools with the same iSCSI source can be defined, a lot of strange issues follow.
For example, after one of them has been destroyed, the other cannot be destroyed:
# virsh pool-list --all
 Name                 State      Autostart
-------------------------------------------
 test-iscsi2          active     no
 test-iscsi3          active     no

# virsh pool-destroy test-iscsi2
Pool test-iscsi2 destroyed

# virsh pool-destroy test-iscsi3
error: Failed to destroy pool test-iscsi3
error: internal error: Child process (iscsiadm --mode node --portal 10.66.6.12:3260,1 --targetname iqn.2014-11.com.lhuang:tgt1 --logout) unexpected exit status 21: iscsiadm: No matching sessions found

# virsh pool-list --all
 Name                 State      Autostart
-------------------------------------------
 test-iscsi2          inactive   no
 test-iscsi3          active     no


Actual results:
No error is reported when we define the pools, and no error is reported when we start them.

      
Expected results:

An error is reported when we try to define two pools that have the same source:

# virsh pool-edit test-iscsi3
error: operation failed: Storage source conflict with pool: 'test-iscsi2'
Failed. Try again? [y,n,f,?]:

Additional info:
libvirt should forbid two iSCSI pools from using the same source (even when one is given as a hostname and the other as an IP address).
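
The comparison requested here can be sketched roughly as follows. This is only an illustration in Python, not libvirt code: the helper names `resolve_addresses` and `same_source_host` are hypothetical, and the approach assumes the local resolver (including /etc/hosts) can be trusted, which comment 4 below ultimately decided against.

```python
import socket


def resolve_addresses(host, port=3260):
    """Return the set of IP address strings that `host` resolves to.

    Accepts host names as well as numeric IPv4/IPv6 literals.
    """
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return {info[4][0] for info in infos}


def same_source_host(host_a, host_b, resolve=resolve_addresses):
    """True if the two host strings share at least one resolved address.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    return bool(resolve(host_a) & resolve(host_b))
```

With the /etc/hosts entry from step 2, `same_source_host("test1", "10.66.6.12")` would be expected to report that both pools point at the same host.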

Comment 1 Yang Yang 2014-12-09 10:04:24 UTC
Actually, libvirt does not check the portal when a session already exists on the target.
For example, if a running iSCSI pool is using a target, then a second iSCSI pool that uses the same target starts successfully, no matter what its host name is. However, shutting that second pool down fails with an error because of the invalid host name.

# cat iscsi-pool.xml 
<pool type='iscsi'>
       <name>iscsi</name>
       <source>
         <host name="10.66.4.201"/>
         <device path="iqn.yy:server.host1"/>
       </source>
       <target>
         <path>/dev/disk/by-path</path>
       </target>
     </pool>

# virsh pool-create iscsi-pool.xml 
Pool iscsi created from iscsi-pool.xml

# iscsiadm -m session
tcp: [7] 10.66.4.201:3260,1 iqn.yy:server.host1 (non-flash)

# cat iscsi-pool-1.xml 
<pool type='iscsi'>
       <name>iscsi-1</name>
       <source>
         <host name="haha"/>   <!-- invalid host name -->
         <device path="iqn.yy:server.host1"/>
       </source>
       <target>
         <path>/dev/disk/by-path</path>
       </target>
     </pool>

# virsh pool-create iscsi-pool-1.xml 
Pool iscsi-1 created from iscsi-pool-1.xml

# virsh pool-list --all
 iscsi                active     no        
 iscsi-1              active     no

# virsh pool-destroy iscsi-1
error: Failed to destroy pool iscsi-1
error: internal error: Child process (iscsiadm --mode node --portal haha:3260,1 --targetname iqn.yy:server.host1 --logout) unexpected exit status 21: 2014-12-09 10:02:25.064+0000: 6771: debug : virFileClose:99 : Closed fd 24
2014-12-09 10:02:25.064+0000: 6771: debug : virFileClose:99 : Closed fd 26
2014-12-09 10:02:25.064+0000: 6771: debug : virFileClose:99 : Closed fd 22
iscsiadm: No matching sessions found

Comment 2 John Ferlan 2015-03-31 23:59:35 UTC
Duplicate devices are the first thing checked, via a call to virStoragePoolSourceFindDuplicateDevices made from virStoragePoolSourceFindDuplicate, so to a degree comment 1 is no different from the primary problem description.

However, there is a difference: if the 'iscsi' pool were not active, then an attempt to create/start iscsi-1 (e.g. with host haha) would fail with something like:

error: Failed to create pool from haha-iscsi-net-pool.xml
error: internal error: Child process (iscsiadm --mode discovery --type sendtargets --portal haha:3260,1) unexpected exit status 21: iscsiadm: Cannot resolve host haha. getaddrinfo error: [Name or service not known]

iscsiadm: cannot resolve host name haha
iscsiadm: cannot resolve host name haha
iscsiadm: Cannot resolve host haha. getaddrinfo error: [Name or service not known]

iscsiadm: cannot resolve host name haha
iscsiadm: cannot resolve host name haha
iscsiadm: No portals found


The reason why it works otherwise is the backend iscsi driver code 'assumes' that the front end configuration code has made host checks already... The backend finds the devices already and has the pool attach to them - there isn't a portal for haha created... 

So as part of these changes I'll probably add a check for the undefined name so we don't get into this conundrum.

After creating a prototype that handles the issue, I started looking at other similar drivers (nfs, gluster, rbd, sheepdog), thinking they could have a similar problem.  For sure NFS only checks the 'name' attribute of <host>; since gluster borrows the netfs pool type, it too would only compare the name attribute.  The rbd and sheepdog checks aren't even done.  I don't have a config to check those though...

I'd post patches for this problem, but I'm trying to figure out a way to include the logic for bug 1188463 as well. That bug involves two different IP address families resolving to the same host. It's solvable, but kind of ugly.

Comment 3 John Ferlan 2015-05-12 20:25:22 UTC
*** Bug 1188463 has been marked as a duplicate of this bug. ***

Comment 4 John Ferlan 2015-05-12 20:39:39 UTC
After a couple of attempts to "fix" this by adding checks to libvirt for duplicate host names, it was decided to remove the duplicate host name checks for iSCSI and to base the duplicate source check primarily on the IQN (target path).  The following was pushed:

commit 4b2b53f674f6b831c842a89b4e96aa80b2cb1d92
Author: John Ferlan <jferlan>
Date:   Mon May 11 13:51:04 2015 -0400

    conf: Remove source host name check for iSCSI
    
...
git  describe 4b2b53f674f6b831c842a89b4e96aa80b2cb1d92
v1.2.15-68-g4b2b53f


Additionally, jtomko pushed the following:

commit a41b1f196c3495385790e95706e5c83cf2056121
Author: Ján Tomko <jtomko>
Date:   Wed Apr 29 14:59:08 2015 +0200

    iscsi: do not fail to stop a stopped pool
    
    Just as we allow stopping filesystem pools when they were unmounted
    externally, do not fail to stop an iscsi pool when someone else
    closed the session externally.
    
    Reported at:
    https://bugzilla.redhat.com/show_bug.cgi?id=1171984

to handle the shutdown/stop issue.
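
As a rough illustration of that behaviour (a Python sketch, not the actual C change in libvirt), a logout helper can treat iscsiadm exit status 21, the status seen with "No matching sessions found" in the failures above, as success. The function name `iscsi_logout` and the injectable `run` parameter are hypothetical and exist only for this example.

```python
import subprocess

# Exit status reported by iscsiadm in the failing pool-destroy output
# above ("No matching sessions found").
NO_MATCHING_SESSIONS = 21


def iscsi_logout(portal, target, run=subprocess.call):
    """Log out of an iSCSI session, tolerating an already-closed session.

    `run` takes the argument list and returns the exit status; it is
    injectable so the logic can be tested without iscsiadm installed.
    """
    cmd = ["iscsiadm", "--mode", "node", "--portal", portal,
           "--targetname", target, "--logout"]
    status = run(cmd)
    if status in (0, NO_MATCHING_SESSIONS):
        # Either we logged out, or someone else already closed the session.
        return True
    raise RuntimeError("iscsiadm logout failed with exit status %d" % status)
```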

For more "history" the following are the series posted:

v2:
http://www.redhat.com/archives/libvir-list/2015-April/msg01197.html

v1:
http://www.redhat.com/archives/libvir-list/2015-April/msg00873.html

v1 7/7 (discussion of the name resolution issue)
http://www.redhat.com/archives/libvir-list/2015-April/msg00880.html

In the long run, relying on and trusting name resolution just to perhaps allow the same IQN (target path) to be used for two different pools on two different hosts was not feasible, as there's no guarantee which one we'd get from iscsid.  Also, iscsid has code to resolve the host name or IP address provided and compare it against a list of known addresses for the host which has the IQN (see the output of 'iscsiadm --mode node'). So I have to assume it too would balk at the same IQN being found on two really different servers.
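
The resulting rule (ignore the host, compare only the IQN) can be sketched as follows. This is a minimal Python illustration of the idea, not the libvirt implementation; `make_pool`, `iscsi_target` and `find_conflict` are hypothetical names used only here.

```python
import xml.etree.ElementTree as ET


def make_pool(name, host, iqn):
    """Build a minimal iSCSI pool definition like the ones in this bug."""
    return (
        "<pool type='iscsi'><name>%s</name><source>"
        "<host name='%s'/><device path='%s'/>"
        "</source></pool>" % (name, host, iqn)
    )


def iscsi_target(pool_xml):
    """Return the IQN from <source><device path=.../>, or None."""
    root = ET.fromstring(pool_xml)
    if root.get("type") != "iscsi":
        return None
    device = root.find("./source/device")
    return device.get("path") if device is not None else None


def find_conflict(new_pool_xml, existing_pool_xmls):
    """Return the name of an existing pool using the same IQN, else None.

    The <host> element is deliberately not consulted.
    """
    new_iqn = iscsi_target(new_pool_xml)
    if new_iqn is None:
        return None
    for xml_text in existing_pool_xmls:
        if iscsi_target(xml_text) == new_iqn:
            return ET.fromstring(xml_text).findtext("name")
    return None
```

Under this rule the two pools from the description conflict even though one names the host as test1 and the other as 10.66.6.12, because both use iqn.2014-11.com.lhuang:tgt1.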

Comment 6 yisun 2015-07-22 09:03:09 UTC
PASS
verified on libvirt-1.2.17-2.el7.x86_64


According to comment 4, there are two verification points:
1. libvirt checks for duplicates by IQN, not by host name.
2. When the iSCSI session is closed outside libvirt, we should still be able to stop the iSCSI pool.


Scenario 1: verify the IQN duplicate check
1. # cat pool.iscsi 
<pool type='iscsi'>
       <name>iscsi</name>
       <source>
         <host name="x.x.x.x"/>
         <device path="iqn.2014-12.com.redhat:libvirt.shyu-qe-consumption-auto"/>
       </source>
       <target>
         <path>/dev/disk/by-path</path>
       </target>
</pool>

2. # virsh pool-create pool.iscsi 
Pool iscsi created from pool.iscsi

3. # virsh vol-list iscsi
 Name                 Path                                    
------------------------------------------------------------------------------
 unit:0:0:1           /dev/disk/by-path/ip-x.x.x.x:3260-iscsi-iqn.2014-12.com.redhat:libvirt.shyu-qe-consumption-auto-lun-1

4. # cat pool.iscsi2 
<pool type='iscsi'>
       <name>iscsi2</name>
       <source>
         <host name="y.y.y.y"/> <!-- changed to another host -->
         <device path="iqn.2014-12.com.redhat:libvirt.shyu-qe-consumption-auto"/>
       </source>
       <target>
         <path>/dev/disk/by-path</path>
       </target>
</pool>

5. # virsh pool-define pool.iscsi2   <=== failed as mentioned in comment 4
error: Failed to define pool from pool.iscsi2
error: operation failed: Storage source conflict with pool: 'iscsi'
# virsh pool-create pool.iscsi2   <=== failed as mentioned in comment 4
error: Failed to create pool from pool.iscsi2
error: operation failed: Storage source conflict with pool: 'iscsi'


Scenario 2: check whether the iSCSI pool can be destroyed when the iSCSI session was closed outside libvirt.
1. Make sure the iscsi pool is still active after scenario 1
# virsh pool-list --all
 Name                 State      Autostart 
-------------------------------------------
...       
 iscsi                active     no  

2. Manually log out of the iSCSI portal
# iscsiadm --mode node --portal x.x.x.x:3260,1 --targetname iqn.2014-12.com.redhat:libvirt.shyu-qe-consumption-auto --logout
Logging out of session [sid: 2, target: iqn.2014-12.com.redhat:libvirt.shyu-qe-consumption-auto, portal: x.x.x.x,3260]
Logout of [sid: 2, target: iqn.2014-12.com.redhat:libvirt.shyu-qe-consumption-auto, portal: x.x.x.x,3260] successful.

3. # virsh pool-destroy iscsi
Pool iscsi destroyed  <==== successfully destroyed

4. # virsh pool-list --all | grep iscsi
<==== nothing shows up

Comment 8 errata-xmlrpc 2015-11-19 05:57:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html