Bug 1387083 - [RFE] Managing an existing iSCSI storage domain by adding to its luns, extra ips (configuring duplicate paths) will not be reflected in storage_server_connections (engine DB)
Summary: [RFE] Managing an existing iSCSI storage domain by adding to its luns, extra ...
Keywords:
Status: CLOSED DUPLICATE of bug 977379
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.0.3
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Nobody
QA Contact: Shir Fishbain
URL:
Whiteboard:
Depends On: 1413379
Blocks:
 
Reported: 2016-10-20 03:34 UTC by Germano Veit Michel
Modified: 2023-09-15 01:25 UTC
CC List: 11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-23 07:48:32 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
Database Info (4.00 KB, text/plain)
2016-10-20 03:55 UTC, Germano Veit Michel
Added new target pointing to same LUN (67.51 KB, image/png)
2016-11-01 01:27 UTC, Germano Veit Michel
But doesn't show here (51.58 KB, image/png)
2016-11-01 01:27 UTC, Germano Veit Michel


Links:
Red Hat Knowledge Base (Solution) 3215831 (last updated 2020-10-05 20:08:38 UTC)

Internal Links: 1172354

Description Germano Veit Michel 2016-10-20 03:34:35 UTC
Description of problem:

This is the setup:

     [HOST]                          [STORAGE]
StorageNet0 <---192.168.100.0/24---> Nic0
StorageNet1 <---192.168.110.0/24---> Nic1
StorageNet2 <---192.168.120.0/24---> Nic2
StorageNet3 <---192.168.130.0/24---> Nic3

These are 4 isolated networks (same infra, on different vlans).

Initially, only the target on the 192.168.100.0/24 network is discovered, so the host sees the Storage Domain like this:

36001405b79bc3d5fa9e49c2a0761db8b dm-33 LIO-ORG ,storage2        
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 27:0:0:0 sdn 8:208 active ready running

tcp: [26] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)


All good.

Now I add the other 3 paths to the same target (110, 120 and 130) by going to the Storage tab, managing the storage2 domain, and logging in to 110.1, 120.1 and 130.1.

The host now shows this:

36001405b79bc3d5fa9e49c2a0761db8b dm-33 LIO-ORG ,storage2        
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 30:0:0:0 sdq 65:0  active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 29:0:0:0 sdp 8:240 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 28:0:0:0 sdo 8:224 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 27:0:0:0 sdn 8:208 active ready running

tcp: [26] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [27] 192.168.110.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [28] 192.168.120.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [29] 192.168.130.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)

All good.

Now I go to Data Center -> iSCSI Multipathing -> Add.

On the "Logical Networks" part, I select StorageNet0, StorageNet1, StorageNet2 and StorageNet3. (192.168.100.0/24, 110.0/24, 120.0/24 and 130.0/24)

On the "Storage Targets" section, I only see the target on 192.168.100.0/24, which is weird. Shouldn't it show the same IQN with 4 entries, one for each IP? Still, I continue and click OK.

Now the hypervisor logs in 4 more times to 192.168.100.1. There is no way it is using StorageNet1, StorageNet2 and StorageNet3 for this, because that IP is not reachable via those interfaces.
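
For context, what I expected the bond to drive is essentially one iface-bound session per logical network. A manual equivalent with iscsiadm would look roughly like the sketch below (illustrative only; which VLAN interface sits on which StorageNet is an assumption here, the iface names are taken from the ConnectStorageServerVDSCommand log further down):

# bind an iface record to the host interface on StorageNet1, then log in to the portal reachable on that network
iscsiadm -m iface -I storagenet1 --op new
iscsiadm -m iface -I storagenet1 --op update -n iface.net_ifacename -v eth3.400
iscsiadm -m discovery -t sendtargets -p 192.168.110.1:3260 -I storagenet1
iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 -p 192.168.110.1:3260 -I storagenet1 --login
# ...and likewise one iface/portal pair for StorageNet2 (120.1) and StorageNet3 (130.1)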

Result:

36001405b79bc3d5fa9e49c2a0761db8b dm-33 LIO-ORG ,storage2        
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 31:0:0:0 sdr 65:16 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 32:0:0:0 sds 65:32 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 33:0:0:0 sdt 65:48 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 30:0:0:0 sdq 65:0  active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 29:0:0:0 sdp 8:240 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 28:0:0:0 sdo 8:224 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 27:0:0:0 sdn 8:208 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 34:0:0:0 sdu 65:64 active ready running

tcp: [26] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [27] 192.168.110.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [28] 192.168.120.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [29] 192.168.130.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [30] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [31] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [32] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)
tcp: [33] 192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2 (non-flash)


Version-Release number of selected component (if applicable):
ovirt-engine-4.0.4
vdsm-4.18.11-1.el7ev.x86_64

How reproducible:
100%

Actual results:

Host logged in to 8 paths, 5 of which are duplicates:
# iscsiadm -m session | grep \:storage2 | cut -d ' ' -f 3,4
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.110.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.120.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.130.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
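
For reference, aggregating the same session list per portal makes the duplicates obvious; for the list above this gives:

# iscsiadm -m session | grep \:storage2 | cut -d ' ' -f 3 | sort | uniq -c
      5 192.168.100.1:3260,1
      1 192.168.110.1:3260,1
      1 192.168.120.1:3260,1
      1 192.168.130.1:3260,1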

Expected results:

Host logged in to 4 paths.

Additional info:

This is probably the result of that "Storage Targets" section showing only the initial connection target IP.

1) That window must show all 4 paths so they can be selected
OR
2) It should be smart enough to figure out that the connection IP for the target is NOT 192.168.100.1 via every interface; these are separate networks (and the Storage Domain

2016-10-19 22:59:24,857 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (org.ovirt.thread.pool-6-thread-47) [1eabbfa7] START, ConnectStorageServerVDSCommand(HostName = RHVH-30, StorageServerConnectionManagementVDSParameters:{runAsync='true', hostId='bf138f1d-8d64-4e4c-be16-cc119faea12b', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='ISCSI', connectionList='

[StorageServerConnections:{id='6c719976-d34c-4095-b5ff-a3a9f6ba4f94', connection='192.168.100.1', iqn='iqn.2003-01.org.linux-iscsi.storage.x8664:storage2', vfsType='null', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='eth2.300', netIfaceName='eth2.300'}, 

StorageServerConnections:{id='c43d115e-842a-4536-b8e7-45bea5090756', connection='192.168.100.1', iqn='iqn.2003-01.org.linux-iscsi.storage.x8664:storage2', vfsType='null', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='eth3.400', netIfaceName='eth3.400'}, 

StorageServerConnections:{id='153020b9-0f7f-4e82-a6c4-e532bca97d9b', connection='192.168.100.1', iqn='iqn.2003-01.org.linux-iscsi.storage.x8664:storage2', vfsType='null', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='eth4.500', netIfaceName='eth4.500'}, 

StorageServerConnections:{id='ce6c3c61-0438-42c5-9e19-51247d900001', connection='192.168.100.1', iqn='iqn.2003-01.org.linux-iscsi.storage.x8664:storage2', vfsType='null', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='eth5.600', netIfaceName='eth5.600'}]'}), log id: 544a1bf2
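
(Side note: whether the extra sessions to 192.168.100.1 really ended up bound to different ifaces can be checked on the host with a more verbose session listing, for example:

# iscsiadm -m session -P 1 | egrep 'Target:|Current Portal:|Iface Name:'

which prints the portal and the bound iface name for every session.)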

If I remove the iSCSI bond, the host remains logged in to all 8 paths.

If I reboot the host, the number of connections, after activation, falls to 4 again (as expected):

36001405b79bc3d5fa9e49c2a0761db8b dm-33 LIO-ORG ,storage2        
size=50G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 16:0:0:0 sdo 8:224 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 17:0:0:0 sdp 8:240 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 18:0:0:0 sdq 65:0  active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 6:0:0:0  sde 8:64  active ready running

However, it's again connecting 4 times using the same network, not 1 connection per network.

# iscsiadm -m session | grep \:storage2 | cut -d ' '  -f3,4
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2
192.168.100.1:3260,1 iqn.2003-01.org.linux-iscsi.storage.x8664:storage2

This matches the 4 new connections created during the bond setup, which is why it ended up with 5 connections to the same IP.

Comment 1 Germano Veit Michel 2016-10-20 03:55:27 UTC
Created attachment 1212322 [details]
Database Info

Some relevant Data from the DB

Comment 2 Maor 2016-10-30 00:13:30 UTC
Hi Germano,

I vaguely remember a similar issue with an iSCSI SD which used multiple paths to the same target; I will try to look for it.

Does it only reproduce for you when you use the same IQN with several paths? What happens when you use a different IQN for each path?

Comment 3 Germano Veit Michel 2016-10-31 01:57:38 UTC
Hi Maor,

I did set up a different IQN for each path. But now it seems wrong even earlier in the process.

In "iSCSI Multipathing -> Add", it just shows the initial IQN, not the additional ones, even after logging in into them in the Storage Tab.

If I go ahead, select the 4 Networks and that single IQN, the result is just one path per host, using the first FQDN and network.

Summary:

Same IQN in all 4 Networks:
-> 4 Logins to the same IQN in the same network

Different IQN per Network:
-> 1 Login to the original IQN in the first network, nothing else

The versions are still the same.

Comment 4 Maor 2016-10-31 09:09:16 UTC
(In reply to Germano Veit Michel from comment #3)
> Hi Maor,
> 
> I did setup a different IQN for each path. But now I seems wrong even
> earlier in the process. 
> 
> In "iSCSI Multipathing -> Add", it just shows the initial IQN, not the
> additional ones, even after logging in into them in the Storage Tab.
> 

Once you edit the iSCSI storage domain and add a new target to it, the connect process might take some time.
You should wait for an audit log saying:
"Storage {storage_name} has been extended by admin@internal. Please wait for refresh."

Can you try to log in to the targets in the Storage tab, wait for the audit log (maybe also verify it by managing the storage domain and checking that all the LUNs that were added are indeed checked), and only then try to add the new iSCSI bond?

Comment 5 Germano Veit Michel 2016-11-01 01:26:49 UTC
Hi Maor,

Now I am not sure I understand what you are requesting. 

You want me to add N IQNs pointing to the same LUN, right? Or are you suggesting extending the SD by adding N IQNs that point to new LUNs? The second doesn't make much sense to me.

Anyway, the Storage Domain has two LUNs added to it, successfully, but it still doesn't show up in the new iSCSI bond dialog, as you can see in the attachments.

Is this what you want me to test?

Thanks

Comment 6 Germano Veit Michel 2016-11-01 01:27:28 UTC
Created attachment 1215960 [details]
Added new target pointing to same LUN

Comment 7 Germano Veit Michel 2016-11-01 01:27:54 UTC
Created attachment 1215961 [details]
But doesn't show here

Comment 8 Germano Veit Michel 2016-11-01 01:30:42 UTC
(In reply to Germano Veit Michel from comment #5)
> Anyway, the Storage Domain has two LUNs added to it, sucessfully, but it
> still doesn't show up in the new iSCSI bond dialog, as you can see in the
> attachments.

Should read "... has two different IQN pointing to the same LUN ..."

That's what I understood from here:
(In reply to Maor from comment #2)
> What happens when you use different IQN for each path?

Comment 9 Maor 2016-11-07 21:47:31 UTC
Hi Germano,

Thanks for the screenshots, they were very helpful.
Looking at the code, it seems that the targets shown in the GUI when adding iSCSI multipath are fetched using the Postgres function GetStorageConnectionsByStorageTypeAndStatus, while the LUNs and targets shown when managing a domain come from getDeviceList, fetched from VDSM.
I will try to reproduce this on my env.

Meanwhile, if you still have access to the same env, can you please share the output of the following SQL queries:

SELECT * from lun_storage_server_connection_map;

SELECT * FROM luns;

SELECT * FROM storage_server_connections;
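
(These can be run on the engine machine with psql; a minimal sketch, assuming the default 'engine' database name and local access as the postgres user:

# su - postgres -c "psql engine -c 'SELECT * FROM storage_server_connections;'"

and the same for the other two tables.)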

Comment 10 Germano Veit Michel 2016-11-08 05:59:05 UTC
Hi Maor,

Hmmm, from what I can see the problem is that it ignores the additional IQN (from the test you suggested) because it maps to the same LUN. So it doesn't show up in the storage_server_connections table? Could this also be the root cause of the initial issue in this BZ?

Please find the attached logs and latest screenshot.

Thanks,
Germano

Comment 12 Maor 2016-11-08 10:13:20 UTC
Can you please attach the VDSM and engine logs which include the connect operation?
I'm wondering whether getDeviceList returned the LUNs with the other IPs or whether it only returns one default IP every time.

Comment 14 Germano Veit Michel 2016-12-12 06:06:25 UTC
Hi Maor,

First of all sorry for the huge delay.

I've reproduced this again on a brand new 4.1 ovirt that I just installed to start playing with 4.1. So the problem is still there. I hope you don't mind it being upstream and 4.1.
I'll try to be as clear as possible.

This is the new scenario, which is basically the same as the previous one, but smaller:

   [HOST]                            [STORAGE]
ovirtmgmt <--- 192.168.41.0/24  ---> eth0
  storage <--- 192.168.141.0/24 ---> eth1
 storage2 <--- 192.168.241.0/24 ---> eth2

This is a self-contained setup, which doesn't come up by itself (not sure why yet, but it doesn't activate the master by itself on power-on), so please IGNORE all logs before:

VDSM:    2016-12-12 15:37:36
ENGINE:  2016-12-12 05:36:17,318Z

After those time the environment is correctly up. Going forward:

1. Initially, the iSCSI storage domain was only added via the ovirtmgmt network.

3600140576174f9732e4489d82eb5ca31 dm-11 LIO-ORG ,storage         
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 3:0:0:0 sdb 8:16 active ready running

2. The host already has IP address on both storage networks and ovirtmgmt, so does the storage server.

        [HOST]                            [STORAGE]
 192.168.41.20 <--- 192.168.41.0/24  ---> 192.168.41.2
192.168.141.20 <--- 192.168.141.0/24 ---> 192.168.141.2
192.168.241.20 <--- 192.168.241.0/24 ---> 192.168.241.2

3. Storage Server has 2 IQNs for the same LUN:

  o- iqn.2003-01.rhev.storage:storage.storage ..................................... [TPGs: 1]
  | o- tpg1 ............................................................. [gen-acls, no-auth]
  |   o- acls ..................................................................... [ACLs: 2]
  |   o- luns ..................................................................... [LUNs: 1]
  |   | o- lun0 ........................................[fileio/storage (/var/iscsi/storage)]
  |   o- portals ............................................................... [Portals: 1]
  |     o- 0.0.0.0:3260 ................................................................ [OK]
  o- iqn.2003-01.rhev.storage:storage.storage.alt ................................. [TPGs: 1]
    o- tpg1 .......................................................... [no-gen-acls, no-auth]
      o- acls ..................................................................... [ACLs: 2]
      o- luns ..................................................................... [LUNs: 1]
      | o- lun0 ....................................... [fileio/storage (/var/iscsi/storage)]
      o- portals ............................................................... [Portals: 1]

4. I discover the storage.storage.alt target via Admin Portal

* I wanted to discover storage.storage via 141.2 (it's already discovered via 41.2).
    Problem: I cannot discover storage.storage via a different IP address; it simply gets ignored.
    Workaround: I add another target, alt2, as below

  o- iqn.2003-01.rhev.storage:storage.storage.alt2 ................................ [TPGs: 1]
    o- tpg1 .......................................................... [no-gen-acls, no-auth]
      o- acls ..................................................................... [ACLs: 2]
      o- luns ..................................................................... [LUNs: 1]
      | o- lun0 ....................................... [fileio/storage (/var/iscsi/storage)]
      o- portals ............................................................... [Portals: 1]
        o- 0.0.0.0:3260 ................................................................ [OK]
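
(For reference, creating the alt2 target boils down to something like the following targetcli sketch; this is an illustration rather than the exact commands, and it reuses the existing fileio/storage backstore:

/> /iscsi create iqn.2003-01.rhev.storage:storage.storage.alt2
/> /iscsi/iqn.2003-01.rhev.storage:storage.storage.alt2/tpg1/luns create /backstores/fileio/storage

and, optionally, to bind the portal to its specific IP instead of the 0.0.0.0 shown above, which would also make the intended per-network path explicit:

/> /iscsi/iqn.2003-01.rhev.storage:storage.storage.alt2/tpg1/portals delete 0.0.0.0 3260
/> /iscsi/iqn.2003-01.rhev.storage:storage.storage.alt2/tpg1/portals create 192.168.141.2 3260

The two ACLs shown in the listing are created separately.)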

So now the idea is:

        [HOST]                            [STORAGE]
 192.168.41.20 <--- 192.168.41.0/24  ---> 192.168.41.2    [storage.storage] * 
192.168.141.20 <--- 192.168.141.0/24 ---> 192.168.141.2   [storage.storage.alt2]
192.168.241.20 <--- 192.168.241.0/24 ---> 192.168.241.2   [storage.storage.alt]

iqn.2003-01.rhev.storage:storage.storage      is via 192.168.41.2 *
iqn.2003-01.rhev.storage:storage.storage.alt2 is via 192.168.141.2
iqn.2003-01.rhev.storage:storage.storage.alt  is via 192.168.241.2

* Won't be part of the iSCSI multipath setup; it's a required network.

Result: see screenshot1.png

5. In the Data Center tab, I configure iSCSI Multipathing (see screenshot2.png)
   * Here the alternative storage.storage.alt is not showing up, although it should

6. Anyway, I go ahead and click OK. This is the result:

On the host (H2), we again have 3 connections to the same target:
tcp: [2] 192.168.41.2:3260,1 iqn.2003-01.rhev.storage:storage.storage (non-flash)
tcp: [3] 192.168.41.2:3260,1 iqn.2003-01.rhev.storage:storage.storage (non-flash)
tcp: [4] 192.168.41.2:3260,1 iqn.2003-01.rhev.storage:storage.storage (non-flash)

[2] is the original connection (via ovirtmgmt)
[3] and [4] were after I clicked OK in screenshot2.

3600140576174f9732e4489d82eb5ca31 dm-11 LIO-ORG ,storage         
size=100G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 4:0:0:0 sdc 8:32 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 5:0:0:0 sdd 8:48 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 3:0:0:0 sdb 8:16 active ready running

And on the host that was used to discover the alt and alt2 targets (H1), we have 5:

tcp: [2] 192.168.41.2:3260,1 iqn.2003-01.rhev.storage:storage.storage (non-flash)
tcp: [3] 192.168.241.2:3260,1 iqn.2003-01.rhev.storage:storage.storage.alt (non-flash)
tcp: [4] 192.168.141.2:3260,1 iqn.2003-01.rhev.storage:storage.storage.alt2 (non-flash)
tcp: [5] 192.168.41.2:3260,1 iqn.2003-01.rhev.storage:storage.storage (non-flash)
tcp: [6] 192.168.41.2:3260,1 iqn.2003-01.rhev.storage:storage.storage (non-flash)

[2] is the original connection (via ovirtmgmt)
[3] and [4] were created when discovering the alternate targets, not removed later
[5] and [6] were added after I clicked OK in screenshot2.

3600140576174f9732e4489d82eb5ca31 dm-11 LIO-ORG ,storage         
size=100G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 6:0:0:0 sde 8:64 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 5:0:0:0 sdd 8:48 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 4:0:0:0 sdc 8:32 active ready running
|-+- policy='service-time 0' prio=1 status=enabled
| `- 3:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
  `- 7:0:0:0 sdf 8:80 active ready running

In summary, these are the problems:
1. The alternate targets via different IP addresses don't show up when configuring iSCSI multipathing (screenshot2).
2. The connection via ovirtmgmt (192.168.41.0/24) was not removed when I set the multipath bond to use just the storage and storage2 networks.
3. The host used to log in to the alternate paths remains logged in (why?), even though it was only used to discover the paths (before the multipath configuration).
4. Going ahead with the config and ignoring [1] results in N paths logged in via the same network to the same IQN.

What I actually wanted was this:

        [HOST]                            [STORAGE]
192.168.141.20 <--- 192.168.141.0/24 ---> 192.168.141.2   [storage.storage.alt2]
192.168.241.20 <--- 192.168.241.0/24 ---> 192.168.241.2   [storage.storage.alt]

So two connections per host, via storage and storage2 networks. Nothing else.

I'm attaching the logs. I hope this clarifies it.

If you want to split this in different bugs please let me know.

Cheers,

Comment 21 Maor 2016-12-12 12:34:12 UTC
Thanks for the logs and the screenshots.
Though I'm not sure the engine log is in sync with the VDSM logs:
VDSM logs are from 2016-12-12 around 14:00, while the engine log ends at 2016-12-12 05:00.
Can you please share the full engine log?

Going over the logs so far, it looks like two of the targets encountered problems (10.35.141.1 and 10.35.241.1).
It also seems that LVMVolumeGroup.getInfo is always returning 192.168.41.2 (see [1]).

10.35.141.1 could not be discovered at all (see [2]).
10.35.241.1 does get discovered (see [3]), although it encountered a problem creating the VG (see [4]).

I don't think the problem is with the iSCSI bond but rather in the new storage domain creation process, since the iSCSI bond simply uses what is in the storage_server_connections table, which gets initialized when the new Storage Domain is created.

While attaching the full engine logs, could you also share the iscsiadm output from your host for the discovery of and login to those targets (141 and 241)?


[1]
2016-12-12 15:36:16,573 INFO  (jsonrpc/7) [dispatcher] Run and protect: getVGInfo, Return response: {'info': {'state': 'OK', 'vgsize': '106971529216', 'name': 'e0ab4e0d-dd27-4e62-a545-007ed77a4db1', 'vgfree': '74625056768', 'vgUUID': 'z3Epu7-X2Po-n3tK-dyxI-CDGF-ja1d-LIiVWK', 'pvlist': [{'vendorID': 'LIO-ORG', 'capacity': '106971529216', 'fwrev': '0000', 'discard_zeroes_data': 0, 'vgUUID': 'z3Epu7-X2Po-n3tK-dyxI-CDGF-ja1d-LIiVWK', 'pathlist': [{'connection': '192.168.41.2', 'iqn': 'iqn.2003-01.rhev.storage:storage.hosted', 'portal': '1', 'port': '3260', 'initiatorname': 'default'}], 'discard_max_bytes': 0, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sda', 'capacity': '107374182400', 'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI', 'pvUUID': 'eslcvz-Z2vd-YcBg-8AxT-HhFt-R4QQ-AM3x8E', 'serial': 'SLIO-ORG_hosted_storage_d027231c-25e5-42d1-8020-54d1f144705e', 'GUID': '36001405d027231c25e542d1802054d1f', 'devcapacity': '107374182400', 'productID': 'hosted_storage'}], 'type': 3, 'attr': {'allocation': 'n', 'partial': '-', 'exported': '-', 'permission': 'w', 'clustered': '-', 'resizeable': 'z'}}} (logUtils:52)
2016-12-12 15:36:16,574 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call LVMVolumeGroup.getInfo succeeded in 0.77 seconds (__init__:515)



[2] 
2016-12-12 05:08:11,616Z ERROR [org.ovirt.engine.core.bll.storage.connection.DiscoverSendTargetsQuery] (default task-16) [07a96db4-8d75-465c-8644-a546263c85bf] Query 'DiscoverSendTargetsQuery' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed in vdscommand to DiscoverSendTargetsVDS, error = Failed discovery of iSCSI targets: u"portal=192.168.141.2:3260, err=(21, [], ['iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: connection login retries (reopen_max) 5 exceeded', 'iscsiadm: No portals found'])" (Failed with error iSCSIDiscoveryError and code 475)
2016-12-12 05:08:11,616Z ERROR [org.ovirt.engine.core.bll.storage.connection.DiscoverSendTargetsQuery] (default task-16) [07a96db4-8d75-465c-8644-a546263c85bf] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed in vdscommand to DiscoverSendTargetsVDS, error = Failed discovery of iSCSI targets: u"portal=192.168.141.2:3260, err=(21, [], ['iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: cannot make connection to 192.168.141.2: No route to host', 'iscsiadm: connection login retries (reopen_max) 5 exceeded', 'iscsiadm: No portals found'])" (Failed with error iSCSIDiscoveryError and code 475)



[3]
2016-12-12 14:30:14,059 INFO  (jsonrpc/6) [dispatcher] Run and protect: getDeviceList, Return response: {'devList': [{'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '107374182400', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': 'z3Epu7-X2Po-n3tK-dyxI-CDGF-ja1d-LIiVWK', 'pvsize': '106971529216', 'pathlist': [{'connection': '192.168.41.2', 'iqn': 'iqn.2003-01.rhev.storage:storage.hosted', 'portal': '1', 'port': '3260', 'initiatorname': 'default'}], 'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sda', 'capacity': '107374182400', 'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': 'eslcvz-Z2vd-YcBg-8AxT-HhFt-R4QQ-AM3x8E', 'serial': 'SLIO-ORG_hosted_storage_d027231c-25e5-42d1-8020-54d1f144705e', 'GUID': '36001405d027231c25e542d1802054d1f', 'productID': 'hosted_storage'}, {'status': 'unknown', 'vendorID': 'LIO-ORG', 'capacity': '107374182400', 'fwrev': '4.0', 'discard_zeroes_data': 0, 'vgUUID': '', 'pvsize': '', 'pathlist': [{'connection': '192.168.41.2', 'iqn': 'iqn.2003-01.rhev.storage:storage.storage', 'portal': '1', 'port': '3260', 'initiatorname': 'default'}], 'logicalblocksize': '512', 'discard_max_bytes': 0, 'pathstatus': [{'type': 'iSCSI', 'physdev': 'sdb', 'capacity': '107374182400', 'state': 'active', 'lun': '0'}], 'devtype': 'iSCSI', 'physicalblocksize': '512', 'pvUUID': '', 'serial': 'SLIO-ORG_storage_76174f97-32e4-489d-82eb-5ca31eafc218', 'GUID': '3600140576174f9732e4489d82eb5ca31', 'productID': 'storage'}]} (logUtils:52)
2016-12-12 14:30:14,060 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call Host.getDeviceList succeeded in 0.32 seconds (__init__:515)



[4]
2016-12-12 14:30:24,409 INFO  (jsonrpc/5) [dispatcher] Run and protect: createVG(vgname=u'd0420dfe-c0c5-4cc6-9f3b-3b01ef2f4fff', devlist=[u'3600140576174f9732e4489d82eb5ca31'], force=False, options=None) (logUtils:49)
2016-12-12 14:30:24,599 INFO  (jsonrpc/5) [dispatcher] Run and protect: createVG, Return response: {'uuid': 'h3XrlB-Ih1V-YuwM-rAsg-vHzJ-rauG-MZkYH3'} (logUtils:52)
2016-12-12 14:30:24,600 INFO  (jsonrpc/5) [jsonrpc.JsonRpcServer] RPC call LVMVolumeGroup.create succeeded in 0.19 seconds (__init__:515)
2016-12-12 14:30:25,610 INFO  (jsonrpc/0) [dispatcher] Run and protect: createStorageDomain(storageType=3, sdUUID=u'd0420dfe-c0c5-4cc6-9f3b-3b01ef2f4fff', domainName=u'iSCSI', typeSpecificArg=u'h3XrlB-Ih1V-YuwM-rAsg-vHzJ-rauG-MZkYH3', domClass=1, domVersion=u'4', options=None) (logUtils:49)
2016-12-12 14:30:25,611 ERROR (jsonrpc/0) [storage.StorageDomainCache] looking for unfetched domain d0420dfe-c0c5-4cc6-9f3b-3b01ef2f4fff (sdc:151)
2016-12-12 14:30:25,611 ERROR (jsonrpc/0) [storage.StorageDomainCache] looking for domain d0420dfe-c0c5-4cc6-9f3b-3b01ef2f4fff (sdc:168)
2016-12-12 14:30:25,613 ERROR (jsonrpc/0) [storage.StorageDomainCache] domain d0420dfe-c0c5-4cc6-9f3b-3b01ef2f4fff not found (sdc:157)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 155, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 185, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'd0420dfe-c0c5-4cc6-9f3b-3b01ef2f4fff',)


InaccessiblePhysDev: Multipath cannot access physical device(s): "devices=(u'3600140576174f9732e4489d82eb5ca31',)"

Comment 22 Germano Veit Michel 2016-12-12 23:48:07 UTC
Hi Maor,

I actually had a problem during the test. The storage server somehow got a wrong IP address from the DHCP server on the 192.168.141.0/24 network. That's why you see all those failed attempts in [2]. Also, the host is 10 hours ahead of the manager.

The logs are indeed confusing this way; I should have repeated the test, sorry. I'll clean up everything, reset the logs, sync the time, and upload everything again.

Please hold.

Comment 24 Maor 2017-01-02 21:26:24 UTC
Hi Germano, thanks again for the logs.

It seems from the logs that you had 2 hosts in your setup h2.rhv41.rhev and h1.rhv41.rhev, although only h1.rhv41.rhev appears in the logs and the call for getDeviceList has been done from h2.rhv41.rhev (If you can add its log as well it might help)

From what I saw until now in the DB logs, I think this bug is not related to the iSCSI bond but to managing the iSCSI Storage Domain, since the storage domain in the DB does not contain any of the additional IPs: 192.168.141.2, 192.168.241.2, so this might be part of the adding/updating storage domain process.

Does it only reproduce for you if you edit an existing Storage Domain or does it also reproduce when adding a new Storage Domain?

Comment 25 Maor 2017-01-05 14:15:03 UTC
(In reply to Maor from comment #24)
> Hi Germano, thanks again for the logs.
> 
> It seems from the logs that you had 2 hosts in your setup h2.rhv41.rhev and
> h1.rhv41.rhev, although only h1.rhv41.rhev appears in the logs and the call
> for getDeviceList has been done from h2.rhv41.rhev (If you can add its log
> as well it might help)
> 
> From what I saw until now in the DB logs, I think this bug is not related to
> the iSCSI bond but to managing the iSCSI Storage Domain, since the storage
> domain in the DB does not contain any of the additional IPs: 192.168.141.2,
> 192.168.241.2, so this might be part of the adding/updating storage domain
> process.
> 
> Does it only reproduce for you if you edit an existing Storage Domain or
> does it also reproduce when adding a new Storage Domain?

Hi Germano,

I think I finally succeeded to reproduce the issue, although I only succeeded to reproduce it once I edit an existing Storage Domain and add other ips.
New iSCSI storage domain with multiple ips works fine for me.

Please let me know if that is the scenario of the bug you mentioned, and I will start to work on a fix

Comment 26 Maor 2017-01-05 15:09:41 UTC
(In reply to Maor from comment #25)
> (In reply to Maor from comment #24)
> > Hi Germano, thanks again for the logs.
> > 
> > It seems from the logs that you had 2 hosts in your setup h2.rhv41.rhev and
> > h1.rhv41.rhev, although only h1.rhv41.rhev appears in the logs and the call
> > for getDeviceList has been done from h2.rhv41.rhev (If you can add its log
> > as well it might help)
> > 
> > From what I saw until now in the DB logs, I think this bug is not related to
> > the iSCSI bond but to managing the iSCSI Storage Domain, since the storage
> > domain in the DB does not contain any of the additional IPs: 192.168.141.2,
> > 192.168.241.2, so this might be part of the adding/updating storage domain
> > process.
> > 
> > Does it only reproduce for you if you edit an existing Storage Domain or
> > does it also reproduce when adding a new Storage Domain?
> 
> Hi Germano,
> 
> I think I finally succeeded to reproduce the issue, although I only
> succeeded to reproduce it once I edit an existing Storage Domain and add
> other ips.
> New iSCSI storage domain with multiple ips works fine for me.
> 
> Please let me know if that is the scenario of the bug you mentioned, and I
> will start to work on a fix

Basically, deactivating + activating the storage domain (or moving the host to maintenance and activating it) should refresh the networks that were added after the storage domain was already activated.

Looking at the code, the storage domain will be updated through VDSM only if LUNs were added to it or a LUN size refresh is called; otherwise the engine will only update the description of the storage domain.

We can add a check for new networks every time a user finishes managing a storage domain, although I don't really like it considering the latency issue it might introduce...
any other suggestions?

Comment 27 Germano Veit Michel 2017-01-09 05:46:48 UTC
(In reply to Maor from comment #25)
> Hi Germano,
> 
> I think I finally succeeded to reproduce the issue, although I only
> succeeded to reproduce it once I edit an existing Storage Domain and add
> other ips.
> New iSCSI storage domain with multiple ips works fine for me.
> 
> Please let me know if that is the scenario of the bug you mentioned, and I
> will start to work on a fix

Good catch! I only tried editing (as you can see in the BZ description). 

A quick test here also shows that it works if I add it in the first place with multiple IPs.

(In reply to Maor from comment #26)
> Basically, deactivating + activating the storage domain (or moving the host
> to maintenance and activating it) should refresh the networks that were
> added after the storage domain was already activated.
> 
> Looking at the code, the storage domain will be updated through VDSM only if
> LUNs were added to it or a LUN size refresh is called; otherwise the
> engine will only update the description of the storage domain.
> 
> We can add a check for new networks every time a user finishes managing
> a storage domain, although I don't really like it considering the latency
> issue it might introduce...
> any other suggestions?

By latency you mean the time it will take to push the new configuration, not any storage related latency problem for the VMs, right? If it's only an engine/admin portal latency, is it really a problem? I believe it's still better than having to put it in maintenance mode as it would lead to downtime.

Comment 28 Germano Veit Michel 2017-01-09 05:48:01 UTC
Oops, just realized that question wasn't for me. Sorry :)

Restoring NEEDINFO

Comment 29 Maor 2017-01-15 15:04:17 UTC
We have a proposed solution for 4.1 which includes automatic login to targets, although the best solution would be to discuss RFE bug 1413379.

Comment 30 Allon Mureinik 2017-01-16 08:37:48 UTC
The proposed patch would cause even more confusion, and make the system even harder to manage.
Pushing out to 4.2 so the root cause can be fixed (e.g., by bug 1413379).

Comment 32 Yaniv Kaul 2017-08-15 13:43:55 UTC
(In reply to Allon Mureinik from comment #30)
> The proposed patch would cause even more confusion, and make the system even
> harder to manage.
> Pushing out to 4.2 so the root cause can be fixed (e.g., by bug 1413379).

But bug 1413379 is not targeted to 4.2?

Comment 33 Germano Veit Michel 2018-08-23 05:49:42 UTC
Maor, this is blocked on a WONTFIX RFE?

Comment 37 Sandro Bonazzola 2019-01-28 09:41:17 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 46 Marina Kalinin 2020-10-05 20:08:46 UTC
We discussed this today with the Storage team and decided it is more of an RFE, since it requires a lot of work to implement this functionality and it depends on another RFE today, bz#1413379.

Comment 56 Sandro Bonazzola 2022-03-29 16:16:40 UTC
We are past 4.5.0 feature freeze, please re-target.

Comment 57 Michal Skrivanek 2022-05-04 14:15:44 UTC
Is multipath functionality covered by any QE tests now?

Comment 62 Arik 2022-05-24 07:42:34 UTC
For whoever gets here, in oVirt 4.5 (RHV 4.4 SP1) we've added a new dialog that enables editing the storage server connections

Comment 64 Red Hat Bugzilla 2023-09-15 01:25:07 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days

