Bug 766281

Summary: 3.1 - Extend storage domain sometimes fails when there is more than 1 host in DC
Product: Red Hat Enterprise Linux 6
Reporter: Rami Vaknin <rvaknin>
Component: vdsm
Assignee: Yeela Kaplan <ykaplan>
Status: CLOSED ERRATA
QA Contact: Leonid Natapov <lnatapov>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.3
CC: abaron, aburden, bazulay, danken, dron, dyasny, iheim, ilvovsky, jbiddle, lpeer, mgoldboi, mkenneth, Rhev-m-bugs, sputhenp, yeylon, ykaul, zdover
Target Milestone: beta
Keywords: Regression, Reopened, ZStream
Target Release: 6.3
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version: vdsm-4.9.6-37.0
Doc Type: Bug Fix
Doc Text:
Previously, extending a storage domain would occasionally fail when more than one host was present in the data center. There was not enough time to map the new LUN before getDevicesVisibility ran, so it returned 'false' and the backend disconnected the storage server. Now, getDevicesVisibility refreshes the storage using forceIScsiRescan, and it is possible to extend a storage domain in data centers containing more than one host.
Story Points: ---
Clone Of:
: 785779 (view as bug list)
Environment:
Last Closed: 2012-12-04 18:51:55 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 844180
Bug Blocks: 785779
Attachments (description / flags):
vdsm and backend logs (none)
logs (none)
logs with sleep (none)

Description Rami Vaknin 2011-12-11 13:14:43 UTC
Created attachment 545242 [details]
vdsm and backend logs

Versions:
Red Hat Enterprise Linux Server release 6.2 (Santiago)
vdsm-4.9-112.1.el6.x86_64

Scenario (reproduction steps):
1. Add a LUN to an existing storage server in a Data Center with at least 2 hosts.
2. The backend sends connectStorageServer to the HSM, which returns successfully, and then immediately sends getDevicesVisibility. That call returns False because multipath has not yet had time to map the new LUN, so the backend sends disconnectStorageServer.
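
For illustration only, a minimal Python sketch of the race as seen from the host (the helper names here are made up and this is not vdsm code; it assumes root access and uses the portal/IQN from the logs below):

import os
import subprocess
import time

def login_iscsi(portal, iqn):
    # Mirrors the iscsiadm login that connectStorageServer issues (see the vdsm log below).
    subprocess.check_call(["/sbin/iscsiadm", "-m", "node", "-T", iqn, "-l", "-p", portal])

def is_visible(guid):
    # Per comment #30, getDevicesVisibility effectively checks for the multipath map under /dev/mapper.
    return os.path.exists(os.path.join("/dev/mapper", guid))

login_iscsi("10.35.64.25:3260", "rvaknin_edu2")
print(is_visible("1rvaknin_edu2"))  # typically False: multipath has not created the map yet
time.sleep(5)                       # after a short delay the map usually appears
print(is_visible("1rvaknin_edu2"))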

From vdsm log:
Thread-311::INFO::2011-12-11 13:57:04,489::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: connectStorageServer, args: (domType=3, spUUID=4996348b-0199-435a-b01d-94cdf67d2d83, conList=[{'connection': '10.35.64.25', 'iqn': 'rvaknin_edu2', 'portal': '1', 'user': '', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000', 'port': '3260'}])
Thread-311::DEBUG::2011-12-11 13:57:04,490::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: moving from state init -> state preparing
Thread-311::INFO::2011-12-11 13:57:04,490::storage_connection::95::Storage.ServerConnection::(connect) Request to connect ISCSI storage server
Thread-311::INFO::2011-12-11 13:57:04,491::storage_connection::53::Storage.ServerConnection::(__validateConnectionParams) conList=[{'connection': '10.35.64.25', 'iqn': 'rvaknin_edu2', 'portal': '1', 'user': '', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000', 'port': '3260'}]
Thread-311::DEBUG::2011-12-11 13:57:04,491::iscsi::373::Storage.Misc.excCmd::(addiSCSIPortal) '/usr/bin/sudo -n /sbin/iscsiadm -m discoverydb -t sendtargets -p 10.35.64.25:3260 --discover' (cwd None)
Thread-311::DEBUG::2011-12-11 13:57:04,596::iscsi::373::Storage.Misc.excCmd::(addiSCSIPortal) SUCCESS: <err> = ''; <rc> = 0
Thread-311::DEBUG::2011-12-11 13:57:04,597::iscsi::471::Storage.Misc.excCmd::(addiSCSINode) '/usr/bin/sudo -n /sbin/iscsiadm -m node -T rvaknin_edu2 -l -p 10.35.64.25:3260' (cwd None)
Thread-311::DEBUG::2011-12-11 13:57:05,149::iscsi::471::Storage.Misc.excCmd::(addiSCSINode) SUCCESS: <err> = ''; <rc> = 0
Thread-311::DEBUG::2011-12-11 13:57:05,151::lvm::547::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,151::lvm::549::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,152::lvm::559::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,152::lvm::561::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,153::lvm::580::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,153::lvm::582::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,153::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: finished: {'statuslist': [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
Thread-311::DEBUG::2011-12-11 13:57:05,154::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: moving from state preparing -> state finished
Thread-311::DEBUG::2011-12-11 13:57:05,154::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-311::DEBUG::2011-12-11 13:57:05,155::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-311::DEBUG::2011-12-11 13:57:05,155::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: ref 0 aborting False
Thread-311::INFO::2011-12-11 13:57:05,156::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: connectStorageServer, Return response: {'status': {'message': 'OK', 'code': 0}, 'statuslist': [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
Thread-313::DEBUG::2011-12-11 13:57:05,166::clientIF::226::Storage.Dispatcher.Protect::(wrapper) [10.35.97.39]
Thread-313::INFO::2011-12-11 13:57:05,167::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: getDevicesVisibility, args: ( guids=['1rvaknin_edu2'])
Thread-313::DEBUG::2011-12-11 13:57:05,168::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: moving from state init -> state preparing
Thread-313::DEBUG::2011-12-11 13:57:05,168::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: finished: {'visible': {'1rvaknin_edu2': False}}
Thread-313::DEBUG::2011-12-11 13:57:05,169::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: moving from state preparing -> state finished
Thread-313::DEBUG::2011-12-11 13:57:05,169::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-313::DEBUG::2011-12-11 13:57:05,170::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-313::DEBUG::2011-12-11 13:57:05,170::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: ref 0 aborting False
Thread-313::INFO::2011-12-11 13:57:05,171::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: getDevicesVisibility, Return response: {'status': {'message': 'OK', 'code': 0}, 'visible': {'1rvaknin_edu2': False}}
Thread-314::DEBUG::2011-12-11 13:57:05,183::clientIF::226::Storage.Dispatcher.Protect::(wrapper) [10.35.97.39]
Thread-314::INFO::2011-12-11 13:57:05,184::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: disconnectStorageServer, args: (domType=3, spUUID=4996348b-0199-435a-b01d-94cdf67d2d83, conList=[{'connection': '10.35.64.25', 'iqn': 'rvaknin_edu2', 'portal': '1', 'user': '', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000', 'port': '3260'}])


From the backend's log:
2011-12-11 13:57:04,483 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) START, ConnectStorageServerVDSCommand(vdsId = 1690f03a-21b6-11e1-83e4-83b4cb727607, storagePoolId = 4996348b-0199-435a-b01d-94cdf67d2d83, storageType = ISCSI, connectionList = [{ id: null, connection: 10.35.64.25 };]), log id: 166f1f94
2011-12-11 13:57:05,157 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) FINISH, ConnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=0}, log id: 166f1f94
2011-12-11 13:57:05,161 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (http-0.0.0.0-8080-2) START, GetDevicesVisibilityVDSCommand(vdsId = 1690f03a-21b6-11e1-83e4-83b4cb727607, devicesIds=[1rvaknin_edu2]), log id: 21372d40
2011-12-11 13:57:05,172 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (http-0.0.0.0-8080-2) FINISH, GetDevicesVisibilityVDSCommand, return: {1rvaknin_edu2=false}, log id: 21372d40
2011-12-11 13:57:05,178 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) START, DisconnectStorageServerVDSCommand(vdsId = 1690f03a-21b6-11e1-83e4-83b4cb727607, storagePoolId = 4996348b-0199-435a-b01d-94cdf67d2d83, storageType = ISCSI, connectionList = [{ id: null, connection: 10.35.64.25 };]), log id: 577b558
2011-12-11 13:57:07,269 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) FINISH, DisconnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=0}, log id: 577b558
2011-12-11 13:57:07,275 ERROR [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (http-0.0.0.0-8080-2) Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand.
2011-12-11 13:57:07,285 WARN  [org.ovirt.engine.core.bll.storage.ExtendSANStorageDomainCommand] (http-0.0.0.0-8080-2) CanDoAction of action ExtendSANStorageDomain failed. Reasons:VAR__TYPE__STORAGE__DOMAIN,VAR__ACTION__EXTEND,ERROR_CANNOT_EXTEND_CONNECTION_FAILED

Comment 3 Dafna Ron 2012-02-26 09:49:36 UTC
Reproduced with vdsm-4.9-112.7.el6_2.x86_64.

Adding a new domain from multiple LUNs moves the HSM host to non-operational, because disconnectStorageServer is sent for the targets.

hsm: 

[root@orange-vdsd ~]# iscsiadm -m session
tcp: [1] 10.35.64.10:3260,1 Dafna-rhev1
tcp: [10] 10.35.64.10:3260,1 Daf-Lun1
tcp: [11] 10.35.64.10:3260,1 Daf-Lun2
tcp: [2] 10.35.64.10:3260,1 Dafna-rhev2

spm:

[root@blond-vdsh ~]# iscsiadm -m session
tcp: [22] 10.35.64.10:3260,1 Daf-Lun10
tcp: [24] 10.35.64.10:3260,1 Daf-Lun3
tcp: [25] 10.35.64.10:3260,1 Daf-Lun4
tcp: [26] 10.35.64.10:3260,1 Daf-Lun5
tcp: [27] 10.35.64.10:3260,1 Daf-Lun6
tcp: [28] 10.35.64.10:3260,1 Daf-Lun7
tcp: [29] 10.35.64.10:3260,1 Daf-Lun8
tcp: [30] 10.35.64.10:3260,1 Daf-Lun9
tcp: [31] 10.35.64.10:3260,1 Daf-Lun1
tcp: [32] 10.35.64.10:3260,1 Daf-Lun2
tcp: [7] 10.35.64.10:3260,1 Daf-Test5
tcp: [8] 10.35.64.10:3260,1 Daf-Test4
tcp: [9] 10.35.64.10:3260,1 Daf-Test3

Comment 4 Dafna Ron 2012-02-26 09:53:16 UTC
Created attachment 565869 [details]
logs

adding logs for vdsm-reg-4.9-112.7.el6_2.x86_64

Comment 6 Dafna Ron 2012-02-27 12:25:59 UTC
The cause of this issue is a race in connectStorageServer: it returns before device-mapper has mapped the new device, so when vdsm then runs LVM the device does not yet exist.

Comment 7 Dan Kenigsberg 2012-02-27 21:49:08 UTC
Some say that the cause of this bug is Engine's tendency to disconnect and re-connect to storage.

Dafna, would you verify that this is indeed the race by slowing down connectStorageServer?

+++ b/vdsm/storage/hsm.py
@@ -1808,6 +1808,7 @@ class HSM:
         # Connecting new device may change the visible storage domain list
         # so invalidate caches
         sdCache.invalidateStorage()
+        time.sleep(3.141592)
         return dict(statuslist=statusList)

Comment 8 Dafna Ron 2012-02-28 13:36:48 UTC
Created attachment 566306 [details]
logs with sleep

After adding the sleep we still get the same issue.
Attaching new logs and the amended hsm.py.

Comment 9 Dafna Ron 2012-02-28 16:04:21 UTC
After further investigation, the problem that I have is not the race.
The backend is sending /sbin/iscsiadm -m node -T to both hosts for only 2 out of the 5 LUNs, and since we do a manual connect to the SPM when adding the domain, the SPM is still logged in but the HSM is not.
Opening a new bug for the backend.

Comment 10 Ayal Baron 2012-02-28 21:13:40 UTC
(In reply to comment #9)
> after further investigation the problem that I have is not the race. 
> backend is ending to both hosts  /sbin/iscsiadm -m node -T only to 2 out of the
> 5 luns 

So why did you open a new bug instead of moving this bug to the backend? Why is it still open on vdsm?

Comment 11 Dafna Ron 2012-02-29 09:29:56 UTC
There is still a race that Rami found in vdsm, so I am not closing this bug.
My bug, although the end result is the same, is not the same bug.

Comment 12 Eduardo Warszawski 2012-05-01 14:52:37 UTC
Rami's bug is about a race with the multipath update process.
The result can be unstable over time: multipath devices can come and go with connectivity issues or configuration changes on the storage side.
The purpose of connectStorageServer and the later getDevicesVisibility is to know whether a particular device is reachable on the host.
Therefore it is better to do successive polls with a timeout from the engine, since if the device is not reachable the subsequent extend should be avoided.
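
A minimal sketch of that polling approach, assuming a hypothetical get_devices_visibility callable that returns the same {guid: visible} mapping as the verb in the logs above (the real caller is the engine, which is Java; this is only an illustration):

import time

def wait_for_devices(get_devices_visibility, guids, timeout=30, interval=2):
    # Poll until every GUID is visible or the timeout expires.
    deadline = time.time() + timeout
    while True:
        visible = get_devices_visibility(guids)
        if all(visible.get(g, False) for g in guids):
            return True
        if time.time() >= deadline:
            return False  # the subsequent extend should then be avoided
        time.sleep(interval)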

Comment 14 Dan Kenigsberg 2012-05-09 10:11:02 UTC
*** Bug 785779 has been marked as a duplicate of this bug. ***

Comment 19 RHEL Program Management 2012-07-26 12:56:48 UTC
Quality Engineering Management has reviewed and declined this request.
You may appeal this decision by reopening this request.

Comment 22 Ayal Baron 2012-08-22 11:10:24 UTC
Should be mitigated by bug 844180 - getDeviceList takes too long due to synchronous scsi-rescan that is taking too long.

Comment 23 Rami Vaknin 2012-09-03 20:05:13 UTC
Could you please explain how it should be mitigated by a bug that relates to scsi-rescan response time?

Comment 24 Ayal Baron 2012-09-06 09:20:30 UTC
(In reply to comment #23)
> Could you please explain how should it be mitigated by bug that related to
> scsi-rescan response time?

The patch for scsi-rescan basically waits a little longer and will make this scenario highly unlikely.

Comment 25 Rami Vaknin 2012-09-06 10:41:22 UTC
getDevicesVisibility does not use forceIScsiScan(), so do you mean that connectStorageServer uses it?

Comment 26 Ayal Baron 2012-09-06 11:21:48 UTC
(In reply to comment #25)
> getDevicesVisibility does not use forceIScsiScan(), so you mean that
> connectStorageServer use it?

You are correct, which means the issue stated in comment #12 is incorrect.
connectStorageServer is irrelevant; if we're using FCP or the like then there will be no connect.

The title is wrong; the scenario is much simpler.
First, looking at comment #1, you're not creating a new domain, you're extending an existing domain
(the domain creation flow does not call getDevicesVisibility anywhere).
Second, when extending a domain in a multi-host environment, the non-SPM hosts will not 'see' the new LUN if it was added after the last refresh, because no scsi-rescan is performed.

The fix here is that getDevicesVisibility should refresh the storage before checking for the existence of the LUN (this can be done in the error flow only, to avoid redundant refreshes).
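
A rough sketch of that proposal (not the actual patch; the rescan helper name is an assumption taken from the Doc Text, and the /dev/mapper check stands in for vdsm's real visibility check):

import os

def get_devices_visibility(guids, force_iscsi_rescan):
    # force_iscsi_rescan is a hypothetical callable standing in for vdsm's
    # iSCSI rescan (called forceIScsiRescan in the Doc Text).
    def check():
        return dict((guid, os.path.exists(os.path.join("/dev/mapper", guid)))
                    for guid in guids)

    visible = check()
    if not all(visible.values()):
        # Error flow only: refresh the storage so LUNs added after the last
        # rescan show up, then check again.
        force_iscsi_rescan()
        visible = check()
    return visible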

Comment 27 Dafna Ron 2012-09-06 11:24:56 UTC
You can also create a new domain with multiple LUNs; that operation will fail as well.

Comment 28 Ayal Baron 2012-09-06 11:30:34 UTC
(In reply to comment #27)
> you can also create a new domain and create it with multiple luns - that
> operation will fail as well.

That is a different issue, which indeed should be mitigated by bug 844180 (create storage domain happens on a single host only; the activate storage domain flow calls refreshStoragePool, which does refresh the storage).

Comment 30 Rami Vaknin 2012-09-06 12:18:13 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > getDevicesVisibility does not use forceIScsiScan(), so you mean that
> > connectStorageServer use it?
> 
> You are correct which means the issue stated in comment #12 is incorrect.
> connectStorageServer is irrelevant, if we're using FCP or the like then
> there will be no connect.
> 
> The title is wrong, the scenario is much simpler.

Your title is much better.

> First from looking at comment #1, you're not creating a new domain, you're
> extending an existing domain.

Right.

> (domain creation flow does not call getDevicesVisibility anywheren).
> Second, when extending a domain in a multi-host environment the non-spm
> hosts will not 'see' the new LUN in case it was added after the last refresh
> because no scsi-rescan is performed.

scsi-rescan is not related here.

The issue is that getDevicesVisibility just looks in /dev/mapper/ for the newly connected iSCSI devices, but it happens so fast that device-mapper has not mapped them yet.

> 
> The fix here is that getDevicesVisibility should refresh the storage before
> checking for existence of the LUN (can do it in error flow only to avoid
> redundant refreshes).

I don't know what it means to "refresh the storage", but probably any operation here will solve the issue, since by then the devices will probably already be mapped thanks to the small extra delay.
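
For context, "refreshing the storage" generally means forcing the SCSI/iSCSI layer to rescan for new LUNs. A minimal illustration of what such a rescan looks like on the host (standard Linux mechanisms, not vdsm's implementation):

import glob
import subprocess

def rescan_scsi_hosts():
    # Ask every SCSI host to scan its buses for new devices (requires root).
    for scan_path in glob.glob("/sys/class/scsi_host/host*/scan"):
        with open(scan_path, "w") as f:
            f.write("- - -")

def rescan_iscsi_sessions():
    # Ask open-iscsi to rescan all logged-in sessions for new LUNs.
    subprocess.check_call(["/sbin/iscsiadm", "-m", "session", "--rescan"])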

Comment 31 Ayal Baron 2012-09-11 07:02:07 UTC
(In reply to comment #30)
> (In reply to comment #26)
> > (In reply to comment #25)
> > > getDevicesVisibility does not use forceIScsiScan(), so you mean that
> > > connectStorageServer use it?
> > 
> > You are correct which means the issue stated in comment #12 is incorrect.
> > connectStorageServer is irrelevant, if we're using FCP or the like then
> > there will be no connect.
> > 
> > The title is wrong, the scenario is much simpler.
> 
> Your title is much better.
> 
> > First from looking at comment #1, you're not creating a new domain, you're
> > extending an existing domain.
> 
> Right.
> 
> > (domain creation flow does not call getDevicesVisibility anywheren).
> > Second, when extending a domain in a multi-host environment the non-spm
> > hosts will not 'see' the new LUN in case it was added after the last refresh
> > because no scsi-rescan is performed.
> 
> scsi-rescan is not related here.

Without a scsi-rescan the devices will not be visible to the system; it won't matter how long we wait.

> 
> The issue is that getDevicesVisibility just looks at /dev/mapper/ for the
> newly connected iscsi devices but it happens so fast that device mapper have
> not mapped them yet.
> 
> > 
> > The fix here is that getDevicesVisibility should refresh the storage before
> > checking for existence of the LUN (can do it in error flow only to avoid
> > redundant refreshes).
> 
> I don't know what it means to "refresh the storage" but probably any
> operation here will solve the issue since the devices probably will already
> be mapped by this small delay.

Comment 32 Yeela Kaplan 2012-09-27 17:30:57 UTC
http://gerrit.ovirt.org/#/c/8256/

Comment 34 Leonid Natapov 2012-10-14 15:03:45 UTC
Verified on si20 with two hosts in the DC. Storage domain successfully extended.

Comment 38 errata-xmlrpc 2012-12-04 18:51:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1508.html