Bug 785779

Summary: vdsm: cannot add SD from several luns when there is more than one host in cluster
Product: [Retired] oVirt
Reporter: Dafna Ron <dron>
Component: vdsm
Assignee: Dan Kenigsberg <danken>
Status: CLOSED DUPLICATE
QA Contact: yeylon <yeylon>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: unspecified
CC: abaron, acathrow, amureini, bazulay, iheim, mgoldboi, rvaknin, srevivo, ykaul
Target Milestone: ---
Keywords: Regression
Target Release: 3.3.4
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 766281
Environment:
Last Closed: 2012-05-09 10:11:02 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions: ---
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 766281
Bug Blocks:

Description Dafna Ron 2012-01-30 15:46:40 UTC
+++ This bug was initially created as a clone of Bug #766281 +++

Logs for oVirt can be found in bug https://bugzilla.redhat.com/show_bug.cgi?id=769201

==============================================================================

Created attachment 545242 [details]
vdsm and backend logs

Versions:
Red Hat Enterprise Linux Server release 6.2 (Santiago)
vdsm-4.9-112.1.el6.x86_64

Scenario (reproduction steps):
1. Add a LUN to an existing storage server in a Data Center with at least 2 hosts.
2. The backend sends connectStorageServer to the HSM, which returns successfully, then immediately sends getDevicesVisibility, which returns False because multipath did not have time to map the new LUN; the backend therefore sends disconnectStorageServer.
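The race above can be sketched as a missing wait between connect and the visibility check. A minimal, hypothetical illustration in Python (the `get_devices_visibility` callable stands in for the real VDSM verb and is an assumption, not the actual engine code): instead of disconnecting the moment visibility reports False, the caller could poll until multipath has mapped the LUN or a timeout expires.

```python
import time

def wait_for_devices(get_devices_visibility, guids, timeout=30.0, interval=1.0):
    """Poll device visibility until every GUID is visible or the timeout expires.

    get_devices_visibility(guids) is assumed to return a dict like
    {'1rvaknin_edu2': False}, mirroring the getDevicesVisibility response
    seen in the log above.
    """
    deadline = time.monotonic() + timeout
    while True:
        visible = get_devices_visibility(guids)
        if all(visible.get(g, False) for g in guids):
            return True  # multipath has mapped all requested LUNs
        if time.monotonic() >= deadline:
            return False  # only now give up (and let the caller disconnect)
        time.sleep(interval)  # give multipath time to map the new LUN
```

With something like this between ConnectStorageServerVDSCommand and GetDevicesVisibilityVDSCommand, a transiently invisible device would not immediately trigger DisconnectStorageServerVDSCommand and the transaction rollback shown in the backend log.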

From vdsm log:
Thread-311::INFO::2011-12-11 13:57:04,489::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: connectStorageServer, args: (domType=3, spUUID=4996348b-0199-435a-b01d-94cdf67d2d83, conList=[{'connection': '10.35.64.25', 'iqn': 'rvaknin_edu2', 'portal': '1', 'user': '', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000', 'port': '3260'}])
Thread-311::DEBUG::2011-12-11 13:57:04,490::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: moving from state init -> state preparing
Thread-311::INFO::2011-12-11 13:57:04,490::storage_connection::95::Storage.ServerConnection::(connect) Request to connect ISCSI storage server
Thread-311::INFO::2011-12-11 13:57:04,491::storage_connection::53::Storage.ServerConnection::(__validateConnectionParams) conList=[{'connection': '10.35.64.25', 'iqn': 'rvaknin_edu2', 'portal': '1', 'user': '', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000', 'port': '3260'}]
Thread-311::DEBUG::2011-12-11 13:57:04,491::iscsi::373::Storage.Misc.excCmd::(addiSCSIPortal) '/usr/bin/sudo -n /sbin/iscsiadm -m discoverydb -t sendtargets -p 10.35.64.25:3260 --discover' (cwd None)
Thread-311::DEBUG::2011-12-11 13:57:04,596::iscsi::373::Storage.Misc.excCmd::(addiSCSIPortal) SUCCESS: <err> = ''; <rc> = 0
Thread-311::DEBUG::2011-12-11 13:57:04,597::iscsi::471::Storage.Misc.excCmd::(addiSCSINode) '/usr/bin/sudo -n /sbin/iscsiadm -m node -T rvaknin_edu2 -l -p 10.35.64.25:3260' (cwd None)
Thread-311::DEBUG::2011-12-11 13:57:05,149::iscsi::471::Storage.Misc.excCmd::(addiSCSINode) SUCCESS: <err> = ''; <rc> = 0
Thread-311::DEBUG::2011-12-11 13:57:05,151::lvm::547::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,151::lvm::549::OperationMutex::(_invalidateAllPvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,152::lvm::559::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,152::lvm::561::OperationMutex::(_invalidateAllVgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,153::lvm::580::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' got the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,153::lvm::582::OperationMutex::(_invalidateAllLvs) Operation 'lvm invalidate operation' released the operation mutex
Thread-311::DEBUG::2011-12-11 13:57:05,153::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: finished: {'statuslist': [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
Thread-311::DEBUG::2011-12-11 13:57:05,154::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: moving from state preparing -> state finished
Thread-311::DEBUG::2011-12-11 13:57:05,154::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-311::DEBUG::2011-12-11 13:57:05,155::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-311::DEBUG::2011-12-11 13:57:05,155::task::495::TaskManager.Task::(_debug) Task 47f42839-13eb-4022-b6a0-62bc59e9a42c: ref 0 aborting False
Thread-311::INFO::2011-12-11 13:57:05,156::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: connectStorageServer, Return response: {'status': {'message': 'OK', 'code': 0}, 'statuslist': [{'status': 0, 'id': '00000000-0000-0000-0000-000000000000'}]}
Thread-313::DEBUG::2011-12-11 13:57:05,166::clientIF::226::Storage.Dispatcher.Protect::(wrapper) [10.35.97.39]
Thread-313::INFO::2011-12-11 13:57:05,167::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: getDevicesVisibility, args: ( guids=['1rvaknin_edu2'])
Thread-313::DEBUG::2011-12-11 13:57:05,168::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: moving from state init -> state preparing
Thread-313::DEBUG::2011-12-11 13:57:05,168::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: finished: {'visible': {'1rvaknin_edu2': False}}
Thread-313::DEBUG::2011-12-11 13:57:05,169::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: moving from state preparing -> state finished
Thread-313::DEBUG::2011-12-11 13:57:05,169::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-313::DEBUG::2011-12-11 13:57:05,170::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-313::DEBUG::2011-12-11 13:57:05,170::task::495::TaskManager.Task::(_debug) Task 6b296b5e-2943-4ab9-8923-0228838a564e: ref 0 aborting False
Thread-313::INFO::2011-12-11 13:57:05,171::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: getDevicesVisibility, Return response: {'status': {'message': 'OK', 'code': 0}, 'visible': {'1rvaknin_edu2': False}}
Thread-314::DEBUG::2011-12-11 13:57:05,183::clientIF::226::Storage.Dispatcher.Protect::(wrapper) [10.35.97.39]
Thread-314::INFO::2011-12-11 13:57:05,184::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: disconnectStorageServer, args: (domType=3, spUUID=4996348b-0199-435a-b01d-94cdf67d2d83, conList=[{'connection': '10.35.64.25', 'iqn': 'rvaknin_edu2', 'portal': '1', 'user': '', 'password': '******', 'id': '00000000-0000-0000-0000-000000000000', 'port': '3260'}])


From the backend's log:
2011-12-11 13:57:04,483 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) START, ConnectStorageServerVDSCommand(vdsId = 1690f03a-21b6-11e1-83e4-83b4cb727607, storagePoolId = 4996348b-0199-435a-b01d-94cdf67d2d83, storageType = ISCSI, connectionList = [{ id: null, connection: 10.35.64.25 };]), log id: 166f1f94
2011-12-11 13:57:05,157 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) FINISH, ConnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=0}, log id: 166f1f94
2011-12-11 13:57:05,161 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (http-0.0.0.0-8080-2) START, GetDevicesVisibilityVDSCommand(vdsId = 1690f03a-21b6-11e1-83e4-83b4cb727607, devicesIds=[1rvaknin_edu2]), log id: 21372d40
2011-12-11 13:57:05,172 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDevicesVisibilityVDSCommand] (http-0.0.0.0-8080-2) FINISH, GetDevicesVisibilityVDSCommand, return: {1rvaknin_edu2=false}, log id: 21372d40
2011-12-11 13:57:05,178 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) START, DisconnectStorageServerVDSCommand(vdsId = 1690f03a-21b6-11e1-83e4-83b4cb727607, storagePoolId = 4996348b-0199-435a-b01d-94cdf67d2d83, storageType = ISCSI, connectionList = [{ id: null, connection: 10.35.64.25 };]), log id: 577b558
2011-12-11 13:57:07,269 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (http-0.0.0.0-8080-2) FINISH, DisconnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=0}, log id: 577b558
2011-12-11 13:57:07,275 ERROR [org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand] (http-0.0.0.0-8080-2) Transaction rolled-back for command: org.ovirt.engine.core.bll.storage.ConnectAllHostsToLunCommand.
2011-12-11 13:57:07,285 WARN  [org.ovirt.engine.core.bll.storage.ExtendSANStorageDomainCommand] (http-0.0.0.0-8080-2) CanDoAction of action ExtendSANStorageDomain failed. Reasons:VAR__TYPE__STORAGE__DOMAIN,VAR__ACTION__EXTEND,ERROR_CANNOT_EXTEND_CONNECTION_FAILED

--- Additional comment from pm-rhel on 2011-12-11 08:32:42 EST ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 1 Dafna Ron 2012-01-30 15:48:00 UTC
Logs for oVirt can be found in https://bugzilla.redhat.com/show_bug.cgi?id=782432

Comment 2 Dan Kenigsberg 2012-05-09 10:11:02 UTC
I do not see any benefit in tracking this both in RHEL and in oVirt.

*** This bug has been marked as a duplicate of bug 766281 ***