Created attachment 480662 [details]
vdsm log

Description of problem:
topology:
- 2 storage domains
- on 2 nfs servers
when disconnecting one of the 2 storage domains, after vdsm is restarted -> rhevm is sending connectStorageServer with the 2 servers connection string together, this causes a timeout on rhevm, and fails the action.

Version-Release number of selected component (if applicable):
vdsm-4.9-49.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. create 2 SDs on 2 different storage servers
2. disconnect the master domain.
3.

Actual results:

Expected results:

Additional info:
Thread-27::DEBUG::2011-02-23 16:03:15,075::clientIF::229::Storage.Dispatcher.Protect::(wrapper) [10.35.104.7]
Thread-27::INFO::2011-02-23 16:03:15,075::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: connectStorageServer, args: (domType=1, spUUID=07bf7773-b187-4540-bbef-1f3f33d5ceea, conList=[{'connection': 'orion.qa.lab.tlv.redhat.com:/export/mgoldboi/data1', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '6e9e7a32-2565-4662-ab40-08a3a1c73b3d', 'port': ''}, {'connection': 'qanashead.qa.lab.tlv.redhat.com:/export/mgoldboi/data1', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '3100cb1a-5ae9-49ee-8450-ea268e24531f', 'port': ''}])
Thread-27::DEBUG::2011-02-23 16:03:15,076::task::491::TaskManager.Task::(_debug) Task 3085df0c-08f2-459b-bc07-18ec7c811c51: moving from state init -> state preparing
Thread-27::INFO::2011-02-23 16:03:15,076::storage_connection::83::Storage.ServerConnection::(connect) Request to connect NFS storage server
Thread-27::INFO::2011-02-23 16:03:15,076::storage_connection::41::Storage.ServerConnection::(__validateConnectionParams) conList=[{'connection': 'orion.qa.lab.tlv.redhat.com:/export/mgoldboi/data1', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '6e9e7a32-2565-4662-ab40-08a3a1c73b3d', 'port': ''}, {'connection': 'qanashead.qa.lab.tlv.redhat.com:/export/mgoldboi/data1', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '3100cb1a-5ae9-49ee-8450-ea268e24531f', 'port': ''}]
Thread-27::DEBUG::2011-02-23 16:09:15,081::fileUtils::109::Storage.Misc.excCmd::(umount) '/usr/bin/sudo -n /bin/umount -f /rhev/data-center/mnt/qanashead.qa.lab.tlv.redhat.com:_export_mgoldboi_data1' (cwd None)
Thread-27::DEBUG::2011-02-23 16:09:41,139::fileUtils::109::Storage.Misc.excCmd::(umount) FAILED: <err> = 'umount2: Device or resource busy\numount.nfs: /rhev/data-center/mnt/qanashead.qa.lab.tlv.redhat.com:_export_mgoldboi_data1: device is busy\numount2: Device or resource busy\numount.nfs: /rhev/data-center/mnt/qanashead.qa.lab.tlv.redhat.com:_export_mgoldboi_data1: device is busy\n'; <rc> = 16
Thread-27::ERROR::2011-02-23 16:12:41,140::storage_connection::169::Storage.ServerConnection::(__connectFileServer) Error during storage connection: [Errno 17] File exists: '/rhev/data-center/mnt/qanashead.qa.lab.tlv.redhat.com:_export_mgoldboi_data1'
Thread-27::DEBUG::2011-02-23 16:12:41,162::task::491::TaskManager.Task::(_debug) Task 3085df0c-08f2-459b-bc07-18ec7c811c51: finished: {'statuslist': [{'status': 0, 'id': '6e9e7a32-2565-4662-ab40-08a3a1c73b3d'}, {'status': 451, 'id': '3100cb1a-5ae9-49ee-8450-ea268e24531f'}]}
Thread-27::DEBUG::2011-02-23 16:12:41,163::task::491::TaskManager.Task::(_debug) Task 3085df0c-08f2-459b-bc07-18ec7c811c51: moving from state preparing -> state finished
Thread-27::DEBUG::2011-02-23 16:12:41,163::resourceManager::786::irs::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-27::DEBUG::2011-02-23 16:12:41,163::resourceManager::821::irs::(cancelAll) Owner.cancelAll requests {}
Thread-27::DEBUG::2011-02-23 16:12:41,164::task::491::TaskManager.Task::(_debug) Task 3085df0c-08f2-459b-bc07-18ec7c811c51: ref 0 aborting False
Thread-27::INFO::2011-02-23 16:12:41,164::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: connectStorageServer, Return response: {'status': {'message': 'OK', 'code': 0}, 'statuslist': [{'status': 0, 'id': '6e9e7a32-2565-4662-ab40-08a3a1c73b3d'}, {'status': 451, 'id': '3100cb1a-5ae9-49ee-8450-ea268e24531f'}]}
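Note that the return response in the log above reports an overall status of 'OK' (code 0) even though one entry in 'statuslist' carries a non-zero status, so a caller probably has to inspect each entry rather than rely on the top-level code. A minimal sketch of such a check follows; the helper name `failed_connections` is hypothetical and is not part of vdsm or rhevm, and the dictionary is simply the response copied from the log.

```python
def failed_connections(response):
    """Return (id, status) pairs for every statuslist entry with a non-zero status.

    Hypothetical helper for illustration only; it is not a vdsm or rhevm API.
    """
    return [
        (entry["id"], entry["status"])
        for entry in response.get("statuslist", [])
        if entry["status"] != 0
    ]


# Return response of connectStorageServer, copied from the log above.
response = {
    "status": {"message": "OK", "code": 0},
    "statuslist": [
        {"status": 0, "id": "6e9e7a32-2565-4662-ab40-08a3a1c73b3d"},
        {"status": 451, "id": "3100cb1a-5ae9-49ee-8450-ea268e24531f"},
    ],
}

failures = failed_connections(response)
for conn_id, status in failures:
    print("connection %s failed with status %d" % (conn_id, status))
```

Matching the ids against conList in the request shows that the failing entry is the qanashead connection, while the orion connection succeeded.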
Moran, why did you mark this as a regression? Is this in 2.3 and not in 2.2?
In 2.2 vdsm got stuck in this scenario (GIL); the behaviour was actually worse, but it's still buggy now. Anyhow, removing regression.
http://gerrit.usersys.redhat.com/#change,596
Checked on 4.9-81.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2011-1782.html