Bug 809368 - [ovirt] [vdsm] deadlock on SPM stop command
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.4
Assignee: Dan Kenigsberg
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2012-04-03 08:19 UTC by Haim
Modified: 2016-02-10 16:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-03-12 09:36:44 UTC
oVirt Team: Storage
Embargoed:


Attachments
deadlock (478.08 KB, application/x-tar)
2012-04-03 08:22 UTC, Haim

Description Haim 2012-04-03 08:19:48 UTC
Description of problem:

I seem to hit a deadlock in vdsm after an spmStop task fails on a resource timeout.
After this failure, the host fails to handle connectStorageServer calls from the backend: connectStorageServer is sent and the iSCSI login is performed,
but the command seems to get stuck in _invalidateAllPvs, and vdsm doesn't return a valid response to the backend.

Thread-277736::DEBUG::2012-04-01 05:16:07,103::lvm::457::OperationMutex::(_invalidateAllPvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...
Thread-277738::DEBUG::2012-04-01 05:16:07,826::BindingXMLRPC::167::vds::(wrapper) [10.35.97.3

all problems started after failure in 'Thread-277609'.

in gdb, there are lots of threads waiting in cond.wait().
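
For readers unfamiliar with this kind of hang, here is a minimal Python sketch of how a mutex shared per operation type can leave other threads parked in cond.wait(). It is not vdsm's code: the OperationMutex class below, the "lvm invalidate operation" name, and the storage_unblocked event standing in for the blocked NFS server are all hypothetical. Only the 'lvm reload operation' name comes from the log above.

```python
import threading


class OperationMutex:
    """Hypothetical mutex: only one kind of operation may hold it at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = None  # name of the operation currently holding the mutex
        self._holders = 0    # number of threads holding it for that operation

    def acquire(self, operation, timeout=None):
        """Return True once held for `operation`, False if `timeout` expires."""
        with self._cond:
            free = self._cond.wait_for(
                lambda: self._active in (None, operation), timeout)
            if not free:
                return False
            self._active = operation
            self._holders += 1
            return True

    def release(self):
        with self._cond:
            self._holders -= 1
            if self._holders == 0:
                self._active = None
                self._cond.notify_all()


mutex = OperationMutex()
holding = threading.Event()            # set once the reload thread owns the mutex
storage_unblocked = threading.Event()  # stands in for the unreachable storage


def reload_operation():
    mutex.acquire("lvm reload operation")
    holding.set()
    try:
        storage_unblocked.wait()  # blocks for as long as storage is unreachable
    finally:
        mutex.release()


holder = threading.Thread(target=reload_operation, daemon=True)
holder.start()
holding.wait()

# A different operation cannot get in while the reload holder is stuck.
got_it = mutex.acquire("lvm invalidate operation", timeout=0.2)

# Once the holder finishes, the waiting operation can proceed.
storage_unblocked.set()
holder.join()
got_after = mutex.acquire("lvm invalidate operation", timeout=1)
mutex.release()
```

Without the timeout argument, the first acquire of the second operation would wait indefinitely, which resembles the symptom described here. Whether vdsm's actual mutex behaves this way is an assumption that the attached traces would have to confirm.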

Attached: a full gdb trace of all running threads, and the vdsm log.

Please note that this happened while the connection to the NFS server hosting 2 mount points was blocked.

Comment 1 Haim 2012-04-03 08:22:13 UTC
Created attachment 574784
deadlock

Comment 2 Itamar Heim 2013-03-12 09:36:44 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.

