Bug 787722

Summary: VDSM: deadlock on SpmStart due to changes in lock release
Product: [Retired] oVirt
Reporter: Dafna Ron <dron>
Component: vdsm
Assignee: Dan Kenigsberg <danken>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: urgent
Docs Contact:
Priority: urgent
Version: unspecified
CC: abaron, acathrow, bazulay, hateya, iheim, smizrahi, ykaul
Target Milestone: ---
Keywords: Regression
Target Release: 3.1
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version: v4.9.4-32-g5c2b4ae
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description Flags
logs none

Description Dafna Ron 2012-02-06 15:46:12 UTC
Description of problem:

We are getting a deadlock in SpmStart when an SD is manually activated while the SD is blocked from the host (with iptables).
Because the resource is not released, the host is stuck in contending and we are unable to acquire SPM.

Saggi checked, and the problem comes from the code change in commit 4d1dbcdd859be7a16ca0a276076b4a9f6ad5e7fa "Fix lock release and reversed SPM logic".

self.lock should be an RLock.
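
For illustration only, here is a minimal sketch of why a plain lock can deadlock where an RLock does not. It is not the vdsm code: the class Pool and the methods start_spm and release_resources are hypothetical names, and the assumption is that a cleanup path re-enters self.lock from the thread that already holds it. The non-blocking acquire is used only so the demo reports the problem instead of hanging.

```python
import threading


class Pool:
    """Hypothetical stand-in for an object that guards SPM state with self.lock."""

    def __init__(self, lock_factory):
        self.lock = lock_factory()
        self.released = False

    def release_resources(self):
        # Assumed cleanup path: it takes self.lock again while the caller
        # (start_spm) still holds it in the same thread.
        # blocking=False makes the demo return False where real code,
        # using a blocking acquire, would hang forever.
        if not self.lock.acquire(blocking=False):
            return False
        try:
            self.released = True
            return True
        finally:
            self.lock.release()

    def start_spm(self):
        with self.lock:
            # Simulate a failure during SPM start that triggers cleanup.
            return self.release_resources()


plain = Pool(threading.Lock)       # non-reentrant lock
reentrant = Pool(threading.RLock)  # reentrant lock

plain_ok = plain.start_spm()
reentrant_ok = reentrant.start_spm()

print("Lock  cleanup succeeded:", plain_ok)
print("RLock cleanup succeeded:", reentrant_ok)
```

With threading.Lock the second acquire from the same thread cannot succeed, so the resource is never released. With threading.RLock the owning thread may acquire the lock again, and the cleanup completes.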

Version-Release number of selected component (if applicable):

vdsm-4.9.3.2-0.fc16.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Create an SD from 2 storage servers
2. Block connectivity to 1 of the domains from the host using iptables
3. After the domain becomes inactive, try to manually activate it
4. Remove the iptables block from the host and try to activate the domain again
  
Actual results:

Deadlock in SpmStart; the host is stuck in contending and we cannot recover.

Expected results:

We should be able to recover.

Additional info: vdsm logs + engine log

Comment 1 Saggi Mizrahi 2012-02-06 15:47:53 UTC
http://gerrit.ovirt.org/#change,1683

Comment 2 Dafna Ron 2012-02-06 15:49:44 UTC
Created attachment 559666 [details]
logs

Comment 3 Itamar Heim 2012-08-09 08:03:32 UTC
closing ON_QA bugs as oVirt 3.1 was released:
http://www.ovirt.org/get-ovirt/

Comment 4 Itamar Heim 2012-08-09 08:04:12 UTC
closing ON_QA bugs as oVirt 3.1 was released:
http://www.ovirt.org/get-ovirt/