Bug 804389

Summary: spm start fails due to several processes racing each other
Product: Red Hat Enterprise Virtualization Manager
Reporter: Haim <hateya>
Component: vdsm
Assignee: Ayal Baron <abaron>
Status: CLOSED WORKSFORME
QA Contact:
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.1.0
CC: abaron, amureini, bazulay, hateya, iheim, lpeer, mgoldboi, yeylon, ykaul
Target Milestone: ---   
Target Release: 3.3.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-24 12:17:19 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
vdsm log none

Description Haim 2012-03-18 15:28:14 UTC
Description of problem:

description: 

The spmStart process fails because several spm protect processes race each other,
all trying to acquire the same lock over the storage domain.
Also, the master domain remains mounted on /rhev/data-center/ during that process.
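
The contention described above can be illustrated with a minimal sketch. It assumes a local lock file and fcntl.flock as a stand-in for the storage domain lock; it does not reproduce vdsm's own locking. The file name domain.lock and the helpers try_acquire, release, and demo are hypothetical.

```python
import fcntl
import os
import tempfile


def try_acquire(lock_path):
    """Try to take an exclusive, non-blocking lock on lock_path.

    Returns the open file descriptor on success, or None when another
    holder already owns the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock; give up instead of waiting.
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def demo():
    """Simulate contenders racing for one lock; report who succeeded."""
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "domain.lock")  # hypothetical lock file
        first = try_acquire(lock_path)
        second = try_acquire(lock_path)  # races with the first holder
        results = [first is not None, second is not None]
        if first is not None:
            release(first)
        third = try_acquire(lock_path)  # succeeds once the lock is free
        results.append(third is not None)
        if third is not None:
            release(third)
        return results
```

In this sketch only one contender can hold the lock at a time. A loser that keeps running and retrying, instead of exiting, is the kind of lingering process the report describes.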

setup: 1 RHEL host, connected to 2 storage domains over iSCSI

flow: activate host - spmStart flow

mitigation: kill all spm protect processes and mount master domain manually.
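
Before killing anything by hand, it can help to list the candidate processes first. The sketch below only reports PIDs and does not send any signal. The match string "spmprotect" is an assumption about how these processes appear in the process table, and the helper names are hypothetical.

```python
import os


def find_matching_pids(processes, pattern):
    """Return the PIDs whose command line contains pattern.

    processes is an iterable of (pid, cmdline) tuples.
    """
    return [pid for pid, cmdline in processes if pattern in cmdline]


def read_process_table(proc_root="/proc"):
    """Collect (pid, cmdline) tuples from a Linux /proc filesystem."""
    table = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited or is not readable
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        table.append((int(entry), cmdline))
    return table


if __name__ == "__main__":
    # "spmprotect" is an assumed match string; adjust it to the real name.
    for pid in find_matching_pids(read_process_table(), "spmprotect"):
        print(pid)
```

Reviewing the printed PIDs before acting avoids killing an unrelated process that happens to match the pattern.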

attached vdsm log.

Comment 1 Haim 2012-03-18 15:29:37 UTC
Created attachment 570906
vdsm log

Comment 4 Dan Kenigsberg 2012-03-19 08:44:02 UTC
Haim, do you have any clue why it suddenly happened on a system of yours? Does it ever reproduce?

Comment 5 Haim 2012-03-19 09:06:04 UTC
(In reply to comment #4)
> Haim, do you have any clue why it suddenly happened on a system of yours? Does
> it ever reproduce?

Danken, this issue happened in the QE production environment, on 2 different pools of 3-5 hosts each.
I had to manually kill the spm-protect processes on some of the hosts, which helped, and the hosts managed to acquire SPM.

I think vdsm is not the only one to blame here; I suspect the backend gets confused and sends spmStart in parallel.
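
If parallel spmStart requests are indeed the trigger, one possible defence is to reject a second request while the first is still in progress. The following is a minimal sketch of that idea using an in-process lock. The class SpmStartGuard and its methods are hypothetical and are not part of vdsm or the backend.

```python
import threading


class SpmStartGuard:
    """Allow only one start operation at a time; reject overlapping ones."""

    def __init__(self):
        self._lock = threading.Lock()

    def run(self, start_fn):
        # A non-blocking acquire means a concurrent caller is turned away
        # immediately instead of queuing up behind the running start.
        if not self._lock.acquire(blocking=False):
            return "rejected"
        try:
            start_fn()
            return "started"
        finally:
            self._lock.release()


def demo():
    guard = SpmStartGuard()
    outcomes = []

    def slow_start():
        # A second request arrives while the first is still in progress.
        outcomes.append(guard.run(lambda: None))

    outcomes.append(guard.run(slow_start))
    # Once the first start has finished, a new request is accepted again.
    outcomes.append(guard.run(lambda: None))
    return outcomes
```

Rejecting rather than queuing keeps duplicate requests from piling up. Whether a rejected caller should retry is a policy decision this sketch leaves open.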

Comment 6 RHEL Program Management 2012-05-04 04:08:33 UTC
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 8 RHEL Program Management 2012-07-10 08:55:17 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 9 RHEL Program Management 2012-07-11 01:55:25 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 10 RHEL Program Management 2012-12-14 08:54:14 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 11 Ayal Baron 2013-02-16 20:27:26 UTC
Haim, are you able to reproduce this?

Comment 12 Haim 2013-02-24 12:17:19 UTC
No. Closing as WORKSFORME.