Bug 804389 - spm start fails due to several processes racing each other
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Ayal Baron
QA Contact:
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2012-03-18 15:28 UTC by Haim
Modified: 2016-02-10 16:30 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-24 12:17:19 UTC
oVirt Team: Storage
Target Upstream Version:


Attachments
vdsm log (1.30 MB, application/x-gzip)
2012-03-18 15:29 UTC, Haim

Description Haim 2012-03-18 15:28:14 UTC
Description of problem:

description: 

The spmStart process fails because several spm protect processes race each other,
all trying to acquire the same lock on the storage domain.
Also, the master domain remains mounted on /rhev/data-center/ during that process.

setup: 1 RHEL host, connected to 2 storage domains over iSCSI

flow: activate host - spmStart flow

mitigation: kill all spm protect processes and mount master domain manually.

attached vdsm log.
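
To illustrate the kind of race described above, here is a minimal sketch in which several contenders try to take the same exclusive lock at the same moment. It is an illustration only and not vdsm's actual mechanism: it assumes a local lock file and fcntl.flock, whereas the bug concerns a lock held on the storage domain. The names try_acquire, race, and lease.lock are hypothetical.

```python
import fcntl
import os
import tempfile
import threading


def try_acquire(lock_path):
    """Try to take an exclusive, non-blocking lock; return the fd or None."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)  # another contender already holds the lock
        return None
    return fd


def race(lock_path, contenders):
    """Start several contenders together and return the ones that won."""
    start = threading.Barrier(contenders)
    done = threading.Barrier(contenders)
    winners = []
    guard = threading.Lock()

    def contender(name):
        start.wait()  # release all contenders at the same time
        fd = try_acquire(lock_path)
        if fd is not None:
            with guard:
                winners.append(name)
        done.wait()  # keep the lock held until everyone has tried
        if fd is not None:
            fcntl.flock(fd, fcntl.LOCK_UN)
            os.close(fd)

    threads = [
        threading.Thread(target=contender, args=(i,))
        for i in range(contenders)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


if __name__ == "__main__":
    # Hypothetical lock file in a scratch directory.
    path = os.path.join(tempfile.mkdtemp(), "lease.lock")
    print("winners:", race(path, 5))
```

Only one contender can hold the lock at a time, so the others must either back off or retry. If the losers keep retrying without coordination, the start operation as a whole can fail in the way reported here.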

Comment 1 Haim 2012-03-18 15:29:37 UTC
Created attachment 570906
vdsm log

Comment 4 Dan Kenigsberg 2012-03-19 08:44:02 UTC
Haim, do you have any clue why it suddenly happened on a system of yours? Does it ever reproduce?

Comment 5 Haim 2012-03-19 09:06:04 UTC
(In reply to comment #4)
> Haim, do you have any clue why it suddenly happened on a system of yours? Does
> it ever reproduce?

Danken, this issue happened in the QE production environment, on 2 different pools of 3-5 hosts each.
I had to manually kill the spm-protect processes on some of the hosts, which helped: the host then managed to acquire SPM.

I don't think vdsm is the only one to blame here; I suspect the backend gets confused and sends spmStart in parallel.
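
If the suspicion about parallel spmStart requests is right, one possible defense is to reject a second start request while the first is still running. The sketch below shows that idea under stated assumptions: SingleFlightStart and StartInProgress are hypothetical names, the guard is in-process only, and nothing here reflects how vdsm or the backend actually handle spmStart.

```python
import threading


class StartInProgress(Exception):
    """Raised when a start request arrives while another is still running."""


class SingleFlightStart:
    """Allow only one invocation of a start function at a time."""

    def __init__(self, start_fn):
        self._start_fn = start_fn
        self._busy = threading.Lock()

    def start(self, *args, **kwargs):
        # Non-blocking acquire: a concurrent caller fails fast instead of
        # queueing up behind the request that is already running.
        if not self._busy.acquire(blocking=False):
            raise StartInProgress("a start request is already running")
        try:
            return self._start_fn(*args, **kwargs)
        finally:
            self._busy.release()
```

Failing fast makes a duplicate request visible to the caller, which is easier to diagnose than two start flows silently competing for the same lock.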

Comment 6 RHEL Program Management 2012-05-04 04:08:33 UTC
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 8 RHEL Program Management 2012-07-10 08:55:17 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 9 RHEL Program Management 2012-07-11 01:55:25 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 10 RHEL Program Management 2012-12-14 08:54:14 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 11 Ayal Baron 2013-02-16 20:27:26 UTC
Haim, are you able to reproduce this?

Comment 12 Haim 2013-02-24 12:17:19 UTC
No. Closing as WORKSFORME.
