Bug 804389 - spm start fails due to several processes racing each other
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.1.0
Hardware: x86_64 Linux
Priority: unspecified, Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Ayal Baron
Whiteboard: storage
Reported: 2012-03-18 11:28 EDT by Haim
Modified: 2016-02-10 11:30 EST

Doc Type: Bug Fix
Last Closed: 2013-02-24 07:17:19 EST
oVirt Team: Storage


Attachments
vdsm log (1.30 MB, application/x-gzip)
2012-03-18 11:29 EDT, Haim

Description Haim 2012-03-18 11:28:14 EDT
Description of problem:

The spmStart process fails because several spm protect processes race each other,
all trying to acquire the same lock over the storage domain.
Also, the master domain remains mounted on /rhev/data-center/ during that process.

setup: 1 RHEL host, connected to 2 storage domains over iSCSI

flow: activate host - spmStart flow

mitigation: kill all spm protect processes and mount master domain manually.

attached vdsm log.
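
For readers unfamiliar with this kind of failure, the contention described above can be illustrated with a minimal sketch. It is not vdsm code: `race_for_lock`, `domain_lock` and the thread-based contenders are hypothetical stand-ins for the spm protect processes and the storage domain lock, used only to show that when several contenders try to take one exclusive lock at the same moment, a single one succeeds and the rest fail.

```python
import threading


def race_for_lock(num_contenders=5):
    """Let several threads try to take one exclusive lock at the same time.

    Returns the ids of the contenders that acquired the lock.
    All names here are illustrative assumptions, not vdsm identifiers.
    """
    domain_lock = threading.Lock()              # stands in for the storage domain lock
    start = threading.Barrier(num_contenders)   # release all contenders together
    done = threading.Barrier(num_contenders)    # wait until everyone has tried
    winners = []
    winners_guard = threading.Lock()

    def contender(ident):
        start.wait()
        # Non-blocking attempt: fail immediately if someone else holds the lock.
        acquired = domain_lock.acquire(blocking=False)
        if acquired:
            with winners_guard:
                winners.append(ident)
        # The winner keeps the lock until every contender has made its attempt.
        done.wait()
        if acquired:
            domain_lock.release()

    threads = [threading.Thread(target=contender, args=(i,))
               for i in range(num_contenders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


if __name__ == "__main__":
    print("contenders that got the lock:", race_for_lock(5))
```

Which contender wins is not deterministic; only the number of winners is. The sketch says nothing about why the losing processes in this bug stayed alive instead of exiting.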
Comment 1 Haim 2012-03-18 11:29:37 EDT
Created attachment 570906
vdsm log
Comment 4 Dan Kenigsberg 2012-03-19 04:44:02 EDT
Haim, do you have any clue why it suddenly happened on a system of yours? Does it ever reproduce?
Comment 5 Haim 2012-03-19 05:06:04 EDT
(In reply to comment #4)
> Haim, do you have any clue why it suddenly happened on a system of yours? Does
> it ever reproduce?

Danken, this issue happened in the QE production environment, on 2 different pools of 3-5 hosts each.
I had to manually kill the spm-protect processes on some of the hosts, which helped: the hosts then managed to acquire SPM.

I don't think vdsm is the only one to blame here. I suspect the backend gets confused and sends spmStart in parallel.
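
If the suspicion about parallel spmStart requests is right, one generic way to make the receiving side tolerate them is to allow only one start operation in flight and reject the rest. The sketch below is an assumption for illustration only: `StartGuard`, `try_begin` and `finish` are hypothetical names, not part of vdsm or the backend, and nothing in this bug confirms that such a guard was missing or later added.

```python
import threading


class StartGuard:
    """Hypothetical guard that lets only one start operation run at a time."""

    def __init__(self):
        self._lock = threading.Lock()   # protects the flag below
        self._in_progress = False

    def try_begin(self):
        """Return True if the caller may start; False if a start is already running."""
        with self._lock:
            if self._in_progress:
                return False
            self._in_progress = True
            return True

    def finish(self):
        """Mark the running start as completed (successfully or not)."""
        with self._lock:
            self._in_progress = False


if __name__ == "__main__":
    guard = StartGuard()
    print(guard.try_begin())   # first request is accepted
    print(guard.try_begin())   # a parallel request is rejected
    guard.finish()
    print(guard.try_begin())   # accepted again once the first one has finished
```

Rejecting the duplicate request, rather than queueing it, keeps a second start from spawning another contender for the same lock.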
Comment 6 RHEL Product and Program Management 2012-05-04 00:08:33 EDT
Since RHEL 6.3 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.
Comment 8 RHEL Product and Program Management 2012-07-10 04:55:17 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 9 RHEL Product and Program Management 2012-07-10 21:55:25 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 10 RHEL Product and Program Management 2012-12-14 03:54:14 EST
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 11 Ayal Baron 2013-02-16 15:27:26 EST
Haim, are you able to reproduce this?
Comment 12 Haim 2013-02-24 07:17:19 EST
No. Closing as WORKSFORME.