Bug 1019803

Summary: [VDSM] ConnectStoragePool fails with: StoragePoolWrongMaster: Wrong Master domain or its version
Product: Red Hat Enterprise Virtualization Manager Reporter: Raz Tamir <ratamir>
Component: vdsm Assignee: Allon Mureinik <amureini>
Status: CLOSED WONTFIX QA Contact: Aharon Canan <acanan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.3.0 CC: abaron, acanan, amureini, bazulay, hateya, iheim, laravot, lpeer, ukar, yeylon
Target Milestone: --- Keywords: Triaged
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-09 20:29:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
vdsm log none
engine log none

Description Raz Tamir 2013-10-16 12:46:02 UTC
Created attachment 812902 [details]
vdsm log

Description of problem:

When trying to remove a DC after moving its attached storage domain to maintenance, no error message is displayed, but the events log (web portal) contains an entry that says "Failed to remove Data Center". As a result, you cannot remove the DC or activate the attached domain.
Also, the DC's status is Non Responsive.

from the vdsm.log:
StoragePoolWrongMaster: Wrong Master domain or its version


Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.24.master.el6ev.noarch

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Raz Tamir 2013-10-16 12:47:23 UTC
Created attachment 812903 [details]
engine log

Comment 2 Maor 2013-12-25 14:04:35 UTC
From the log it seems that DeactivateStorageDomain was called several times.
DeactivateStorageDomain increases the master version and later sends it to VDSM.
A recent fix in that area of the code, commit 87902a0a08a4ceffcb0bdb9d45ec4a01f8b86a7e, might resolve this issue, but since this bug does not reproduce consistently and there are no reproduction steps, I can't be sure.

Liron, what do you think? Could your patch fix this?
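
For illustration only, here is a minimal Python sketch of how a version bump that never reaches the host could end in this error. This is not engine or VDSM code: the Host and Engine classes, the push_succeeds flag, and the failed-push scenario are all assumptions made for the example. The only things taken from this bug are the error text and the statement that DeactivateStorageDomain increases the master version and later sends it to VDSM.

```python
# Hypothetical model, NOT engine or VDSM code. All names are illustrative.


class StoragePoolWrongMaster(Exception):
    """Stand-in for the error reported in vdsm.log."""


class Host:
    """Toy VDSM side: holds the master version it currently knows about."""

    def __init__(self, master_version):
        self.master_version = master_version

    def connect_storage_pool(self, expected_version):
        # Refuse the connection when the caller's version differs from ours.
        if expected_version != self.master_version:
            raise StoragePoolWrongMaster(
                "Wrong Master domain or its version: "
                f"expected {expected_version}, found {self.master_version}"
            )
        return True


class Engine:
    """Toy manager side: keeps its own copy of the master version."""

    def __init__(self, master_version):
        self.master_version = master_version

    def deactivate_storage_domain(self, host, push_succeeds=True):
        # Bump the version first, then (maybe) send it to the host.
        # push_succeeds=False is an assumed failure mode for the sketch.
        self.master_version += 1
        if push_succeeds:
            host.master_version = self.master_version


host = Host(master_version=1)
engine = Engine(master_version=1)

engine.deactivate_storage_domain(host)                       # both sides agree
engine.deactivate_storage_domain(host, push_succeeds=False)  # engine is now ahead

try:
    host.connect_storage_pool(engine.master_version)
except StoragePoolWrongMaster as exc:
    print(exc)
```

In this toy model every later connect attempt keeps failing until the two version numbers are brought back in line, which would be consistent with the DC being stuck. Whether that is what actually happened here cannot be confirmed without reproduction steps.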

Comment 3 Ayal Baron 2013-12-26 09:39:21 UTC
Aharon, reproduction steps?

Comment 4 Raz Tamir 2013-12-29 13:44:50 UTC
No specific steps were taken that caused this bug.
I also tried to debug the issue to figure out what causes the problem, but I didn't see anything unusual.