Bug 882837 - PRD32 - engine - if connect storage pool fails on version mismatch, do reconstruct master
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.2.0
Assigned To: mkublin
QA Contact: Leonid Natapov
Whiteboard: infra
Keywords: FutureFeature
Depends On:
Blocks: 869309 915537
Reported: 2012-12-03 03:28 EST by Barak
Modified: 2016-02-10 14:16 EST (History)

See Also:
Fixed In Version: sf3
Doc Type: Enhancement
Doc Text:
Previously, when hosts could not connect to the storage pool, the engine triggered the reconstruct master to increase the version number of the master domain, so the master domain can be used to synchronize between hosts and storage. However, the master domain version increase was not reflected on the host side, so the domain mismatch prevented hosts from connecting to the storage pool. Now, when the reconstruct is performed, the master domain version is increased on both the host and storage sides. When the reconstruct is successful, the hosts will connect to storage and return to an 'Up' state.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-10 17:25:40 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:


Attachments: None
Description Barak 2012-12-03 03:28:11 EST
This is part of the solution to a series of issues related to "last host in up", and is a result of the discussions about Bug 869309.


In order for the above to happen correctly, we should:
- Serialize all calls to reconstruct master per pool.
- Let the first call to reconstruct run to completion; it will increment the version on failure. All other calls in the queue (reconstruct to the same pool) should fail immediately (no call to vdsm), hence no version increment is required for them.
- Then, when a connect storage pool fails on version mismatch (initVdsOnUp), we can safely send reconstruct master.
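The proposal above could be sketched roughly as follows. This is only an illustration in Python, not the engine code: ReconstructGuard, VersionMismatch, init_host_on_up and the connect/reconstruct callables are hypothetical names standing in for the real engine and vdsm calls. It assumes that "fail immediately" means a second reconstruct for the same pool is rejected without reaching vdsm.

```python
import threading


class VersionMismatch(Exception):
    """Hypothetical error: host and storage disagree on the master version."""


class ReconstructGuard:
    """Allows at most one reconstruct master per storage pool at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()  # pool ids with a reconstruct in progress

    def try_reconstruct(self, pool_id, reconstruct):
        # Reject immediately if a reconstruct is already running for this pool.
        with self._lock:
            if pool_id in self._running:
                return False
            self._running.add(pool_id)
        try:
            # Only the accepted call reaches the (hypothetical) vdsm request.
            reconstruct(pool_id)
            return True
        finally:
            with self._lock:
                self._running.discard(pool_id)


def init_host_on_up(pool_id, connect, reconstruct, guard):
    """Connect to the pool; on version mismatch, request a reconstruct."""
    try:
        connect(pool_id)
        return "connected"
    except VersionMismatch:
        started = guard.try_reconstruct(pool_id, reconstruct)
        return "reconstructed" if started else "rejected"
```

In this sketch the internal lock is held only while checking and updating the set of running pools, so reconstructs for different pools do not block each other.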
Comment 1 Barak 2012-12-03 03:29:12 EST
a thought - should we serialize all calls to SPM election as well ?
Comment 2 mkublin 2012-12-12 10:42:29 EST
http://gerrit.ovirt.org/#/c/9838/

This patch introduces a queue for all events; if a reconstruct is running,
all others will be rejected.
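One possible reading of this comment, sketched in Python rather than taken from the patch: events for a pool are processed one at a time from a queue, and a reconstruct event is rejected while another reconstruct is already queued or running. PoolEventQueue and the event kind strings are hypothetical names used only for illustration.

```python
import queue
import threading


class PoolEventQueue:
    """Runs events for one storage pool sequentially on a worker thread."""

    def __init__(self):
        self._events = queue.Queue()
        self._lock = threading.Lock()
        self._reconstruct_pending = False
        self.errors = []  # exceptions raised by events, kept for inspection
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, kind, action):
        """Queue an event; return False if it is a duplicate reconstruct."""
        with self._lock:
            if kind == "reconstruct":
                if self._reconstruct_pending:
                    return False
                self._reconstruct_pending = True
        self._events.put((kind, action))
        return True

    def _run(self):
        while True:
            kind, action = self._events.get()
            try:
                action()
            except Exception as exc:
                self.errors.append(exc)  # keep the worker alive
            finally:
                if kind == "reconstruct":
                    with self._lock:
                        self._reconstruct_pending = False
                self._events.task_done()

    def join(self):
        """Block until every queued event has been processed."""
        self._events.join()
```

Here non-reconstruct events are still accepted and simply wait their turn; whether the real patch rejects those as well is not stated in the comment.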
Comment 3 mkublin 2012-12-16 05:43:20 EST
http://gerrit.ovirt.org/#/c/10103/
Comment 4 Ayal Baron 2012-12-17 01:44:25 EST
(In reply to comment #1)
> a thought - should we serialize all calls to SPM election as well ?

No question about it, there should only be 1 call/thread for spm election and a new call should not be sent before the previous one finished.

Same goes for connectStoragePool, refreshStoragePool, getSpmID, etc.
Comment 5 Itamar Heim 2012-12-17 03:28:08 EST
(In reply to comment #4)
> (In reply to comment #1)
> > a thought - should we serialize all calls to SPM election as well ?
> 
> No question about it, there should only be 1 call/thread for spm election

1 call/thread per storage pool i assume
Comment 6 Ayal Baron 2012-12-19 16:43:46 EST
(In reply to comment #5)
> (In reply to comment #4)
> > (In reply to comment #1)
> > > a thought - should we serialize all calls to SPM election as well ?
> > 
> > No question about it, there should only be 1 call/thread for spm election
> 
> 1 call/thread per storage pool i assume

correct
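The "one call per storage pool" rule agreed on here could be sketched as a per-pool lock, so that a new SPM election (or connectStoragePool, refreshStoragePool, getSpmID) for a pool waits until the previous call for that pool has finished. This is a minimal Python illustration under that assumption; PerPoolSerializer and the operation callable are hypothetical and do not come from the engine.

```python
import threading
from collections import defaultdict


class PerPoolSerializer:
    """Serializes operations per storage pool; different pools run in parallel."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table
        self._locks = defaultdict(threading.Lock)   # one lock per pool id

    def _lock_for(self, pool_id):
        with self._guard:
            return self._locks[pool_id]

    def run(self, pool_id, operation):
        # Blocks until any earlier operation on the same pool has finished.
        with self._lock_for(pool_id):
            return operation()
```

Unlike the reconstruct case, callers here wait instead of being rejected, which matches "a new call should not be sent before the previous one finished".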
Comment 8 Leonid Natapov 2013-03-14 10:32:19 EDT
sf10.
Comment 9 Cheryn Tan 2013-04-03 02:52:41 EDT
This bug is currently attached to errata RHEA-2013:14491. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.

* Consequence: What happens when the bug presents.

* Fix: What was done to fix the bug.

* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.
Comment 10 errata-xmlrpc 2013-06-10 17:25:40 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0888.html
