Bug 995501

Summary: [host-deploy] block concurrent installation for same host
Product: Red Hat Enterprise Virtualization Manager
Reporter: Michael Everette <meverett>
Component: ovirt-engine
Assignee: Alon Bar-Lev <alonbl>
Status: CLOSED ERRATA
QA Contact: Tareq Alayan <talayan>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: acathrow, adahms, alonbl, bazulay, cpelland, dcaroest, iheim, lpeer, pstehlik, Rhev-m-bugs, yeylon, yzaslavs
Target Milestone: ---
Keywords: ZStream
Target Release: 3.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version: is10
Doc Type: Bug Fix
Doc Text:
Previously, it was possible for different users to attempt the installation, upgrade or approval of a single virtualization host at the same time, or for a single user to fire such events multiple times by clicking a button more than once in quick succession. This resulted in the same action being performed concurrently on the virtualization host, causing conflicts in processing. Now, a host is locked while it is being installed, upgraded or approved, which prevents these actions from running concurrently on the same host.
Story Points: ---
Clone Of:
: 996854
Environment:
Last Closed: 2014-01-21 17:35:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 996854    
Attachments:
Description Flags
upgrade part from /var/log/ovirt-engine/engine.log none

Description Michael Everette 2013-08-09 14:34:47 UTC
Created attachment 784865 [details]
upgrade part from /var/log/ovirt-engine/engine.log

Description of problem:

While upgrading one of the hypervisors, the customer accidentally clicked the OK button twice. The upgrade started, but the log shows that all update actions were executed twice. The subsequent reboot failed with a kernel panic. I fixed this by booting the backup kernel and running the upgrade again.


How reproducible:

Reproduced on a test setup.

Steps to Reproduce:
1. Start upgrade process
2. Click OK button twice in a row

Actual results:

The reboot failed with a kernel panic.

Expected results:

Upgrade process should be protected against multiple instances

Comment 1 Alon Bar-Lev 2013-08-10 17:56:56 UTC
Probably this bug always existed... 

I wanted to avoid touching the complex scheme of commands during the host-deploy re-write, but it looks like we should attend to this too.

I have no clue how to prevent commands related to the same object from running in parallel. I copied something from other commands, but it is only a guess.

Yair, can you please guide me?

Thanks,

Comment 2 Yair Zaslavsky 2013-08-11 06:38:13 UTC
No problem.
There are several means of protection that should be considered here -
a. canDoAction
b. Using the in memory lock mechanism
c. Failure within the executeAction.
I need to understand the flow better; feel free to contact me so we can come up with a plan.

Comment 3 Alon Bar-Lev 2013-08-11 06:42:27 UTC
(In reply to Yair Zaslavsky from comment #2)
> No problem.
> There are several means of protection that should be considered here -
> a. canDoAction
> b. Using the in memory lock mechanism
> c. Failure within the executeAction.
> I need to understand the flow better; feel free to contact me so we
> can come up with a plan.

Hi,

Please review the change...

a. canDoAction cannot be used, as we have no atomic way to set the host status.

b. The in-memory lock would lock all installations, and we do need parallel installation on different hosts.

c. executeAction is probably the right one, but I have no idea how to use it properly.


What we need is for InstallVdsCommand to run once per host and fail otherwise.

Thanks!
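
For illustration only, the requirement stated above (run once per host, fail otherwise, without serializing different hosts) could be sketched as follows. This is not the actual ovirt-engine change. The class PerHostOperationGuard and its methods are hypothetical names, and the sketch assumes a single engine process, so that an in-memory set keyed by host ID is sufficient.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, NOT part of ovirt-engine: tracks which hosts
// currently have an install/upgrade/approve operation in flight.
final class PerHostOperationGuard {
    // Concurrent set of host IDs that are busy right now.
    private final Set<UUID> busyHosts = ConcurrentHashMap.newKeySet();

    // Returns true if the caller now owns the host, false if another
    // operation already holds it. Set.add is atomic, so two racing
    // callers for the same host cannot both get true.
    boolean tryAcquire(UUID hostId) {
        return busyHosts.add(hostId);
    }

    // Marks the host as free again.
    void release(UUID hostId) {
        busyHosts.remove(hostId);
    }

    // Runs the operation only if no other operation is in flight for
    // this host; otherwise fails fast and returns false without
    // running it. Operations on other hosts are never blocked.
    boolean runExclusively(UUID hostId, Runnable operation) {
        if (!tryAcquire(hostId)) {
            return false;
        }
        try {
            operation.run();
            return true;
        } finally {
            release(hostId);
        }
    }
}
```

The key is per host rather than global, which is what separates this from a single lock around all installations: a second request for the same host is rejected immediately, while requests for other hosts proceed.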

Comment 5 Alon Bar-Lev 2013-08-11 13:29:29 UTC
David,

A bug that was in MODIFIED status was moved to POST by automation.

Not sure I understand why.

Thanks,
Alon

Comment 6 Itamar Heim 2013-08-11 14:31:53 UTC
(In reply to Alon Bar-Lev from comment #5)
> David,
> 
> A bug that was in MODIFIED status was moved to POST by automation.
> 
> Not sure I understand why.
> 
> Thanks,
> Alon

Because a new patch linking to this bug was posted in Gerrit?
http://gerrit.ovirt.org/#/c/17928/

Comment 7 Alon Bar-Lev 2013-08-11 14:37:11 UTC
(In reply to Itamar Heim from comment #6)
> (In reply to Alon Bar-Lev from comment #5)
> > David,
> > 
> > A bug that was in MODIFIED status was moved to POST by automation.
> > 
> > Not sure I understand why.
> > 
> > Thanks,
> > Alon
> 
> Because a new patch linking to this bug was posted in Gerrit?
> http://gerrit.ovirt.org/#/c/17928/

That patch is for the stable upstream branch; it has nothing to do with this downstream product.

Once again we are mixing up upstream and downstream development: we abuse the downstream Bugzilla to manage future upstream development.

On top of that, we add automation, which cannot work because we manage the project incorrectly.

Comment 13 Alon Bar-Lev 2013-08-13 09:19:23 UTC
Again... too much automation.

Comment 15 Alon Bar-Lev 2013-08-14 10:53:29 UTC
Note to QA:

This change affects all of the following:
1. Node/Host installation.
2. Host re-installation.
3. Node approval.
4. Node upgrade.

The fix ensures that no parallel operations can run on the same host.

However, please also check that this fix does not affect parallelism across different hosts.
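
The two checks in this note can be modelled with a small stand-alone simulation. This is only a sketch of the expected behavior, not a test against a real engine. The class ConcurrentHostOperationCheck, the method countAccepted and the host names are hypothetical, and the in-memory set is a stand-in for whatever per-host lock the engine uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation, NOT engine code.
final class ConcurrentHostOperationCheck {
    // Stand-in for the per-host lock: host IDs with an operation in flight.
    private static final Set<String> busyHosts = ConcurrentHashMap.newKeySet();

    // Fires one simulated request per list entry, all concurrently, and
    // returns how many were accepted. An accepted request keeps its host
    // locked until every request has made its attempt, which mimics a
    // long-running install or upgrade.
    static int countAccepted(List<String> hostIds) {
        int n = hostIds.size();
        CountDownLatch allAttempted = new CountDownLatch(n);
        AtomicInteger accepted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(n);
        for (String hostId : hostIds) {
            pool.execute(() -> {
                boolean acquired = busyHosts.add(hostId); // atomic per host
                allAttempted.countDown();
                if (!acquired) {
                    return; // rejected: host already busy
                }
                accepted.incrementAndGet();
                try {
                    allAttempted.await(); // hold the host while others try
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    busyHosts.remove(hostId);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return accepted.get();
    }

    public static void main(String[] args) {
        // Double click on the same host: only one request is accepted.
        System.out.println(countAccepted(Arrays.asList("host-a", "host-a"))); // prints 1
        // Two different hosts: both requests run in parallel.
        System.out.println(countAccepted(Arrays.asList("host-a", "host-b"))); // prints 2
    }
}
```

In a real verification the same two expectations apply to each of the four flows listed above: a repeated request on a host that is already being processed is rejected, and requests on different hosts still run side by side.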

Comment 16 Charlie 2013-11-28 00:21:51 UTC
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 17 Tareq Alayan 2013-12-31 09:00:13 UTC
verified
ovirt-host-deploy-1.1.3-1.el6ev.noarch
rhevm-3.3.0-0.42.el6ev.noarch

Comment 18 errata-xmlrpc 2014-01-21 17:35:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0038.html