Bug 995501 - [host-deploy] block concurrent installation for same host
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Alon Bar-Lev
QA Contact: Tareq Alayan
Whiteboard: infra
Keywords: ZStream
Depends On:
Blocks: 996854
Reported: 2013-08-09 10:34 EDT by Michael Everette
Modified: 2016-02-10 14:16 EST

See Also:
Fixed In Version: is10
Doc Type: Bug Fix
Doc Text:
Previously, it was possible for separate users to attempt the installation, upgrade, or approval of a single virtualization host at the same time, or for a single user to fire such events multiple times by clicking a button more than once in succession. This would result in the same action being performed concurrently on the virtualization host, causing conflicts in processing. Hosts are now locked during installation, upgrade, and approval, preventing these actions from being run concurrently on the same host.
Story Points: ---
Clone Of:
: 996854
Environment:
Last Closed: 2014-01-21 12:35:38 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
upgrade part from /var/log/ovirt-engine/engine.log (16.10 KB, text/plain)
2013-08-09 10:34 EDT, Michael Everette


External Trackers
Tracker                            ID      Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  448923  None      None    None     Never
oVirt gerrit                       17892   None      None    None     Never
oVirt gerrit                       17928   None      None    None     Never

Description Michael Everette 2013-08-09 10:34:47 EDT
Created attachment 784865
upgrade part from /var/log/ovirt-engine/engine.log

Description of problem:

While upgrading one of the hypervisors, the customer accidentally clicked the OK button twice. The upgrade started, but the log shows that all the update actions were executed twice. The subsequent reboot failed with a kernel panic. I fixed this by booting the backup kernel and running the upgrade again.


How reproducible:

Reproduced on test setup

Steps to Reproduce:
1. Start upgrade process
2. Click OK button twice in a row

Actual results:

All update actions are executed twice, and the subsequent reboot fails with a kernel panic.

Expected results:

The upgrade process should be protected against multiple concurrent instances running on the same host.
Comment 1 Alon Bar-Lev 2013-08-10 13:56:56 EDT
Probably this bug always existed... 

I wanted to avoid touching the complex scheme of commands during the host-deploy re-write, but it looks like we should attend to this too.

I have no clue how to avoid parallel runs of commands related to one object. I have copied something from other commands, but it is only a guess.

Yair, can you please guide me?

Thanks,
Comment 2 Yair Zaslavsky 2013-08-11 02:38:13 EDT
No problem.
There are several means of protection that should be considered here -
a. canDoAction
b. Using the in memory lock mechanism
c. Failure within the executeAction.
I need to understand exactly the flow more, feel free to contact me so we can come up with a plan.
Comment 3 Alon Bar-Lev 2013-08-11 02:42:27 EDT
(In reply to Yair Zaslavsky from comment #2)
> No problem.
> There are several means of protection that should be considered here -
> a. canDoAction
> b. Using the in memory lock mechanism
> c. Failure within the executeAction.
> I need to understand exactly the flow more, feel free to contact me so we
> can come up with a plan.

Hi,

Please review the change...

a. canDoAction cannot be used, as we have no atomic way to set the host status.

b. A single in-memory lock would lock all installations, and we do need parallel installation on different hosts.

c. executeAction is probably the right one, but I have no idea how to use it properly.


What we need is that InstallVdsCommand runs at most once per host at any given time, and fails otherwise.

Thanks!
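
The requirement stated in this comment (one installation per host at a time, with different hosts still free to proceed in parallel) can be sketched as a lock keyed by host ID that fails fast instead of waiting. This is a minimal illustration only: `HostOperationGuard` and its methods are hypothetical names, not the engine's actual locking API and not the change made in the linked gerrit patches.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of ovirt-engine.
final class HostOperationGuard {
    // IDs of hosts that currently have an install/upgrade/approve in flight.
    private final Set<UUID> busyHosts = ConcurrentHashMap.newKeySet();

    /** Atomically claims the host; returns false if it is already claimed. */
    boolean tryAcquire(UUID hostId) {
        return busyHosts.add(hostId);
    }

    /** Frees the host so a later operation can claim it. */
    void release(UUID hostId) {
        busyHosts.remove(hostId);
    }

    /** Runs the operation only if the host is free; otherwise fails fast. */
    boolean runExclusively(UUID hostId, Runnable operation) {
        if (!tryAcquire(hostId)) {
            return false; // a concurrent operation on this host is in progress
        }
        try {
            operation.run();
            return true;
        } finally {
            release(hostId);
        }
    }

    public static void main(String[] args) {
        HostOperationGuard guard = new HostOperationGuard();
        UUID hostA = UUID.randomUUID();
        UUID hostB = UUID.randomUUID();

        boolean firstClick = guard.tryAcquire(hostA);   // accepted
        boolean secondClick = guard.tryAcquire(hostA);  // rejected, same host
        boolean otherHost = guard.tryAcquire(hostB);    // accepted, different host
        System.out.println(firstClick + " " + secondClick + " " + otherHost);

        guard.release(hostA);
        guard.release(hostB);
    }
}
```

Keying the lock by host ID addresses the objection to option (b): only a second request for the same host is refused, so installations on different hosts are never serialized.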
Comment 5 Alon Bar-Lev 2013-08-11 09:29:29 EDT
David,

A bug that is modified was moved to post because of automation.

Not sure I understand why.

Thanks,
Alon
Comment 6 Itamar Heim 2013-08-11 10:31:53 EDT
(In reply to Alon Bar-Lev from comment #5)
> David,
> 
> A bug that is modified was moved to post because of automation.
> 
> Not sure I understand why.
> 
> Thanks,
> Alon

because a new patch linking to this bug was posted in gerrit?
http://gerrit.ovirt.org/#/c/17928/
Comment 7 Alon Bar-Lev 2013-08-11 10:37:11 EDT
(In reply to Itamar Heim from comment #6)
> (In reply to Alon Bar-Lev from comment #5)
> > David,
> > 
> > A bug that is modified was moved to post because of automation.
> > 
> > Not sure I understand why.
> > 
> > Thanks,
> > Alon
> 
> because a new patch linking to this bug was posted in gerrit?
> http://gerrit.ovirt.org/#/c/17928/

This is for stable upstream. It has nothing to do with this downstream product.

Once again we are mixing up upstream development and downstream: we abuse the downstream Bugzilla to manage future upstream development.

On top of that we add automation, which cannot get it right because we manage the project incorrectly.
Comment 13 Alon Bar-Lev 2013-08-13 05:19:23 EDT
Again... too much automation.
Comment 15 Alon Bar-Lev 2013-08-14 06:53:29 EDT
Note to QA:

This change affects all of the following:
1. Node/Host installation.
2. Host re-installation.
3. Node approval.
4. Node upgrade.

The fix enforces that no parallel operations can run on the same host.

However, please also check that this fix does not affect the parallelism of operations on different hosts.
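
The two properties to verify can be illustrated with a small self-contained simulation. It is only a sketch under stated assumptions: `ParallelDeployCheck`, `countAccepted`, and the host names are hypothetical, and the in-memory set merely stands in for whatever per-host lock the engine uses. It does not drive a real engine or a real host.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation; the set below stands in for the engine's per-host lock.
final class ParallelDeployCheck {
    private static final Set<String> BUSY_HOSTS = ConcurrentHashMap.newKeySet();

    /** Fires one simulated request per given host concurrently; returns how many were accepted. */
    static int countAccepted(String... hosts) {
        CountDownLatch attempted = new CountDownLatch(hosts.length);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger accepted = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();

        for (String host : hosts) {
            Thread worker = new Thread(() -> {
                boolean started = BUSY_HOSTS.add(host); // atomic claim of the host
                attempted.countDown();
                if (!started) {
                    return; // rejected: the host already has an operation running
                }
                accepted.incrementAndGet();
                try {
                    release.await(); // keep the host busy until every request has tried
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    BUSY_HOSTS.remove(host);
                }
            });
            workers.add(worker);
            worker.start();
        }

        try {
            attempted.await();    // all requests have been made
            release.countDown();  // let the accepted operations finish
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            release.countDown();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return accepted.get();
    }

    public static void main(String[] args) {
        // Three simultaneous requests for the same host: only one is accepted.
        System.out.println(countAccepted("host-a", "host-a", "host-a")); // prints 1
        // Three simultaneous requests for different hosts: all are accepted.
        System.out.println(countAccepted("host-a", "host-b", "host-c")); // prints 3
    }
}
```

The accepted operations are held open until every request has been attempted, so the same-host case cannot pass by accident because one operation happened to finish before the next one started.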
Comment 16 Charlie 2013-11-27 19:21:51 EST
This bug is currently attached to errata RHEA-2013:15231. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.
Comment 17 Tareq Alayan 2013-12-31 04:00:13 EST
verified
ovirt-host-deploy-1.1.3-1.el6ev.noarch
rhevm-3.3.0-0.42.el6ev.noarch
Comment 18 errata-xmlrpc 2014-01-21 12:35:38 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0038.html
