Description of problem:
[RFE] block simultaneously running cluster upgrades

There needs to be a lock such that only one cluster upgrade can ever run at a time per cluster. It could cause bad environmental consequences if multiple instances of the upgrade role were running simultaneously. Even if it somehow succeeded on the backend, the notification stream in the Admin Portal would be confusing.

Version-Release number of selected component (if applicable):
4.3, master

How reproducible:
always

Steps to Reproduce:
1. Run cluster upgrade
2. Quickly run cluster upgrade again on that same cluster before the previous upgrade finishes

Actual results:
You can run cluster upgrade again on that same cluster before the previous upgrade finishes

Expected results:
You should not be able to run cluster upgrade again on that same cluster before any existing upgrade on that cluster finishes.

Additional info:
The role can (and often should) be run manually and/or via Tower, so we need some shared lock, maybe at role or playbook level. Engine would also need to gracefully detect that failure-due-to-lock condition.
@Ondra and Martin, can you share your thoughts?
@Ondra and Martin, same comment as in https://bugzilla.redhat.com/show_bug.cgi?id=1664844. This is kind of a mostly infra team feature :) I have it on UX now and Scott can lead, but let's see how it progresses. If heavy role work is needed, perhaps Ondra can assist or lead.
Created attachment 1528289 [details] Demo of the confirmation dialog This attachment is a demo gif of the new spinner and "cluster_maintenance" warning dialog.
Created attachment 1528290 [details] Demo of the confirmation dialog (gif) This attachment is a demo gif of the new spinner and "cluster_maintenance" warning dialog.
See the demo for how the confirm dialog works. The cluster upgrade role is unchanged, but the dialog should deter people from running the operation twice (assuming they did not uncheck the "change cluster to maintenance mode" box on the options step).
Ondra agreed to take over the backend part. Assigning to him for now.
I don't see any reliable way to prevent 2 simultaneous cluster upgrade processes from running on the same cluster other than:

1. Create a field in the cluster entity indicating that a cluster upgrade is currently running for the cluster
2. Each time a cluster upgrade is about to be executed (either from the Ansible playbook or the UI), check with the backend whether another cluster upgrade process is already running on the same cluster (if it is, fail this new cluster upgrade process)
3. If not, set the field to indicate a cluster upgrade is running and continue with the cluster upgrade flow
4. On successful completion or on error, clear the field to allow additional cluster upgrades
5. We will need to provide a "force" option in the cluster upgrade role (which means also in the Ansible module and RESTAPI) to forcefully execute a cluster upgrade, which may be required if, for example, the Ansible process is forcefully killed. But usage of this parameter will be logged into audit log and additional issues

The above means changes in the database, RESTAPI, Ansible module and finally the cluster-upgrade role, which are unlikely to happen before downstream GA, so targeting to 4.4 and let's hope we will be able to backport into some later 4.3.z
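For illustration only, the flow in steps 1-5 could be sketched as below. This is a minimal in-memory model, not engine code: the class and function names (ClusterUpgradeLock, UpgradeInProgressError, run_upgrade) are hypothetical, the set stands in for the proposed field on the cluster entity, and the list stands in for the audit log. A real implementation would need the check-and-set to happen atomically on the backend.

```python
import threading


class UpgradeInProgressError(Exception):
    """Raised when an upgrade is already flagged as running for the cluster."""


class ClusterUpgradeLock:
    """Hypothetical stand-in for the 'upgrade running' field on the cluster entity."""

    def __init__(self):
        self._guard = threading.Lock()  # makes the check-and-set below atomic
        self._running = set()           # ids of clusters flagged as upgrading
        self.audit_log = []             # stand-in for the engine audit log

    def acquire(self, cluster_id, force=False):
        """Steps 2 and 3: fail if the flag is set, otherwise set it."""
        with self._guard:
            if cluster_id in self._running:
                if not force:
                    raise UpgradeInProgressError(cluster_id)
                # Step 5: forcing past a stale flag is recorded for auditing.
                self.audit_log.append(
                    f"forced upgrade start on cluster {cluster_id}"
                )
            self._running.add(cluster_id)

    def release(self, cluster_id):
        """Step 4: clear the flag so later upgrades are allowed."""
        with self._guard:
            self._running.discard(cluster_id)


def run_upgrade(lock, cluster_id, upgrade, force=False):
    """Run the upgrade callable while holding the per-cluster flag."""
    lock.acquire(cluster_id, force=force)
    try:
        return upgrade()
    finally:
        # Cleared on success and on error alike. A forcefully killed process
        # never reaches this point, which is why 'force' is needed.
        lock.release(cluster_id)
```

In this sketch a second call for the same cluster fails immediately with UpgradeInProgressError, while upgrades on other clusters are unaffected; the caller (role or UI) would be responsible for reporting that failure gracefully.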
*** Bug 1686808 has been marked as a duplicate of this bug. ***
Note: See BZ1687645
Verified in ovirt-ansible-cluster-upgrade-1.1.13-1.el7ev.noarch ovirt-engine-4.3.3.1-0.1.el7.noarch
This bugzilla is included in oVirt 4.3.3 release, published on April 16th 2019. Since the problem described in this bug report should be resolved in oVirt 4.3.3 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.