Bug 1990231 - Setting a host to maintenance shouldn't be blocked when having 'active' image transfer
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ovirt-4.5.3
Target Release: ---
Assignee: Artiom Divak
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-05 05:58 UTC by Eyal Shenitzky
Modified: 2022-10-03 19:01 UTC (History)

Fixed In Version: ovirt-engine-4.5.3.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-10-03 19:01:06 UTC
oVirt Team: Storage
Embargoed:
pm-rhel: ovirt-4.5?
pm-rhel: planning_ack?
pm-rhel: devel_ack+
pm-rhel: testing_ack?




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | oVirt ovirt-engine pull 632 | 0 | None | open | core: VDS Maintenance will wait for all disk transfers to happen and … | 2022-09-05 05:53:47 UTC
Red Hat Issue Tracker | RHV-42963 | 0 | None | None | None | 2021-09-21 11:25:02 UTC

Description Eyal Shenitzky 2021-08-05 05:58:01 UTC
Description of problem:

Currently, when a host is set to maintenance while an active image transfer is running on it (a transfer whose status is anything other than paused/finished/failed), the operation is blocked until the transfer finishes or is paused.

We should never block this operation; the PreparingForMaintenance state exists specifically to handle tasks that need to finish before the final Maintenance state is set.

This needlessly blocks people from initiating maintenance requests.


Version-Release number of selected component (if applicable):
4.4.7

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a disk
2. Start downloading the disk using ImageIO
3. Set the host that handles the transfer to maintenance

Actual results:
Setting the host to maintenance is blocked with a warning that there is an image transfer process on it.

Expected results:
Setting the host to maintenance should succeed and move the host to the PreparingForMaintenance state, with a proper warning telling the user that the running image transfer must end before the host enters maintenance.
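
The expected flow above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the engine's code: SketchHostState, MaintenanceRequestResult, MaintenanceFlowSketch, the warning text, and the idea of a periodic monitoring callback are all hypothetical names and simplifications introduced here.

```java
import java.util.Optional;

// Hypothetical, simplified host states; the real engine enum has more values.
enum SketchHostState { UP, PREPARING_FOR_MAINTENANCE, MAINTENANCE }

// Outcome of a maintenance request: the new host state plus an optional user-facing warning.
final class MaintenanceRequestResult {
    final SketchHostState newState;
    final Optional<String> warning;

    MaintenanceRequestResult(SketchHostState newState, Optional<String> warning) {
        this.newState = newState;
        this.warning = warning;
    }
}

final class MaintenanceFlowSketch {
    // The request is never rejected because of transfers; active transfers only add a warning.
    static MaintenanceRequestResult requestMaintenance(int activeTransfersOnHost) {
        Optional<String> warning = activeTransfersOnHost > 0
                ? Optional.of("Waiting for " + activeTransfersOnHost
                        + " active image transfer(s) to end before entering maintenance")
                : Optional.empty();
        return new MaintenanceRequestResult(SketchHostState.PREPARING_FOR_MAINTENANCE, warning);
    }

    // Assumed to be called periodically: finish the move once no active transfer is left.
    static SketchHostState onMonitoringCycle(SketchHostState current, int activeTransfersOnHost) {
        if (current == SketchHostState.PREPARING_FOR_MAINTENANCE && activeTransfersOnHost == 0) {
            return SketchHostState.MAINTENANCE;
        }
        return current;
    }
}
```

In this sketch the host stays in PREPARING_FOR_MAINTENANCE for as long as onMonitoringCycle sees a non-zero transfer count, and switches to MAINTENANCE on the first cycle that sees zero.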

Additional info:

Comment 1 Michal Skrivanek 2021-08-06 06:43:33 UTC
There are a few validations for similar ongoing tasks (e.g. jobs); if possible, those should be addressed as well.

Comment 2 Nir Soffer 2021-08-06 16:30:16 UTC
Before we remove the validation, we must ensure that the code handling the
preparing for maintenance state takes active image transfers into account.
Otherwise the host may be disconnected from storage while an image transfer is active.

Previously we did not need to handle this case because we had the validation.

Same issue for other ongoing tasks (comment 1) that are likely not handled yet.

Comment 4 Sandro Bonazzola 2022-03-29 16:16:40 UTC
We are past 4.5.0 feature freeze, please re-target.

Comment 5 Michal Skrivanek 2022-04-20 12:09:09 UTC
this is supposedly not that hard, and still quite useful.

Comment 6 Arik 2022-05-24 14:47:37 UTC
To fix this we need to make two changes:
1. In VirtMonitoringStrategy#canMoveToMaintenance we also need to check if there's an ongoing transfer on the host
2. In TransferDiskImage we need to make sure not to start a transfer on a host that is in PreparingForMaintenance status
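
The two changes above might look roughly like the sketch below. It is not the actual ovirt-engine code: HostStatus, TransferPhase, Transfer, and MaintenanceChecksSketch are hypothetical stand-ins, the running-VM count is passed in as a plain integer, and treating only UP hosts as eligible for new transfers is an assumption that goes slightly beyond what the comment asks for.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for engine types.
enum HostStatus { UP, PREPARING_FOR_MAINTENANCE, MAINTENANCE }

enum TransferPhase { TRANSFERRING, PAUSED, FINISHED, FAILED }

final class Transfer {
    final String hostId;
    final TransferPhase phase;

    Transfer(String hostId, TransferPhase phase) {
        this.hostId = hostId;
        this.phase = phase;
    }

    // "Active" follows the bug description: any status other than paused/finished/failed.
    boolean isActive() {
        return phase != TransferPhase.PAUSED
                && phase != TransferPhase.FINISHED
                && phase != TransferPhase.FAILED;
    }
}

final class MaintenanceChecksSketch {
    // Change 1: the host may complete the move only when no VM runs on it
    // and no active transfer is bound to it.
    static boolean canMoveToMaintenance(String hostId, int runningVmsOnHost, List<Transfer> transfers) {
        boolean hasActiveTransfer = transfers.stream()
                .anyMatch(t -> t.hostId.equals(hostId) && t.isActive());
        return runningVmsOnHost == 0 && !hasActiveTransfer;
    }

    // Change 2 (assumption: only UP hosts are eligible, which excludes
    // hosts that are preparing for maintenance).
    static boolean canStartTransferOn(HostStatus status) {
        return status == HostStatus.UP;
    }
}
```

With both checks in place, a host that is preparing for maintenance accepts no new transfers, so the set of transfers that change 1 waits for can only shrink.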

Comment 7 Nir Soffer 2022-05-24 14:55:29 UTC
(In reply to Arik from comment #6)
> To fix this we need to make two changes:
> 1. In VirtMonitoringStrategy#canMoveToMaintenance we also need to check if
> there's an ongoing transfer on the host
> 2. In TransferDiskImage we need to make sure not to start a transfer on a
> host that is in PreparingForMaintenance status

Not starting a transfer on a host in PreparingForMaintenance sounds like a nice improvement,
but it is not enough.

If there are already active transfers, we need to either wait for them or cancel them,
but currently cancelling an image transfer is flaky and likely to end in a stuck transfer when
the user cancels the transfer after a failure (same issue we had in backup).

Comment 8 Arik 2022-05-25 06:53:40 UTC
(In reply to Nir Soffer from comment #7)
> (In reply to Arik from comment #6)
> > To fix this we need to make two changes:
> > 1. In VirtMonitoringStrategy#canMoveToMaintenance we also need to check if
> > there's an ongoing transfer on the host
> > 2. In TransferDiskImage we need to make sure not to start a transfer on a
> > host that is in PreparingForMaintenance status
> 
> Not starting a transfer on a host in PreparingForMaintenance sounds like a nice
> improvement,
> but it is not enough.
> 
> If there are already active transfers, we need to either wait for them or
> cancel them,
> but currently cancelling an image transfer is flaky and likely to end in a stuck
> transfer when
> the user cancels the transfer after a failure (same issue we had in backup).

Right, that's covered by #1 above, just as we do for VMs that run on the host.

Comment 9 Shir Fishbain 2022-05-30 19:52:36 UTC
QE doesn't have the capacity to verify during 4.5.1

Comment 10 Casper (RHV QE bot) 2022-10-03 19:01:06 UTC
This bug has low overall severity and passed an automated regression suite, and is not going to be further verified by QE. If you believe special care is required, feel free to re-open to ON_QA status.

