Description of problem:
Currently, when setting a host to maintenance while an active image transfer is running on it (with a status other than paused/finished/failed), the operation is blocked until the transfer finishes or is paused. We should never block this operation; we have a PreparingForMaintenance state specifically to handle tasks that need to finish before the final Maintenance state is set. The validation needlessly blocks people from initiating maintenance requests.

Version-Release number of selected component (if applicable):
4.4.7

How reproducible:
100%

Steps to Reproduce:
1. Create a VM with a disk
2. Start downloading the disk using ImageIO
3. Set the host that handles the transfer to maintenance

Actual results:
Setting the host to maintenance is blocked with a warning that there is an image transfer process on it.

Expected results:
Setting the host to maintenance should succeed, with a proper warning telling the user that the running image transfer must end before the host reaches maintenance, and the host should move to the PreparingForMaintenance state.

Additional info:
There are a few validations for similar ongoing tasks (e.g. jobs); if possible, those should be addressed as well.
Before we remove the validation, we must ensure that the code handling the PreparingForMaintenance state takes active image transfers into account. Otherwise the host may be disconnected from storage while an image transfer is active. Previously we did not need to handle this case because we had the validation. The same applies to the other ongoing tasks (comment 1), which are likely not handled yet.
We are past 4.5.0 feature freeze, please re-target.
This is supposedly not that hard, and still quite useful.
To fix this we need to make two changes:
1. In VirtMonitoringStrategy#canMoveToMaintenance we also need to check whether there's an ongoing transfer on the host.
2. In TransferDiskImage we need to make sure not to start a transfer on a host that is in PreparingForMaintenance status.
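A rough sketch of the two changes, to make the intended behavior concrete. This is not the engine code: `HostStatus`, `TransferPhase`, `Transfer` and `MaintenanceSketch` are hypothetical, simplified stand-ins, and the real VirtMonitoringStrategy and TransferDiskImage classes have different signatures. It only assumes that a transfer counts as active when its status is not paused/finished/failed, as stated in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for engine types; names are assumptions for illustration.
enum HostStatus { UP, PREPARING_FOR_MAINTENANCE, MAINTENANCE }

enum TransferPhase { TRANSFERRING, PAUSED, FINISHED, FAILED }

final class Transfer {
    final String hostId;
    TransferPhase phase;

    Transfer(String hostId, TransferPhase phase) {
        this.hostId = hostId;
        this.phase = phase;
    }

    // Active means any status other than paused/finished/failed.
    boolean isActive() {
        return phase != TransferPhase.PAUSED
                && phase != TransferPhase.FINISHED
                && phase != TransferPhase.FAILED;
    }
}

final class MaintenanceSketch {
    private final List<Transfer> transfers = new ArrayList<>();

    boolean hasActiveTransfer(String hostId) {
        return transfers.stream()
                .anyMatch(t -> t.hostId.equals(hostId) && t.isActive());
    }

    // Change 1: the host stays in PreparingForMaintenance until no VMs
    // and no active image transfers remain on it.
    boolean canMoveToMaintenance(String hostId, int runningVms) {
        return runningVms == 0 && !hasActiveTransfer(hostId);
    }

    // Change 2: refuse to start a new transfer on a host that is not Up,
    // which includes a host that is preparing for maintenance.
    Transfer startTransfer(String hostId, HostStatus hostStatus) {
        if (hostStatus != HostStatus.UP) {
            throw new IllegalStateException(
                    "Host " + hostId + " is " + hostStatus + ", not starting transfer");
        }
        Transfer transfer = new Transfer(hostId, TransferPhase.TRANSFERRING);
        transfers.add(transfer);
        return transfer;
    }

    public static void main(String[] args) {
        MaintenanceSketch sketch = new MaintenanceSketch();
        Transfer transfer = sketch.startTransfer("host-1", HostStatus.UP);
        System.out.println(sketch.canMoveToMaintenance("host-1", 0));
        transfer.phase = TransferPhase.FINISHED;
        System.out.println(sketch.canMoveToMaintenance("host-1", 0));
    }
}
```

With both checks in place, the maintenance request itself never needs to be blocked: the host waits in PreparingForMaintenance while existing transfers drain, and no new transfer can land on it in the meantime.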
(In reply to Arik from comment #6)
> To fix this we need to make two changes:
> 1. In VirtMonitoringStrategy#canMoveToMaintenance we also need to check if
> there's an ongoing transfer on the host
> 2. In TransferDiskImage we need to make sure not to start a transfer on a
> host that is in PreparingForMaintenance status

Not starting a transfer on a host in PreparingForMaintenance sounds like a nice improvement, but it is not enough.

If there are already active transfers we need to either wait for them or cancel them, but currently cancelling an image transfer is flaky and likely to end in a stuck transfer when the user cancels the transfer after a failure (same issue we had in backup).
(In reply to Nir Soffer from comment #7)
> (In reply to Arik from comment #6)
> > To fix this we need to make two changes:
> > 1. In VirtMonitoringStrategy#canMoveToMaintenance we also need to check if
> > there's an ongoing transfer on the host
> > 2. In TransferDiskImage we need to make sure not to start a transfer on a
> > host that is in PreparingForMaintenance status
>
> Not starting a transfer on a host in PreparingForMaintenance sounds like a nice
> improvement, but it is not enough.
>
> If there are already active transfers we need to either wait for them or
> cancel them, but currently cancelling an image transfer is flaky and likely
> to end in a stuck transfer when the user cancels the transfer after a failure
> (same issue we had in backup).

Right, that's covered by #1 above, just like we do for VMs that run on the host.
QE doesn't have the capacity to verify during 4.5.1
This bug has low overall severity and passed an automated regression suite, and is not going to be further verified by QE. If you believe special care is required, feel free to re-open to ON_QA status.