Bug 1723794
Summary: | [downstream clone - 4.3.5] Live Merge hung in the volume deletion phase, leaving snapshot in a LOCKED state | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | RHV bug bot <rhv-bugzilla-bot>
Component: | ovirt-engine | Assignee: | Eyal Shenitzky <eshenitz>
Status: | CLOSED ERRATA | QA Contact: | Eyal Shenitzky <eshenitz>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 4.2.5 | CC: | aefrat, eshenitz, gveitmic, gwatson, lsvaty, mkalinin, Rhev-m-bugs, tnisan
Target Milestone: | ovirt-4.3.5 | Keywords: | ZStream
Target Release: | 4.3.5 | Flags: | lsvaty: testing_plan_complete-
Hardware: | Unspecified | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | ovirt-engine-4.3.5.2 | Doc Type: | No Doc Update
Doc Text: | | Story Points: | ---
Clone Of: | 1637172 | Environment: |
Last Closed: | 2019-08-12 11:53:28 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1637172 | |
Bug Blocks: | | |
Description
RHV bug bot 2019-06-25 12:12:47 UTC
Hey Gordon, did you manage to reproduce this bug? (Originally by Eyal Shenitzky)

Hey Gordon, it is impossible to debug from this log; the environment was in a huge mess and there were a lot of live merge attempts. Please try to reproduce this issue with clear steps. (Originally by Eyal Shenitzky)

This bug has not been marked as a blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1. (Originally by Sandro Bonazzola)

Reproducing this bug on a non-development environment may be difficult. Steps to reproduce:

1. Create a VM with a disk.
2. Create a snapshot of the VM that contains the disk.
3. Run the VM.
4. Delete the snapshot from step 2.
5. While the snapshot is being deleted (live merge), after the 'MERGE_STATUS' step, block the communication between the engine and the SPM. This must be done before the engine sends the SPM the command to start the deleteImage command.

The expected result after the fix: the attempt to delete the volume will fail and a second attempt will take place. If the communication is still blocked, the second attempt will also fail -> snapshot deletion will fail, but the snapshot will not remain 'locked'. If the communication is OK, the second attempt will succeed -> snapshot deletion will succeed. (Originally by Eyal Shenitzky)

As discussed in the mail thread and written in the previous comment, this bug is almost impossible to verify in QE (a non-development environment). Eyal S. will verify this bug once it is merged on master and a cherry-pick is created. Reassigning QA Contact to Eyal to verify on the current target milestone (4.3.5). (Originally by Avihai Efrat)

Eyal, a kind reminder to please verify this bug :) Thank you for your help.

Verified locally on my dev environment by throwing an exception.
Steps:

1. Create a VM with a disk.
2. Create a snapshot of the VM.
3. Run the VM.
4. Add an exception to be thrown in `executeCommand`:

```java
@Override
protected void executeCommand() {
    getParameters().setEntityInfo(
            new EntityInfo(VdcObjectType.Disk, getParameters().getImageGroupId()));
    VDSReturnValue vdsReturnValue = null;
    try {
        // Injected for verification: simulate a DestroyImage failure.
        throw new EngineException();
        // vdsReturnValue = runVdsCommand(VDSCommandType.DestroyImage, createVDSParameters());
    } catch (EngineException e) {
        log.error("Failed to delete image {}/{}",
                getParameters().getImageGroupId(),
                getParameters().getImageList().stream().findFirst().get(), e);
        if (!getParameters().isLiveMerge()) {
            throw e;
        }
    }
    // ....
}
```

5. Remove the created snapshot -> DestroyImageCommand fails; live merge keeps retrying until it succeeds.
6. Remove the added exception -> DestroyImageCommand succeeds and the snapshot is removed.

Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2431
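The retry behaviour described in the expected result can be sketched as follows. This is a hypothetical, simplified illustration, not the actual ovirt-engine code: `SpmClient`, `destroyWithRetry`, and the attempt count are assumptions made for the sketch. The point is that a failed DestroyImage call triggers a retry, and if the retry also fails, the operation fails cleanly instead of leaving the snapshot LOCKED.

```java
// Hypothetical sketch of the retry behaviour; names and the attempt
// count are assumptions, not the real ovirt-engine implementation.
public class DestroyImageRetrySketch {

    /** Minimal stand-in for the engine-to-SPM call that deletes a volume. */
    interface SpmClient {
        void destroyImage(String imageId) throws Exception;
    }

    static final int MAX_ATTEMPTS = 2; // assumption: one retry, per the comment above

    /** Returns true if the image was destroyed, false if all attempts failed. */
    static boolean destroyWithRetry(SpmClient spm, String imageId) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                spm.destroyImage(imageId);
                return true;
            } catch (Exception e) {
                System.out.printf("Attempt %d to delete image %s failed: %s%n",
                        attempt, imageId, e.getMessage());
            }
        }
        return false; // caller reports failure and releases the snapshot lock
    }

    public static void main(String[] args) {
        // Simulated SPM whose first call fails (communication blocked), second succeeds.
        SpmClient flaky = new SpmClient() {
            int calls = 0;
            public void destroyImage(String imageId) throws Exception {
                if (calls++ == 0) {
                    throw new Exception("engine-to-SPM communication blocked");
                }
            }
        };
        System.out.println("deleted=" + destroyWithRetry(flaky, "img-1"));

        // Simulated SPM that always fails: both attempts fail, but the
        // operation still finishes, so the snapshot is not left LOCKED.
        SpmClient down = imageId -> { throw new Exception("still blocked"); };
        System.out.println("deleted=" + destroyWithRetry(down, "img-2"));
    }
}
```

In this sketch, the transiently failing client succeeds on the retry (`deleted=true`), while the permanently failing client exhausts both attempts and returns `deleted=false`, modelling the "snapshot deletion fails but does not stay locked" outcome.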