Bug 1538840

Summary: Disk move between storage domains results in source image being removed.
Product: Red Hat Enterprise Virtualization Manager
Reporter: Ribu Tho <rabraham>
Component: vdsm
Assignee: Fred Rolland <frolland>
Status: CLOSED ERRATA
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: high
Version: 4.1.8
CC: bcholler, ebenahar, frolland, gveitmic, lsurette, lveyde, mkalinin, rabraham, ratamir, srevivo, tnisan, ycui, ykaul, ylavi
Target Milestone: ovirt-4.2.2
Keywords: ZStream
Target Release: ---
Flags: lsvaty: testing_plan_complete-
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: vdsm v4.20.18
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1544240 (view as bug list)
Environment:
Last Closed: 2018-05-15 17:54:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1540310, 1544240

Description Ribu Tho 2018-01-26 00:10:56 UTC
Description of problem:

A disk move between storage domains resulted in the image being removed from the source SD before the copy/convert (qemu-img) operation to the destination host was performed.

Version-Release number of selected component (if applicable):

ovirt-engine-4.1.8.2-0.1.el7.noarch
vdsm-4.19.43-3.el7ev.x86_64
FC Block storage 

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:

- The disk image was removed from the source, so the VM is missing its disk and is unable to boot.

Expected results:


- The disk image is copied to the destination SD and only then removed from the source.

Additional info:

Comment 6 Fred Rolland 2018-02-05 15:30:45 UTC
From the log, it seems that the move command was run twice with the same parameters.
This is not possible with the REST API.
However, the user mentioned that he might have clicked twice in the UI dialogue.

I will try to reproduce on 4.1.

Comment 10 Fred Rolland 2018-02-06 10:10:37 UTC
I can reproduce this in 4.1,
though only by using the debugger to update the source storage domain parameter of the command; it is not possible to do with the REST API.

The move operation was performed twice with the same parameters from the UI.
I don't know how it happened, but the reproduction shows the exact same flow in the logs.

I will try to understand now how this flow affected the volume.

Comment 11 Fred Rolland 2018-02-06 10:57:27 UTC
This is the flow of this issue:

The user moves disk A from SD_src to SD_dst via the UI.
The move succeeds. Disk A is now on SD_dst, and the disk on SD_src is deleted.

Again (by mistake or because of a UI issue), the user moves disk A from SD_src to SD_dst via the UI.
The engine tries to create a volume on SD_dst but fails, because it already exists there.
Then MoveImageGroupCommand executes endWithFailure, which deletes the disk on SD_dst, which was the current disk copy (I guess it is trying to clean up after the failure).
Now the disk is gone...

Now, when the VM tries to start, it fails to find the volume, because it was deleted.


I don't know what issue happened in the UI.
The user mentioned in the case:
"I might have click twice on OK button of the move dialogue, since it didn't disappear when I click the first time."

The one thing I can think of is to add some validation in the Move command to avoid this flow again.

Note that in 4.2, the command fails before trying to create the destination disk, so it cannot be reproduced there.
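
For illustration only, the flow and the proposed validation can be modelled in a few lines of Python. This is a simplified sketch, not the ovirt-engine code: move_disk, MoveValidationError and the dict-of-sets storage model are all hypothetical names chosen for the example, and the real validation may look quite different.

```python
# Hypothetical model of the flow described above; not the ovirt-engine code.

class MoveValidationError(Exception):
    """Raised when a move request is rejected before any storage change."""


def move_disk(domains, disk_id, src, dst, validate=True):
    """Move disk_id from domains[src] to domains[dst].

    domains maps a storage domain name to the set of disk ids it holds.
    """
    if validate and disk_id not in domains[src]:
        # Proposed check: the disk must actually be on the source domain.
        # A repeated request (for example a second click) fails here,
        # before anything on the destination is touched.
        raise MoveValidationError(f"disk {disk_id} is not on {src}")

    try:
        if disk_id in domains[dst]:
            raise RuntimeError("volume already exists on destination")
        domains[dst].add(disk_id)      # create the destination volume
        domains[src].remove(disk_id)   # copy done, remove the source
    except RuntimeError:
        # Failure cleanup as in the flow above: the destination disk is deleted.
        domains[dst].discard(disk_id)
        raise


domains = {"SD_src": {"A"}, "SD_dst": set()}
move_disk(domains, "A", "SD_src", "SD_dst")        # first move succeeds
try:
    move_disk(domains, "A", "SD_src", "SD_dst")    # repeated request
except MoveValidationError:
    pass                                           # rejected, disk A stays on SD_dst
```

Calling move_disk with validate=False on the repeated request reproduces the failure in this model: the create step fails, the cleanup removes disk A from SD_dst, and no copy of the disk is left.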

Comment 12 Fred Rolland 2018-02-06 13:21:25 UTC
Actually, in 4.2 it will also delete the disk on failure.
It should be solved there as well.

Comment 13 Allon Mureinik 2018-02-07 16:33:32 UTC
*** Bug 1540351 has been marked as a duplicate of this bug. ***

Comment 15 RHV bug bot 2018-02-16 16:24:57 UTC
INFO: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Project 'ovirt-engine'/Component 'vdsm' mismatch]

For more info please contact: rhv-devops

Comment 18 Kevin Alon Goldblatt 2018-02-27 12:30:39 UTC
Verified with the following code:
---------------------------------------
ovirt-engine-4.2.2.1-0.1.el7.noarch
vdsm-4.20.19-1.el7ev.x86_64

Verified with the following scenario:
---------------------------------------
1. Create a VM with disks
2. Move disk to another domain >>>>> When pressing OK the Move Disk dialogue closes immediately and does not allow pressing OK twice.


Moving to VERIFIED

Comment 19 RHV bug bot 2018-03-16 15:03:48 UTC
INFO: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Project 'ovirt-engine'/Component 'vdsm' mismatch]

For more info please contact: rhv-devops

Comment 22 Tal Nisan 2018-05-03 10:47:55 UTC
*** Bug 1574346 has been marked as a duplicate of this bug. ***

Comment 25 errata-xmlrpc 2018-05-15 17:54:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1489

Comment 26 Franta Kust 2019-05-16 13:07:07 UTC
BZ<2>Jira Resync

Comment 27 Daniel Gur 2019-08-28 13:13:53 UTC
sync2jira

Comment 28 Daniel Gur 2019-08-28 13:18:07 UTC
sync2jira