Bug 689250

Summary: VDSM: deleting snapshot during vdsmd restart causes task to fail -> trying to delete snapshot after failure will cause ERROR: "Image is not a legal chain"
Product: Red Hat Enterprise Linux 5 Reporter: Dafna Ron <dron>
Component: vdsm22 Assignee: Igor Lvovsky <ilvovsky>
Status: CLOSED WONTFIX QA Contact: yeylon <yeylon>
Severity: high Docs Contact:
Priority: high    
Version: 5.6 CC: abaron, bazulay, danken, iheim, lpeer, srevivo, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
If vdsm is restarted while deleting a snapshot, that snapshot may become unusable and undeletable.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-04-10 07:33:52 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Attachments:
Description Flags
logs none

Description Dafna Ron 2011-03-20 15:00:00 UTC
Created attachment 486474
logs

Description of problem:

Restarting vdsm during a snapshot delete causes the task to fail.
If the image has already been deleted on the VDS, then we are unable to delete or preview the snapshot.

I tried this with 3 snapshots and deleted the middle one; this causes the whole chain to become unusable.


Version-Release number of selected component (if applicable):

ic105

How reproducible:
100%

Steps to Reproduce:
1. Delete the middle snapshot of 3 snapshots
2. Wait until the delete starts (about 10 seconds)
3. Restart vdsmd
  
Actual results:

The result of this test depends on the phase at which the task failed in vdsm.
If the image has not yet been deleted, then the failed task has no effect other than the failure of the delete itself.
If the image has already been deleted, then we are unable to delete the snapshot, and preview also fails with an error in the backend: wrong Vm Snapshot.
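
To illustrate why an interrupted delete of the middle snapshot can leave the whole chain unusable, here is a minimal sketch in Python. It is not vdsm code: the dict-based volume model and the names is_legal_chain, interrupted_delete and completed_delete are assumptions made up for this example. The idea is that a chain is legal only if every volume's parent link leads, step by step, to the base volume.

```python
# Hypothetical model (not vdsm's): each volume id maps to its parent id,
# and None marks the base volume of the chain.

def is_legal_chain(volumes, leaf):
    """Return True if following parent links from leaf reaches the base."""
    seen = set()
    current = leaf
    while current is not None:
        if current not in volumes or current in seen:
            return False  # dangling parent link or a cycle
        seen.add(current)
        current = volumes[current]
    return True


def interrupted_delete(volumes, victim):
    """Remove the volume but leave its child pointing at it (task died midway)."""
    result = dict(volumes)
    del result[victim]
    return result


def completed_delete(volumes, victim):
    """Remove the volume and re-parent its children to the victim's parent."""
    parent = volumes[victim]
    return {
        vol: (parent if par == victim else par)
        for vol, par in volumes.items()
        if vol != victim
    }


chain = {"snap1": None, "snap2": "snap1", "snap3": "snap2"}
broken = interrupted_delete(chain, "snap2")
healthy = completed_delete(chain, "snap2")
```

In this model, snap3 in the interrupted case still names snap2 as its parent, so a walk from the leaf never reaches the base, which matches the kind of failure reported as "Image is not a legal chain".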

Expected results:

The backend will implement roll-forward for snapshot delete failure; we need the same from vdsm.
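
A rough sketch of what roll-forward could look like, written in Python. Everything here is hypothetical: the step names, the journal list (assumed to be stored somewhere that survives a restart) and the run_delete function are illustrations only and do not describe how vdsm or the backend actually recover tasks.

```python
# Hypothetical step names; real vdsm task recovery is not shown in this report.
STEPS = ("merge_data", "reparent_child", "remove_volume")


def run_delete(journal, log, crash_after=None):
    """Run the steps not yet recorded in the journal, in order.

    journal: list of completed step names (assumed to survive a restart).
    log: list collecting the executed steps, standing in for real work.
    crash_after: step name after which a restart is simulated.
    """
    for step in STEPS:
        if step in journal:
            continue  # already completed before the restart, so skip it
        log.append(step)      # stand-in for the real operation
        journal.append(step)  # record completion
        if step == crash_after:
            raise RuntimeError("simulated vdsmd restart")


journal, log = [], []
try:
    run_delete(journal, log, crash_after="merge_data")
except RuntimeError:
    pass  # the task died here, as in step 3 of the reproduction

# Recovery: roll forward by resuming from the persisted journal
# instead of leaving the delete half done.
run_delete(journal, log)
```

The design choice in this sketch is that recovery finishes the remaining steps rather than undoing the completed ones, so the delete ends in a consistent state regardless of where the restart happened.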

Additional info: logs attached from SPM (source) and destination (HSM)

Comment 3 Dan Kenigsberg 2011-03-31 12:28:05 UTC
Backporting the fix of bug 689253 to rhev-2.2.Z is far from trivial, and I do not think it is worth the trouble.

Can the customer admin mitigate the problem after it happened? How?

Comment 4 Dan Kenigsberg 2011-03-31 12:28:05 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
I vdsm is restarted while deleting a snapshot, that snapshot may become unusable and undeletable.

Comment 6 Dan Kenigsberg 2011-04-10 07:33:52 UTC
I'm afraid backporting is too risky and resource-consuming.

Comment 7 Dan Kenigsberg 2011-04-10 07:33:52 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-I vdsm is restarted while deleting a snapshot, that snapshot may become unusable and undeletable.
+If vdsm is restarted while deleting a snapshot, that snapshot may become unusable and undeletable.