Bug 689253 - VDSM: deleting snapshot during vdsmd restart causes task to fail -> trying to delete snapshot after failure will cause VM to get stuck on image locked
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Assignee: Igor Lvovsky
QA Contact: Dafna Ron
URL:
Whiteboard:
Depends On:
Blocks: 689221
Reported: 2011-03-20 15:30 UTC by Dafna Ron
Modified: 2013-03-01 04:53 UTC (History)

Fixed In Version: vdsm-4.9-58.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-08-19 15:18:29 UTC
Target Upstream Version:


Attachments:
logs (2.15 MB, text/plain), 2011-03-20 15:30 UTC, Dafna Ron
logs_fixed (2.15 MB, application/x-gzip), 2011-03-22 14:09 UTC, Dafna Ron

Description Dafna Ron 2011-03-20 15:30:07 UTC
Created attachment 486476: logs

Description of problem:

Restarting vdsmd while a snapshot is being deleted causes the delete task to fail.
If the image was already deleted from storage, trying to delete the snapshot again causes the VM to get stuck in the image locked state.


Version-Release number of selected component (if applicable):

ic105

vdsm-4.9-54.el6.x86_64
vdsm-debuginfo-4.9-51.el6.x86_64
vdsm-cli-4.9-54.el6.x86_64
vdsm-hook-vhostmd-4.9-53.el6.x86_64

qemu-kvm-0.12.1.2-2.146.el6.x86_64
qemu-img-0.12.1.2-2.146.el6.x86_64
gpxe-roms-qemu-0.9.7-6.4.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create 3 snapshots.
2. Delete the middle snapshot.
3. Wait 15-20 seconds, then restart vdsmd.
4. Try to delete the snapshot again.
  
Actual results:

The task fails the first time. The second time, the VM gets stuck on image locked.

Expected results:

We should implement roll-forward.
A bug for this was opened against the backend; we also need the same from vdsm.
Bug 689250 was opened for RHEL5 for the same issue, with different results.
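
To make the roll-forward request concrete, here is a minimal sketch in modern Python. It is not vdsm code: MergeTask, the step names, and the fail_after parameter are all hypothetical and exist only for illustration. It assumes each step's completion is recorded somewhere that survives a restart, so that recovery continues from the first unfinished step instead of retrying work on an image that is already gone.

```python
# Hypothetical sketch: none of these names come from vdsm.
from dataclasses import dataclass, field


@dataclass
class MergeTask:
    """Tracks which steps of a snapshot merge have already completed."""
    steps: list                                # ordered (name, callable) pairs
    done: set = field(default_factory=set)     # a real system would persist this

    def run(self, fail_after=None):
        """Run the remaining steps in order, skipping any recorded as done."""
        for name, action in self.steps:
            if name in self.done:
                continue  # roll-forward: never redo a completed step
            action()
            self.done.add(name)
            if name == fail_after:
                # Stand-in for vdsmd being restarted in the middle of the task.
                raise RuntimeError("simulated restart after step %r" % name)


log = []
task = MergeTask(steps=[
    ("merge_data", lambda: log.append("merge_data")),
    ("delete_image", lambda: log.append("delete_image")),
    ("unlock_image", lambda: log.append("unlock_image")),
])

try:
    task.run(fail_after="delete_image")  # interrupted after the image is deleted
except RuntimeError:
    pass

task.run()  # recovery resumes at unlock_image; nothing earlier is repeated
```

In this sketch the second run only performs the unlock step, which is the behaviour that would keep the VM from staying in image locked after the image has already been removed.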

Additional info: logs (the error from the weekend is in the attached log as well)

I left the VM over the weekend. Eventually you get this error:

Thread-24::DEBUG::2011-03-20 08:44:27,756::task::491::TaskManager.Task::(_debug) Task 2cfd1ac0-2734-43e7-a575-25581ae96274: moving from state init -> state preparing
Thread-24::ERROR::2011-03-20 08:44:27,931::spm::120::Storage.SPM.Secure::(run) SPM: spm method call rejected: Not SPM!!!  method: public_mergeSnapshots, called by: _run
Thread-24::ERROR::2011-03-20 08:44:27,931::task::854::TaskManager.Task::(_setError) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 862, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/spm.py", line 121, in run
    raise se.SpmStatusError(self.name)
SpmStatusError: Not SPM: ('public_mergeSnapshots',)
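
The traceback shows only that run() in spm.py raises se.SpmStatusError(self.name) when the host is not the SPM. As a rough illustration of that kind of guard, the sketch below wraps a method so it is rejected on a non-SPM host. The spm_only decorator, the Storage class, and the is_spm flag are assumptions made for this example and are not taken from vdsm.

```python
# Hypothetical sketch of an "SPM only" guard; not the actual vdsm implementation.
import functools


class SpmStatusError(Exception):
    """Raised when an SPM-only method is called on a host that is not the SPM."""

    def __str__(self):
        return "Not SPM: %s" % (self.args,)


def spm_only(method):
    """Reject the call unless the owning object reports that it is the SPM."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self.is_spm:
            raise SpmStatusError(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper


class Storage:
    def __init__(self, is_spm):
        self.is_spm = is_spm  # assumed flag; a real host would query its SPM status

    @spm_only
    def public_mergeSnapshots(self, vm_id):
        return "merged %s" % vm_id
```

Calling Storage(False).public_mergeSnapshots(...) raises SpmStatusError before the method body runs, which matches the shape of the rejection in the log: the merge is refused up front rather than failing partway through.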

Comment 3 Dafna Ron 2011-03-22 14:09:14 UTC
Created attachment 486803: logs_fixed

attached logs again - fixed

Comment 4 Tomas Dosek 2011-04-11 08:52:10 UTC
Verified on vdsm-4.9-58.el6: the VM no longer hangs in the locked state, an appropriate message is shown to the user in rhevm, and the vdsmd service restarts smoothly and ends successfully.

