Bug 843511 - 3.1 - beta2 - failure in async task (delete vm disk - deleteImage) after connection to ISO domain was restored
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.3
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: beta
Assigned To: Saggi Mizrahi
QA Contact: Gadi Ickowicz
Whiteboard: infra
Keywords: Regression, Reopened, TestBlocker
Reported: 2012-07-26 10:01 EDT by Haim
Modified: 2014-08-21 21:40 EDT
CC List: 13 users

Fixed In Version: vdsm-4.9.6-29.0.el6_3.x86_64
Doc Type: Bug Fix
Last Closed: 2012-12-05 02:35:23 EST
Type: Bug


Attachments
vdsm logs (1.24 MB, application/x-tar) - 2012-07-26 10:03 EDT, Haim
vdsm and engine logs (773.72 KB, application/x-gzip) - 2012-08-14 06:57 EDT, Gadi Ickowicz
vdsm + engine logs (612.45 KB, application/x-gzip) - 2012-08-22 04:17 EDT, Gadi Ickowicz

Description Haim 2012-07-26 10:01:17 EDT
Description of problem:

I have a case where I try to create a template from a VM, and the task fails with the following error:

095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,641::lvm::393::OperationMutex::(_reloadlvs) Operation 'lvm reload operation' got the operation mutex
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,642::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvs --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%1qe-export_import31343141|3514f0c5d6620000c|3514f0c5d6620026f|3514f0c5d66200270|3514f0c5d66200271|3514f0c5d66200272|3514f0c5d66200273%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix --separator | -o uuid,name,vg_name,attr,size,seg_start_pe,devices,tags a3ceac0d-a435-4306-bd61-2465ef9525f9' (cwd None)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,896::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687025664: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687083008: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 0: Input/output error\n  WARNING: Error counts reached a limit of 3. Device /dev/mapper/1qe-export_import31343141 was disabled\n'; <rc> = 0
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,913::lvm::422::OperationMutex::(_reloadlvs) Operation 'lvm reload operation' released the operation mutex
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,913::lvm::573::Storage.LVM::(getLv) lv: dcaf11d7-e425-4848-b092-6243ae91b58b not found in lvs vg: a3ceac0d-a435-4306-bd61-2465ef9525f9 response
095bf145-27e5-44a5-9ba7-96d85143a5f0::INFO::2012-07-26 19:10:12,914::image::94::Storage.Image::(create) Create placeholder /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/c6ae1c85-6812-404b-a2f5-6221ed204c10 for image's volumes
095bf145-27e5-44a5-9ba7-96d85143a5f0::ERROR::2012-07-26 19:10:12,915::image::767::Storage.Image::(copy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 751, in copy
    srcImgUUID=volume.BLANK_UUID, srcVolUUID=volume.BLANK_UUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 518, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 176, in create
    imageDir = image.Image(repoPath).create(sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 97, in create
    [imageDir]))
  File "/usr/share/vdsm/storage/task.py", line 1049, in pushRecovery
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
095bf145-27e5-44a5-9ba7-96d85143a5f0::INFO::2012-07-26 19:10:12,916::blockVolume::423::Storage.Volume::(teardown) Tearing down volume a3ceac0d-a435-4306-bd61-2465ef9525f9/ffe9e27e-dab6-4d78-af60-f5f9247110b8 justme False
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,917::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_lvmActivationNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8'
095bf145-27e5-44a5-9ba7-96d85143a5f0::ERROR::2012-07-26 19:10:12,917::image::662::Storage.Image::(__cleanupCopy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 658, in __cleanupCopy
    srcVol.teardown(sdUUID=srcVol.sdUUID, volUUID=srcVol.volUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 425, in teardown
    rmanager.releaseResource(lvmActivationNamespace, volUUID)
  File "/usr/share/vdsm/storage/resourceManager.py", line 550, in releaseResource
    raise ValueError("Resource '%s.%s' is not currently registered" % (namespace, name))
ValueError: Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_lvmActivationNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8' is not currently registered
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,918::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10'
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,919::resourceManager::553::ResourceManager::(releaseResource) Released resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10' (0 active users)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,919::resourceManager::558::ResourceManager::(releaseResource) Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10' is free, finding out if anyone is waiting for it.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,919::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10', Clearing records.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,920::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134'
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,921::resourceManager::553::ResourceManager::(releaseResource) Released resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134' (0 active users)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,921::resourceManager::558::ResourceManager::(releaseResource) Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134' is free, finding out if anyone is waiting for it.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,921::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8'
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,922::resourceManager::553::ResourceManager::(releaseResource) Released resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8' (0 active users)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,922::resourceManager::558::ResourceManager::(releaseResource) Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8' is free, finding out if anyone is waiting for it.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,923::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8', Clearing records.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,923::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134', Clearing records.
095bf145-27e5-44a5-9ba7-96d85143a5f0::ERROR::2012-07-26 19:10:12,924::task::853::TaskManager.Task::(_setError) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1643, in copyImage
    volType, volFormat, preallocate, postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 768, in copy
    raise se.CopyImageError("Destination volume %s error: %s" % (dstVolUUID, str(e)))
CopyImageError: low level Image copy failed: ('Destination volume dcaf11d7-e425-4848-b092-6243ae91b58b error: can\'t find/access task dir: ("_save: no such task dir \'/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0\'",)',)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,924::task::872::TaskManager.Task::(_run) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._run: 095bf145-27e5-44a5-9ba7-96d85143a5f0 () {} failed - stopping task
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,925::task::1199::TaskManager.Task::(stop) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::stopping in state running (force False)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,925::task::978::TaskManager.Task::(_decref) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::ref 1 aborting True
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,926::task::903::TaskManager.Task::(_runJobs) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::aborting: Task is aborted: 'low level Image copy failed' - code 261
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,926::task::978::TaskManager.Task::(_decref) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::ref 0 aborting True
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,926::task::913::TaskManager.Task::(_doAbort) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._doAbort: force False
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,927::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,927::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state running -> state aborting
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,928::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,929::task::537::TaskManager.Task::(__state_aborting) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_aborting: recover policy auto
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,929::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state aborting -> state racquiring
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,930::task::740::TaskManager.Task::(_save) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_save: orig /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0 temp /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0.temp
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,935::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state racquiring -> state recovering
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,936::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,937::task::789::TaskManager.Task::(_recover) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_recover
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,937::task::795::TaskManager.Task::(_recover) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::running recovery create image rollback: c6ae1c85-6812-404b-a2f5-6221ed204c10: Image->createImageRollback(/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/c6ae1c85-6812-404b-a2f5-6221ed204c10)
095bf145-27e5-44a5-9ba7-96d85143a5f0::INFO::2012-07-26 19:10:12,938::image::75::Storage.Image::(createImageRollback) createImageRollback: imageDir=/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/c6ae1c85-6812-404b-a2f5-6221ed204c10
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,938::task::795::TaskManager.Task::(_recover) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::running recovery None
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,939::task::776::TaskManager.Task::(_recoverDone) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Recover Done: state recovering
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,939::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state recovering -> state recovered
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,940::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)


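For context on the failure above: the "_save: orig <dir> temp <dir>.temp" lines in this log suggest that task.py persists task state by writing into a temporary directory and then swapping it into place, and that _save refuses to proceed when the original task directory is missing. A minimal sketch of that pattern in Python (simplified and illustrative, not vdsm's actual code; the backup/rename step in particular is an assumption):

import os


class TaskDirError(Exception):
    pass


def save_task(task_dir, fields):
    # Persist task state via a temp dir, mirroring the
    # "_save: orig <dir> temp <dir>.temp" lines in the log.
    if not os.path.isdir(task_dir):
        # The check that fires throughout this bug: once the task dir is
        # missing (or unreachable), every persist attempt aborts here, so
        # the task cannot record any further state transitions.
        raise TaskDirError("_save: no such task dir '%s'" % task_dir)

    temp_dir = task_dir + ".temp"
    os.makedirs(temp_dir, exist_ok=True)
    for name, value in fields.items():
        with open(os.path.join(temp_dir, name), "w") as f:
            f.write(value)

    # Swap the freshly written state into place so a crash mid-save never
    # leaves a half-written task dir behind (hypothetical cleanup step).
    backup = task_dir + ".backup"
    os.rename(task_dir, backup)
    os.rename(temp_dir, task_dir)

Note how every state transition in the log above (aborting -> racquiring -> recovering -> recovered) retries persist() and hits the same TaskDirError, which is why the task can never save its recovery progress.
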
Reproduction:

Case 1:

- create a VM
- start the VM
- create a snapshot (live)
- clone a VM from the snapshot
- create a template from the cloned VM

Case 2:

- create a VM
- start the VM
- create a snapshot (live)
- stop the VM
- create a template from the VM

3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:40,249::volume::827::Storage.Volume::(qemuConvert) (qemuConvert): COPY /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/597d8a7f-f968-49d6-a6d0-34a6931f9f7a/fd9d9ab9-dc4e-4e9c-99ca-4fd8c970d2e6 (qcow2) to /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/9be50d0c-4e39-445f-a4d5-aa8ec0f48859/2bb1495a-6985-4f2b-921a-e25e402a96d2 (qcow2) START
3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:40,250::__init__::1164::Storage.Misc.excCmd::(_log) '/bin/nice -n 19 /usr/bin/ionice -c 2 -n 7 /usr/bin/qemu-img convert -t none -f qcow2 /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/597d8a7f-f968-49d6-a6d0-34a6931f9f7a/fd9d9ab9-dc4e-4e9c-99ca-4fd8c970d2e6 -O qcow2 /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/9be50d0c-4e39-445f-a4d5-aa8ec0f48859/2bb1495a-6985-4f2b-921a-e25e402a96d2' (cwd None)
3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:40,257::task::740::TaskManager.Task::(_save) Task=`3b6ea46c-d161-40f5-a900-c45f22aa7089`::_save: orig /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/3b6ea46c-d161-40f5-a900-c45f22aa7089 temp /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/3b6ea46c-d161-40f5-a900-c45f22aa7089.temp
3b6ea46c-d161-40f5-a900-c45f22aa7089::ERROR::2012-07-26 19:23:40,269::image::797::Storage.Image::(copy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 782, in copy
    raise e
ActionStopped: Action was stopped: ()

3b6ea46c-d161-40f5-a900-c45f22aa7089::ERROR::2012-07-26 19:23:44,202::task::853::TaskManager.Task::(_setError) Task=`3b6ea46c-d161-40f5-a900-c45f22aa7089`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1643, in copyImage
    volType, volFormat, preallocate, postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 798, in copy
    raise se.CopyImageError("src image=%s, dst image=%s: msg=%s" % (srcImgUUID, dstImgUUID, str(e)))
CopyImageError: low level Image copy failed: ('src image=597d8a7f-f968-49d6-a6d0-34a6931f9f7a, dst image=9be50d0c-4e39-445f-a4d5-aa8ec0f48859: msg=Action was stopped: ()',)
3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:44,204::task::872::TaskManager.Task::(_run) Task=`3b6ea46c-d161-40f5-a900-c45f22aa7089`::Task._run: 3b6ea46c-d161-40f5-a900-c45f22aa7089 () {} failed - stopping task
Comment 1 Haim 2012-07-26 10:03:04 EDT
Created attachment 600514 [details]
vdsm logs
Comment 4 Ayal Baron 2012-08-01 16:57:06 EDT
Haim, not sure it's related, but you clearly have a storage problem:

19:10:12,896::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687025664: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687083008: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 0: Input/output error\n  WARNING: Error counts reached a limit of 3. Device /dev/mapper/1qe-export_import31343141 was disabled\n';
Comment 5 Ayal Baron 2012-08-01 17:15:53 EDT
This indeed looks like a storage issue.

The task file couldn't be saved to:
/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks

Haim, can you try to reproduce on another (clean) host?
If I'm right, then this is not a beta blocker (if it is a bug at all).

Thread-95144::DEBUG::2012-07-26 19:10:11,645::task::740::TaskManager.Task::(_save) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_save: orig /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0 temp /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0.temp
Thread-95144::DEBUG::2012-07-26 19:10:11,649::taskManager::66::TaskManager::(scheduleJob) scheduled job copyImage for task 095bf145-27e5-44a5-9ba7-96d85143a5f0 
Thread-95144::INFO::2012-07-26 19:10:11,649::logUtils::39::dispatcher::(wrapper) Run and protect: copyImage, Return response: None
Thread-95144::DEBUG::2012-07-26 19:10:11,650::task::1167::TaskManager.Task::(prepare) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Prepare: 1 jobs exist, move to acquiring
Thread-95144::DEBUG::2012-07-26 19:10:11,650::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state preparing -> state acquiring
Thread-95144::WARNING::2012-07-26 19:10:11,651::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
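A quick, generic way to test whether that tasks directory is actually usable from the host is to attempt a real write there. This is an illustrative snippet, not part of vdsm; a stale NFS mount or a disabled multipath device can pass stat() yet still fail on I/O:

import os
import tempfile

# Path taken from the traceback above.
tasks = ("/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d"
         "/mastersd/master/tasks")

print("isdir:", os.path.isdir(tasks))
try:
    # A real write is a more reliable probe than stat()/isdir().
    fd, probe = tempfile.mkstemp(dir=tasks)
    os.close(fd)
    os.unlink(probe)
    print("write probe: OK")
except OSError as e:
    print("write probe failed:", e)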
Comment 7 Haim 2012-08-14 06:47:23 EDT
Gadi managed to reproduce this issue with automated testing, using the following flow:

- create an iSCSI data domain
- attach and activate an ISO domain
- create a new VM
- start the new VM
- block the connection to the ISO domain until the domain turns inactive (see the sketch at the end of this comment)
- once the domain becomes inactive, sleep 90 seconds
- unblock the connection
- sleep 180 seconds
- destroy the VM (stop the VM if it is running)
- run the remove VM command (which produces a deleteImage)
- a new task is created: 9d2b6071-8c24-4669-a7a0-549c8b92aaf7
- the operation fails in vdsm (logs attached)
- yet the task dir is accessible:

[root@cougar04 ~]# tree /rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/
/rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/
├── 2a261182-3ac4-4ae2-ada3-4da1db522060
└── 9d2b6071-8c24-4669-a7a0-549c8b92aaf7

2 directories, 0 files
[root@cougar04 ~]# ls -l /rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/
total 8
drwxr-xr-x. 2 vdsm kvm 4096 Aug 14 12:52 2a261182-3ac4-4ae2-ada3-4da1db522060
drwxr-xr-x. 2 vdsm kvm 4096 Aug 14 12:16 9d2b6071-8c24-4669-a7a0-549c8b92aaf7

- deleting the VM succeeds after a vdsm restart


Thread-865::DEBUG::2012-08-14 12:16:51,080::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvremove --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%3514f0c5695800001|3514f0c5695800002|3514f0c5695800003|3514f0c5695800004|3514f0c5695800005|3600144f06572cd0000005028a5790407|3600144f06572cd0000005028d2990418|3600144f06572cd000000502908bd041d|3600144f06572cd000000502a11130432%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " -f --autobackup n 65efce35-6970-4287-9c08-07a88d5edf4b/f3ad7783-38b9-40e5-b4ac-fd2377c2c69e' (cwd None)
Thread-865::DEBUG::2012-08-14 12:16:51,416::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '  WARNING: This metadata update is NOT backed up\n'; <rc> = 0
Thread-865::DEBUG::2012-08-14 12:16:51,417::lvm::467::OperationMutex::(_invalidatevgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-865::DEBUG::2012-08-14 12:16:51,418::lvm::469::OperationMutex::(_invalidatevgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-865::ERROR::2012-08-14 12:16:51,419::task::853::TaskManager.Task::(_setError) Task=`9d2b6071-8c24-4669-a7a0-549c8b92aaf7`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1290, in deleteImage
    self._spmSchedule(spUUID, "deleteImage", lambda : True)
  File "/usr/share/vdsm/storage/hsm.py", line 671, in _spmSchedule
    self.taskMng.scheduleJob("spm", pool.tasksDir, vars.task, name, func, *args)
  File "/usr/share/vdsm/storage/taskManager.py", line 62, in scheduleJob
    task.setPersistence(store, cleanPolicy=TaskCleanType.manual)
  File "/usr/share/vdsm/storage/task.py", line 1099, in setPersistence
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/9d2b6071-8c24-4669-a7a0-549c8b92aaf7'",)
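For reference, the "block connection to ISO domain" step in automated flows like the one above is commonly implemented by dropping traffic to the server backing the ISO domain with iptables (run as root). A minimal sketch of that step; the address is a placeholder and the fixed sleeps stand in for the real wait conditions, neither is taken from this bug:

import subprocess
import time

ISO_DOMAIN_ADDR = "10.0.0.42"  # placeholder: server backing the ISO domain


def set_blocked(addr, blocked):
    # Insert (or delete) an iptables rule dropping all traffic to addr.
    action = "-I" if blocked else "-D"
    subprocess.check_call(
        ["iptables", action, "OUTPUT", "-d", addr, "-j", "DROP"])


set_blocked(ISO_DOMAIN_ADDR, True)   # block until the domain turns inactive
time.sleep(90)                       # "once domain becomes inactive sleep 90"
set_blocked(ISO_DOMAIN_ADDR, False)  # unblock connection
time.sleep(180)                      # "sleep 180"
# then stop the VM, remove it (deleteImage), and watch the async task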
Comment 9 Gadi Ickowicz 2012-08-14 06:57:54 EDT
Created attachment 604262 [details]
vdsm and engine logs
Comment 14 Ayal Baron 2012-08-17 08:14:43 EDT
Patches merged:
Iff1c83a72b642141afad6142db49f98d39dc141f
Ibbc4a76d3aa9547fa8b4fc284b8bd43cd37a0dba
I711d0d4f627a1f42886a65305f93bcd97deccd3f
Comment 15 Gadi Ickowicz 2012-08-22 04:17:10 EDT
Created attachment 606165 [details]
vdsm + engine logs

The scenario I used before now succeeds on vdsm-4.9.6-29.0.el6_3.x86_64.

logs attached
Comment 16 Barak 2012-09-10 09:20:48 EDT
Per comment #14,

Moving to ON_QA as this scenario needs to be verified.
Comment 17 Gadi Ickowicz 2012-10-14 10:21:54 EDT
Verified as per comment #15, and again on vdsm-4.9.6-37.0.el6_3.x86_64.
