Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 843511

Summary: 3.1 - beta2 - failure in async task (delete vm disk - deleteImage) after connection to ISO domain was restored
Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.3
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: urgent
Target Milestone: beta
Target Release: ---
Whiteboard: infra
Keywords: Regression, Reopened, TestBlocker
Reporter: Haim <hateya>
Assignee: Saggi Mizrahi <smizrahi>
QA Contact: Gadi Ickowicz <gickowic>
CC: abaron, bazulay, cpelland, ewarszaw, fsimonce, gickowic, iheim, ilvovsky, lpeer, nlevinki, smizrahi, yeylon, ykaul
Fixed In Version: vdsm-4.9.6-29.0.el6_3.x86_64
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-12-05 07:35:23 UTC
Attachments:
- vdsm logs (flags: none)
- vdsm and engine logs (flags: none)
- vdsm + engine logs (flags: none)

Description Haim 2012-07-26 14:01:17 UTC
Description of problem:

I have a case where I try to create a template from a VM, and the task fails with the following error:

095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,641::lvm::393::OperationMutex::(_reloadlvs) Operation 'lvm reload operation' got the operation mutex
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,642::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvs --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%1qe-export_import31343141|3514f0c5d6620000c|3514f0c5d6620026f|3514f0c5d66200270|3514f0c5d66200271|3514f0c5d66200272|3514f0c5d66200273%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --noheadings --units b --nosuffix --separator | -o uuid,name,vg_name,attr,size,seg_start_pe,devices,tags a3ceac0d-a435-4306-bd61-2465ef9525f9' (cwd None)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,896::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687025664: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687083008: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 0: Input/output error\n  WARNING: Error counts reached a limit of 3. Device /dev/mapper/1qe-export_import31343141 was disabled\n'; <rc> = 0
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,913::lvm::422::OperationMutex::(_reloadlvs) Operation 'lvm reload operation' released the operation mutex
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,913::lvm::573::Storage.LVM::(getLv) lv: dcaf11d7-e425-4848-b092-6243ae91b58b not found in lvs vg: a3ceac0d-a435-4306-bd61-2465ef9525f9 response
095bf145-27e5-44a5-9ba7-96d85143a5f0::INFO::2012-07-26 19:10:12,914::image::94::Storage.Image::(create) Create placeholder /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/c6ae1c85-6812-404b-a2f5-6221ed204c10 for image's volumes
095bf145-27e5-44a5-9ba7-96d85143a5f0::ERROR::2012-07-26 19:10:12,915::image::767::Storage.Image::(copy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 751, in copy
    srcImgUUID=volume.BLANK_UUID, srcVolUUID=volume.BLANK_UUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 518, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 176, in create
    imageDir = image.Image(repoPath).create(sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 97, in create
    [imageDir]))
  File "/usr/share/vdsm/storage/task.py", line 1049, in pushRecovery
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
095bf145-27e5-44a5-9ba7-96d85143a5f0::INFO::2012-07-26 19:10:12,916::blockVolume::423::Storage.Volume::(teardown) Tearing down volume a3ceac0d-a435-4306-bd61-2465ef9525f9/ffe9e27e-dab6-4d78-af60-f5f9247110b8 justme False
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,917::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_lvmActivationNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8'
095bf145-27e5-44a5-9ba7-96d85143a5f0::ERROR::2012-07-26 19:10:12,917::image::662::Storage.Image::(__cleanupCopy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 658, in __cleanupCopy
    srcVol.teardown(sdUUID=srcVol.sdUUID, volUUID=srcVol.volUUID)
  File "/usr/share/vdsm/storage/blockVolume.py", line 425, in teardown
    rmanager.releaseResource(lvmActivationNamespace, volUUID)
  File "/usr/share/vdsm/storage/resourceManager.py", line 550, in releaseResource
    raise ValueError("Resource '%s.%s' is not currently registered" % (namespace, name))
ValueError: Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_lvmActivationNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8' is not currently registered
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,918::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10'
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,919::resourceManager::553::ResourceManager::(releaseResource) Released resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10' (0 active users)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,919::resourceManager::558::ResourceManager::(releaseResource) Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10' is free, finding out if anyone is waiting for it.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,919::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.c6ae1c85-6812-404b-a2f5-6221ed204c10', Clearing records.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,920::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134'
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,921::resourceManager::553::ResourceManager::(releaseResource) Released resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134' (0 active users)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,921::resourceManager::558::ResourceManager::(releaseResource) Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134' is free, finding out if anyone is waiting for it.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,921::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8'
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,922::resourceManager::553::ResourceManager::(releaseResource) Released resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8' (0 active users)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,922::resourceManager::558::ResourceManager::(releaseResource) Resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8' is free, finding out if anyone is waiting for it.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,923::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_volumeNS.ffe9e27e-dab6-4d78-af60-f5f9247110b8', Clearing records.
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,923::resourceManager::565::ResourceManager::(releaseResource) No one is waiting for resource 'a3ceac0d-a435-4306-bd61-2465ef9525f9_imageNS.504a4662-08be-4d1b-9aef-911d4ac7f134', Clearing records.
095bf145-27e5-44a5-9ba7-96d85143a5f0::ERROR::2012-07-26 19:10:12,924::task::853::TaskManager.Task::(_setError) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1643, in copyImage
    volType, volFormat, preallocate, postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 768, in copy
    raise se.CopyImageError("Destination volume %s error: %s" % (dstVolUUID, str(e)))
CopyImageError: low level Image copy failed: ('Destination volume dcaf11d7-e425-4848-b092-6243ae91b58b error: can\'t find/access task dir: ("_save: no such task dir \'/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0\'",)',)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,924::task::872::TaskManager.Task::(_run) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._run: 095bf145-27e5-44a5-9ba7-96d85143a5f0 () {} failed - stopping task
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,925::task::1199::TaskManager.Task::(stop) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::stopping in state running (force False)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,925::task::978::TaskManager.Task::(_decref) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::ref 1 aborting True
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,926::task::903::TaskManager.Task::(_runJobs) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::aborting: Task is aborted: 'low level Image copy failed' - code 261
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,926::task::978::TaskManager.Task::(_decref) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::ref 0 aborting True
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,926::task::913::TaskManager.Task::(_doAbort) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._doAbort: force False
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,927::resourceManager::844::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,927::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state running -> state aborting
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,928::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,929::task::537::TaskManager.Task::(__state_aborting) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_aborting: recover policy auto
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,929::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state aborting -> state racquiring
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,930::task::740::TaskManager.Task::(_save) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_save: orig /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0 temp /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0.temp
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,935::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state racquiring -> state recovering
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,936::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,937::task::789::TaskManager.Task::(_recover) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_recover
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,937::task::795::TaskManager.Task::(_recover) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::running recovery create image rollback: c6ae1c85-6812-404b-a2f5-6221ed204c10: Image->createImageRollback(/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/c6ae1c85-6812-404b-a2f5-6221ed204c10)
095bf145-27e5-44a5-9ba7-96d85143a5f0::INFO::2012-07-26 19:10:12,938::image::75::Storage.Image::(createImageRollback) createImageRollback: imageDir=/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/c6ae1c85-6812-404b-a2f5-6221ed204c10
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,938::task::795::TaskManager.Task::(_recover) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::running recovery None
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,939::task::776::TaskManager.Task::(_recoverDone) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Recover Done: state recovering
095bf145-27e5-44a5-9ba7-96d85143a5f0::DEBUG::2012-07-26 19:10:12,939::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state recovering -> state recovered
095bf145-27e5-44a5-9ba7-96d85143a5f0::WARNING::2012-07-26 19:10:12,940::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)


reproduction:

case 1:

- create vm
- start vm 
- create snapshot (live)
- clone vm from snapshot
- create template from cloned vm

case 2:

- create vm
- start vm
- create snapshot (live)
- stop vm
- create template from vm

3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:40,249::volume::827::Storage.Volume::(qemuConvert) (qemuConvert): COPY /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd6
1-2465ef9525f9/images/597d8a7f-f968-49d6-a6d0-34a6931f9f7a/fd9d9ab9-dc4e-4e9c-99ca-4fd8c970d2e6 (qcow2) to /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/9be5
0d0c-4e39-445f-a4d5-aa8ec0f48859/2bb1495a-6985-4f2b-921a-e25e402a96d2 (qcow2) START
3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:40,250::__init__::1164::Storage.Misc.excCmd::(_log) '/bin/nice -n 19 /usr/bin/ionice -c 2 -n 7 /usr/bin/qemu-img convert -t none -f qcow2 /rhev/data
-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/597d8a7f-f968-49d6-a6d0-34a6931f9f7a/fd9d9ab9-dc4e-4e9c-99ca-4fd8c970d2e6 -O qcow2 /rhev/data-center/4c4eef8f-5a80-4ca0-b
a5a-0f810c89393d/a3ceac0d-a435-4306-bd61-2465ef9525f9/images/9be50d0c-4e39-445f-a4d5-aa8ec0f48859/2bb1495a-6985-4f2b-921a-e25e402a96d2' (cwd None)
3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:40,257::task::740::TaskManager.Task::(_save) Task=`3b6ea46c-d161-40f5-a900-c45f22aa7089`::_save: orig /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f81
0c89393d/mastersd/master/tasks/3b6ea46c-d161-40f5-a900-c45f22aa7089 temp /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/3b6ea46c-d161-40f5-a900-c45f22aa7089.temp
3b6ea46c-d161-40f5-a900-c45f22aa7089::ERROR::2012-07-26 19:23:40,269::image::797::Storage.Image::(copy) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 782, in copy
    raise e
ActionStopped: Action was stopped: ()

3b6ea46c-d161-40f5-a900-c45f22aa7089::ERROR::2012-07-26 19:23:44,202::task::853::TaskManager.Task::(_setError) Task=`3b6ea46c-d161-40f5-a900-c45f22aa7089`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 320, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 63, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1643, in copyImage
    volType, volFormat, preallocate, postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 798, in copy
    raise se.CopyImageError("src image=%s, dst image=%s: msg=%s" % (srcImgUUID, dstImgUUID, str(e)))
CopyImageError: low level Image copy failed: ('src image=597d8a7f-f968-49d6-a6d0-34a6931f9f7a, dst image=9be50d0c-4e39-445f-a4d5-aa8ec0f48859: msg=Action was stopped: ()',)
3b6ea46c-d161-40f5-a900-c45f22aa7089::DEBUG::2012-07-26 19:23:44,204::task::872::TaskManager.Task::(_run) Task=`3b6ea46c-d161-40f5-a900-c45f22aa7089`::Task._run: 3b6ea46c-d161-40f5-a900-c45f22aa7089 () {} failed - stopping task

Comment 1 Haim 2012-07-26 14:03:04 UTC
Created attachment 600514 [details]
vdsm logs

Comment 4 Ayal Baron 2012-08-01 20:57:06 UTC
Haim, not sure it's related, but you clearly have a storage problem:

19:10:12,896::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687025664: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 53687083008: Input/output error\n  /dev/mapper/1qe-export_import31343141: read failed after 0 of 4096 at 0: Input/output error\n  WARNING: Error counts reached a limit of 3. Device /dev/mapper/1qe-export_import31343141 was disabled\n';

Comment 5 Ayal Baron 2012-08-01 21:15:53 UTC
This indeed looks like a storage issue:

Task file couldn't be saved on:
/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks

Haim, can you try to reproduce on another (clean) host?
If I'm right, then this is not a beta blocker (if it's a bug at all).

Thread-95144::DEBUG::2012-07-26 19:10:11,645::task::740::TaskManager.Task::(_save) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::_save: orig /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0 temp /rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0.temp
Thread-95144::DEBUG::2012-07-26 19:10:11,649::taskManager::66::TaskManager::(scheduleJob) scheduled job copyImage for task 095bf145-27e5-44a5-9ba7-96d85143a5f0 
Thread-95144::INFO::2012-07-26 19:10:11,649::logUtils::39::dispatcher::(wrapper) Run and protect: copyImage, Return response: None
Thread-95144::DEBUG::2012-07-26 19:10:11,650::task::1167::TaskManager.Task::(prepare) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Prepare: 1 jobs exist, move to acquiring
Thread-95144::DEBUG::2012-07-26 19:10:11,650::task::588::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::moving from state preparing -> state acquiring
Thread-95144::WARNING::2012-07-26 19:10:11,651::task::597::TaskManager.Task::(_updateState) Task=`095bf145-27e5-44a5-9ba7-96d85143a5f0`::Task._updateState: failed persisting task 095bf145-27e5-44a5-9ba7-96d85143a5f0
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 595, in _updateState
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/4c4eef8f-5a80-4ca0-ba5a-0f810c89393d/mastersd/master/tasks/095bf145-27e5-44a5-9ba7-96d85143a5f0'",)

Comment 7 Haim 2012-08-14 10:47:23 UTC
Gadi managed to reproduce this issue with automated testing, using the following flow:

- create an iSCSI data domain
- attach and activate an ISO domain
- create a new VM
- start the new VM
- block the connection to the ISO domain until the domain turns inactive
- once the domain becomes inactive, sleep 90 seconds
- unblock the connection
- sleep 180 seconds
- destroy the VM (stop the VM if running)
- remove the VM (which issues a deleteImage)
- a new task is created: 9d2b6071-8c24-4669-a7a0-549c8b92aaf7
- the operation fails in vdsm (logs attached)
- the task dir is accessible!

[root@cougar04 ~]# tree /rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/
/rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/
├── 2a261182-3ac4-4ae2-ada3-4da1db522060
└── 9d2b6071-8c24-4669-a7a0-549c8b92aaf7

2 directories, 0 files
[root@cougar04 ~]# ls -l /rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/
total 8
drwxr-xr-x. 2 vdsm kvm 4096 Aug 14 12:52 2a261182-3ac4-4ae2-ada3-4da1db522060
drwxr-xr-x. 2 vdsm kvm 4096 Aug 14 12:16 9d2b6071-8c24-4669-a7a0-549c8b92aaf7

- deleting the VM succeeds after a vdsm restart


Thread-865::DEBUG::2012-08-14 12:16:51,080::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvremove --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 
write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%3514f0c5695800001|3514f0c5695800002|3514f0c5695800003|3514f0c5695800004|3514f0c5695800005|3600144f06572cd0000005028a5790407|3600144f06572cd0000005028
d2990418|3600144f06572cd000000502908bd041d|3600144f06572cd000000502a11130432%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0
 } " -f --autobackup n 65efce35-6970-4287-9c08-07a88d5edf4b/f3ad7783-38b9-40e5-b4ac-fd2377c2c69e' (cwd None)
Thread-865::DEBUG::2012-08-14 12:16:51,416::__init__::1164::Storage.Misc.excCmd::(_log) SUCCESS: <err> = '  WARNING: This metadata update is NOT backed up\n'; <rc> = 0
Thread-865::DEBUG::2012-08-14 12:16:51,417::lvm::467::OperationMutex::(_invalidatevgs) Operation 'lvm invalidate operation' got the operation mutex
Thread-865::DEBUG::2012-08-14 12:16:51,418::lvm::469::OperationMutex::(_invalidatevgs) Operation 'lvm invalidate operation' released the operation mutex
Thread-865::ERROR::2012-08-14 12:16:51,419::task::853::TaskManager.Task::(_setError) Task=`9d2b6071-8c24-4669-a7a0-549c8b92aaf7`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1290, in deleteImage
    self._spmSchedule(spUUID, "deleteImage", lambda : True)
  File "/usr/share/vdsm/storage/hsm.py", line 671, in _spmSchedule
    self.taskMng.scheduleJob("spm", pool.tasksDir, vars.task, name, func, *args)
  File "/usr/share/vdsm/storage/taskManager.py", line 62, in scheduleJob
    task.setPersistence(store, cleanPolicy=TaskCleanType.manual)
  File "/usr/share/vdsm/storage/task.py", line 1099, in setPersistence
    self.persist()
  File "/usr/share/vdsm/storage/task.py", line 1124, in persist
    self._save(self.store)
  File "/usr/share/vdsm/storage/task.py", line 738, in _save
    raise se.TaskDirError("_save: no such task dir '%s'" % origTaskDir)
TaskDirError: can't find/access task dir: ("_save: no such task dir '/rhev/data-center/480b8850-6eab-489d-9184-4ed6df866e1b/mastersd/master/tasks/9d2b6071-8c24-4669-a7a0-549c8b92aaf7'",)

Comment 9 Gadi Ickowicz 2012-08-14 10:57:54 UTC
Created attachment 604262 [details]
vdsm and engine logs

Comment 14 Ayal Baron 2012-08-17 12:14:43 UTC
Patches merged:
Iff1c83a72b642141afad6142db49f98d39dc141f
Ibbc4a76d3aa9547fa8b4fc284b8bd43cd37a0dba
I711d0d4f627a1f42886a65305f93bcd97deccd3f

Comment 15 Gadi Ickowicz 2012-08-22 08:17:10 UTC
Created attachment 606165 [details]
vdsm + engine logs

The scenario I used before now succeeds on vdsm-4.9.6-29.0.el6_3.x86_64.

logs attached

Comment 16 Barak 2012-09-10 13:20:48 UTC
Per comment #14,

Moving to ON_QA as this scenario needs to be verified.

Comment 17 Gadi Ickowicz 2012-10-14 14:21:54 UTC
Verified - as per comment #15, and again on:

vdsm-4.9.6-37.0.el6_3.x86_64.