
Bug 672493

Summary: [vdsm] When migrating multiple VMs unsuccessfully, "call destroy" fails to release the image lock on the migration destination.
Product: Red Hat Enterprise Linux 6
Component: vdsm
Version: 6.1
Hardware: x86_64
OS: Linux
Severity: high
Priority: low
Status: CLOSED CURRENTRELEASE
Target Milestone: rc
Target Release: 6.1
Reporter: David Naori <dnaori>
Assignee: Igor Lvovsky <ilvovsky>
QA Contact: yeylon <yeylon>
CC: abaron, bazulay, dkenigsb, dnaori, hateya, iheim, lpeer, mgoldboi, srevivo
Fixed In Version: vdsm-4.9-47.el6
Doc Type: Bug Fix
Last Closed: 2011-08-19 15:23:42 UTC
Attachments:
vdsm-log (no flags)

Description David Naori 2011-01-25 10:45:05 UTC
Description of problem:

When migrating 30 VMs simultaneously, some hang in "Migrating From" status on the source host and come back up after approximately 10 minutes. On the destination, "call destroy" fails to release the image lock.

Vdsm log:

Thread-4737::DEBUG::2011-01-24 16:14:18,079::clientIF::40::vds::(wrapper) [10.35.116.10]::call destroy with ('3e2681e7-dc43-409b-b5df-709f8f3d19c0',) {}
Thread-4737::DEBUG::2011-01-24 16:14:18,080::libvirtvm::931::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) destroy Called
Thread-4737::WARNING::2011-01-24 16:14:18,080::libvirtvm::938::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) <<<<< Destroy are welcome >>>>>
Thread-4737::WARNING::2011-01-24 16:14:18,081::libvirtvm::953::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) Traceback (most recent call last):
  File "/usr/share/vdsm/libvirtvm.py", line 947, in destroy
    self._dom.destroy()
  File "/usr/share/vdsm/libvirtvm.py", line 155, in f
    raise e
libvirtError: Domain not found: no domain with matching uuid '3e2681e7-dc43-409b-b5df-709f8f3d19c0'

Thread-4737::DEBUG::2011-01-24 16:14:18,081::utils::570::Storage.Misc.excCmd::(execCmd) '/usr/bin/sudo -n /sbin/service ksmtuned retune' (cwd None)
Thread-4737::DEBUG::2011-01-24 16:14:18,113::utils::570::Storage.Misc.excCmd::(execCmd) SUCCESS: <err> = ''; <rc> = 0
Thread-4737::DEBUG::2011-01-24 16:14:18,114::libvirtvm::966::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) Total desktops after destroy of 3e2681e7-dc43-409b-b5df-709f8f3d19c0 is 6
Thread-4737::INFO::2011-01-24 16:14:18,114::dispatcher::95::Storage.Dispatcher.Protect::(run) Run and protect: teardownVolume, args: ( sdUUID=5264fc3d-c4cb-4e86-be4b-144a905a5fdc spUUID=4e0cb55f-c43e-4e3c-aa9e-13334e4469b1 imgUUID=b48d068d-3d31-4889-9bd1-9ba4e7d9a04b volUUID=fec97928-2f0e-4873-9b72-30fa228fffd0 rw=True)
Thread-4737::DEBUG::2011-01-24 16:14:18,115::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: moving from state init -> state preparing
Thread-4737::INFO::2011-01-24 16:14:18,115::sp::1349::Storage.StoragePool::(getDomains) Get storage pool domains: {'5264fc3d-c4cb-4e86-be4b-144a905a5fdc': 'Active', '231f7bb3-0b77-481c-8dc8-9b32271e09fe': 'Active'}
Thread-4737::INFO::2011-01-24 16:14:18,115::fileSD::184::Storage.StorageDomain::(validate) sdUUID=5264fc3d-c4cb-4e86-be4b-144a905a5fdc
Thread-4737::DEBUG::2011-01-24 16:14:18,117::metadata::88::Storage.Metadata::(__validate) metadata.__validate() __cksum(md) = 078ef9aab0d6ddbf7795cebecb390baccc487466
Thread-4737::DEBUG::2011-01-24 16:14:18,118::resourceManager::154::ResourceManager.Request::(__init__) ResName=`Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc`ReqID=`2ef10eef-f410-490b-90e0-a997a5fe6dfe`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '1714' at 'public_teardownVolume'
Thread-4737::DEBUG::2011-01-24 16:14:18,118::resourceManager::467::ResourceManager::(registerResource) Trying to register resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' for lock type 'shared'
Thread-4737::DEBUG::2011-01-24 16:14:18,118::resourceManager::508::ResourceManager::(registerResource) Resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' is free. Now locking as 'shared' (1 active user)
Thread-4737::DEBUG::2011-01-24 16:14:18,119::resourceManager::191::ResourceManager.Request::(grant) ResName=`Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc`ReqID=`2ef10eef-f410-490b-90e0-a997a5fe6dfe`::Granted request
Thread-4737::DEBUG::2011-01-24 16:14:18,119::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: _resourcesAcquired: Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc (shared)
Thread-4737::DEBUG::2011-01-24 16:14:18,120::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: ref 1 aborting False
Thread-4737::DEBUG::2011-01-24 16:14:18,120::fileVolume::526::Storage.Volume::(validateVolumePath) validate path for fec97928-2f0e-4873-9b72-30fa228fffd0
Thread-4737::ERROR::2011-01-24 16:14:18,121::task::854::TaskManager.Task::(_setError) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 862, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1715, in public_teardownVolume
    SDF.produce(sdUUID=sdUUID).produceVolume(imgUUID=imgUUID, volUUID=volUUID).teardown()
  File "/usr/share/vdsm/storage/fileSD.py", line 138, in produceVolume
    return fileVolume.FileVolume(repoPath, self.sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/fileVolume.py", line 32, in __init__
    volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/volume.py", line 113, in __init__
    self.validate()
  File "/usr/share/vdsm/storage/volume.py", line 120, in validate
    self.validateVolumePath()
  File "/usr/share/vdsm/storage/fileVolume.py", line 537, in validateVolumePath
    self.validateMetaVolumePath()
  File "/usr/share/vdsm/storage/fileVolume.py", line 546, in validateMetaVolumePath
    raise se.VolumeDoesNotExist(self.volUUID)
VolumeDoesNotExist: Volume does not exist: ('fec97928-2f0e-4873-9b72-30fa228fffd0',)
Thread-4737::DEBUG::2011-01-24 16:14:18,123::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: Task._run: 13f5d9d2-6eca-40a4-a798-680db9a5420c ('5264fc3d-c4cb-4e86-be4b-144a905a5fdc', '4e0cb55f-c43e-4e3c-aa9e-13334e4469b1', 'b48d068d-3d31-4889-9bd1-9ba4e7d9a04b', 'fec97928-2f0e-4873-9b72-30fa228fffd0', True) {} failed - stopping task
Thread-4737::DEBUG::2011-01-24 16:14:18,123::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: stopping in state preparing (force False)
Thread-4737::DEBUG::2011-01-24 16:14:18,124::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: ref 1 aborting True
Thread-4737::INFO::2011-01-24 16:14:18,124::task::1166::TaskManager.Task::(prepare) aborting: Task is aborted: 'Volume does not exist' - code 201
Thread-4737::DEBUG::2011-01-24 16:14:18,124::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: Prepare: aborted: Volume does not exist
Thread-4737::DEBUG::2011-01-24 16:14:18,125::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: ref 0 aborting True
Thread-4737::DEBUG::2011-01-24 16:14:18,125::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: Task._doAbort: force False
Thread-4737::DEBUG::2011-01-24 16:14:18,125::resourceManager::821::irs::(cancelAll) Owner.cancelAll requests {}
Thread-4737::DEBUG::2011-01-24 16:14:18,125::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: moving from state preparing -> state aborting
Thread-4737::DEBUG::2011-01-24 16:14:18,126::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: _aborting: recover policy none
Thread-4737::DEBUG::2011-01-24 16:14:18,126::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: moving from state aborting -> state failed
Thread-4737::DEBUG::2011-01-24 16:14:18,126::resourceManager::786::irs::(releaseAll) Owner.releaseAll requests {} resources {'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc': < ResourceRef 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc', isValid: 'True' obj: 'None'>}
Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::821::irs::(cancelAll) Owner.cancelAll requests {}
Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::517::ResourceManager::(releaseResource) Trying to release resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc'
Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::532::ResourceManager::(releaseResource) Released resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' (0 active users)
Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::537::ResourceManager::(releaseResource) Resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' is free, finding out if anyone is waiting for it.
Thread-4737::DEBUG::2011-01-24 16:14:18,128::resourceManager::544::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc', Clearing records.
Thread-4737::ERROR::2011-01-24 16:14:18,128::dispatcher::104::Storage.Dispatcher.Protect::(run) {'status': {'message': "Volume does not exist: ('fec97928-2f0e-4873-9b72-30fa228fffd0',)", 'code': 201}, 'args': [('fec97928-2f0e-4873-9b72-30fa228fffd0',)]}
Thread-4737::DEBUG::2011-01-24 16:14:18,129::clientIF::45::vds::(wrapper) return destroy with {'status': {'message': 'Machine destroyed', 'code': 0}}
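The log above shows the failure pair: libvirt reports the domain already gone ("Domain not found"), and the subsequent teardownVolume aborts with VolumeDoesNotExist, leaving the image locked on the destination even though "destroy" itself returns success. The following is a minimal, hypothetical sketch of the defensive pattern such a fix needs; none of these class or function names are vdsm's actual code, and the real fix in vdsm-4.9-47.el6 may differ. The point is simply that a missing volume should be treated as benign during teardown, and the lock must be released unconditionally:

```python
# Hypothetical sketch (not vdsm's actual code): a teardown path that
# releases the image lock even when the volume has already vanished,
# e.g. because a failed incoming migration never completed volume setup.

class VolumeDoesNotExist(Exception):
    """Stand-in for vdsm's storage VolumeDoesNotExist (code 201)."""

class ImageLock:
    """Stand-in for the per-image lock held on the destination."""
    def __init__(self):
        self.held = False
    def acquire(self):
        self.held = True
    def release(self):
        self.held = False

def teardown_volume(lock, produce_volume):
    """Tear down a volume, tolerating a volume that no longer exists.

    The key points: VolumeDoesNotExist is swallowed rather than
    aborting the task, and the lock is released in `finally`, so a
    failure while producing the volume can no longer leave the image
    locked on the migration destination.
    """
    lock.acquire()
    try:
        try:
            vol = produce_volume()  # may raise VolumeDoesNotExist
            vol.teardown()
        except VolumeDoesNotExist:
            # Volume already gone; there is nothing left to tear down,
            # so treat this as success instead of failing the task.
            pass
    finally:
        lock.release()
```

With this shape, the VolumeDoesNotExist seen in the traceback above would no longer abort the protected task with the lock still held.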


Version-Release number of selected component (if applicable):
- vdsm-4.9-43.1
- libvirt-0.8.7-3
- RHEV-M 2.3
- RHEL 6.1 based hosts


Steps to reproduce:

1) Create 30 VMs from pool.
2) Migrate all of them together from one host to another.

Comment 1 Moran Goldboim 2011-01-25 11:06:41 UTC
David, please attach the full logs.

Comment 2 David Naori 2011-01-25 14:13:29 UTC
Created attachment 475173 [details]
vdsm-log

Comment 3 David Naori 2011-02-03 14:48:43 UTC
Verified