| Summary: | [vdsm] When migrating multiple VMs unsuccessfully, "call destroy" failed to release the image lock on the destenation migration. | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | David Naori <dnaori> | ||||
| Component: | vdsm | Assignee: | Igor Lvovsky <ilvovsky> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | yeylon <yeylon> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 6.1 | CC: | abaron, bazulay, dkenigsb, dnaori, hateya, iheim, lpeer, mgoldboi, srevivo | ||||
| Target Milestone: | rc | ||||||
| Target Release: | 6.1 | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | vdsm-4.9-47.el6 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2011-08-19 15:23:42 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
David please attach full logs Created attachment 475173 [details]
vdsm-log
Verified |
Description of problem: When migrating 30 VMs simultaneously, some are hanging in "migrating from" status in the source and getting back up after approximately 10 minutes. in the destenation "call destroy" failed to release the image lock. Vdsm log: Thread-4737::DEBUG::2011-01-24 16:14:18,079::clientIF::40::vds::(wrapper) [10.35.116.10]::call destroy with ('3e2681e7-dc43-409b-b5df-709f8f3d19c0',) {} Thread-4737::DEBUG::2011-01-24 16:14:18,080::libvirtvm::931::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) destroy Called Thread-4737::WARNING::2011-01-24 16:14:18,080::libvirtvm::938::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) <<<<< Destroy are welcome >>>>> Thread-4737::WARNING::2011-01-24 16:14:18,081::libvirtvm::953::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) Traceback (most recent call last): File "/usr/share/vdsm/libvirtvm.py", line 947, in destroy self._dom.destroy() File "/usr/share/vdsm/libvirtvm.py", line 155, in f raise e libvirtError: Domain not found: no domain with matching uuid '3e2681e7-dc43-409b-b5df-709f8f3d19c0' Thread-4737::DEBUG::2011-01-24 16:14:18,081::utils::570::Storage.Misc.excCmd::(execCmd) '/usr/bin/sudo -n /sbin/service ksmtuned retune' (cwd None) Thread-4737::DEBUG::2011-01-24 16:14:18,113::utils::570::Storage.Misc.excCmd::(execCmd) SUCCESS: <err> = ''; <rc> = 0 Thread-4737::DEBUG::2011-01-24 16:14:18,114::libvirtvm::966::vds.vmlog.3e2681e7-dc43-409b-b5df-709f8f3d19c0::(destroy) Total desktops after destroy of 3e2681e7-dc43-409b-b5df-709f8f3d19c0 is 6 Thread-4737::INFO::2011-01-24 16:14:18,114::dispatcher::95::Storage.Dispatcher.Protect::(run) Run and protect: teardownVolume, args: ( sdUUID=5264fc3d-c4cb-4e86-be4b-144a905a5fdc spUUID=4e0cb55f-c43e-4e3c-aa9e-13334e4469b1 imgUUID=b48d068d-3d31-4889-9bd1-9ba4e7d9a04b volUUID=fec97928-2f0e-4873-9b72-30fa228fffd0 rw=True) Thread-4737::DEBUG::2011-01-24 16:14:18,115::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: moving from state init -> state preparing Thread-4737::INFO::2011-01-24 16:14:18,115::sp::1349::Storage.StoragePool::(getDomains) Get storage pool domains: {'5264fc3d-c4cb-4e86-be4b-144a905a5fdc': 'Active', '231f7bb3-0b77-481c-8dc8-9b32271e09fe': 'Active'} Thread-4737::INFO::2011-01-24 16:14:18,115::fileSD::184::Storage.StorageDomain::(validate) sdUUID=5264fc3d-c4cb-4e86-be4b-144a905a5fdc Thread-4737::DEBUG::2011-01-24 16:14:18,117::metadata::88::Storage.Metadata::(__validate) metadata.__validate() __cksum(md) = 078ef9aab0d6ddbf7795cebecb390baccc487466 Thread-4737::DEBUG::2011-01-24 16:14:18,118::resourceManager::154::ResourceManager.Request::(__init__) ResName=`Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc`ReqID=`2ef10eef-f410-490b-90e0-a997a5fe6dfe`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '1714' at 'public_teardownVolume' Thread-4737::DEBUG::2011-01-24 16:14:18,118::resourceManager::467::ResourceManager::(registerResource) Trying to register resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' for lock type 'shared' Thread-4737::DEBUG::2011-01-24 16:14:18,118::resourceManager::508::ResourceManager::(registerResource) Resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' is free. Now locking as 'shared' (1 active user) Thread-4737::DEBUG::2011-01-24 16:14:18,119::resourceManager::191::ResourceManager.Request::(grant) ResName=`Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc`ReqID=`2ef10eef-f410-490b-90e0-a997a5fe6dfe`::Granted request Thread-4737::DEBUG::2011-01-24 16:14:18,119::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: _resourcesAcquired: Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc (shared) Thread-4737::DEBUG::2011-01-24 16:14:18,120::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: ref 1 aborting False Thread-4737::DEBUG::2011-01-24 16:14:18,120::fileVolume::526::Storage.Volume::(validateVolumePath) validate path for fec97928-2f0e-4873-9b72-30fa228fffd0 Thread-4737::ERROR::2011-01-24 16:14:18,121::task::854::TaskManager.Task::(_setError) Unexpected error Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 862, in _run return fn(*args, **kargs) File "/usr/share/vdsm/storage/hsm.py", line 1715, in public_teardownVolume SDF.produce(sdUUID=sdUUID).produceVolume(imgUUID=imgUUID, volUUID=volUUID).teardown() File "/usr/share/vdsm/storage/fileSD.py", line 138, in produceVolume return fileVolume.FileVolume(repoPath, self.sdUUID, imgUUID, volUUID) File "/usr/share/vdsm/storage/fileVolume.py", line 32, in __init__ volume.Volume.__init__(self, repoPath, sdUUID, imgUUID, volUUID) File "/usr/share/vdsm/storage/volume.py", line 113, in __init__ self.validate() File "/usr/share/vdsm/storage/volume.py", line 120, in validate self.validateVolumePath() File "/usr/share/vdsm/storage/fileVolume.py", line 537, in validateVolumePath self.validateMetaVolumePath() File "/usr/share/vdsm/storage/fileVolume.py", line 546, in validateMetaVolumePath raise se.VolumeDoesNotExist(self.volUUID) VolumeDoesNotExist: Volume does not exist: ('fec97928-2f0e-4873-9b72-30fa228fffd0',) Thread-4737::DEBUG::2011-01-24 16:14:18,123::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: Task._run: 13f5d9d2-6eca-40a4-a798-680db9a5420c ('5264fc3d-c4cb-4e86-be4b-144a905a5fdc', '4e0cb55f-c43e-4e3c-aa9e-13334e4469b1', 'b48d068d-3d31-4889-9bd1-9ba4e7d9a04b', 'fec97928-2f0e-4873-9b72-30fa228fffd0', True) {} failed - stopping task Thread-4737::DEBUG::2011-01-24 16:14:18,123::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: stopping in state preparing (force False) Thread-4737::DEBUG::2011-01-24 16:14:18,124::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: ref 1 aborting True Thread-4737::INFO::2011-01-24 16:14:18,124::task::1166::TaskManager.Task::(prepare) aborting: Task is aborted: 'Volume does not exist' - code 201 Thread-4737::DEBUG::2011-01-24 16:14:18,124::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: Prepare: aborted: Volume does not exist Thread-4737::DEBUG::2011-01-24 16:14:18,125::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: ref 0 aborting True Thread-4737::DEBUG::2011-01-24 16:14:18,125::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: Task._doAbort: force False Thread-4737::DEBUG::2011-01-24 16:14:18,125::resourceManager::821::irs::(cancelAll) Owner.cancelAll requests {} Thread-4737::DEBUG::2011-01-24 16:14:18,125::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: moving from state preparing -> state aborting Thread-4737::DEBUG::2011-01-24 16:14:18,126::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: _aborting: recover policy none Thread-4737::DEBUG::2011-01-24 16:14:18,126::task::491::TaskManager.Task::(_debug) Task 13f5d9d2-6eca-40a4-a798-680db9a5420c: moving from state aborting -> state failed Thread-4737::DEBUG::2011-01-24 16:14:18,126::resourceManager::786::irs::(releaseAll) Owner.releaseAll requests {} resources {'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc': < ResourceRef 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc', isValid: 'True' obj: 'None'>} Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::821::irs::(cancelAll) Owner.cancelAll requests {} Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::517::ResourceManager::(releaseResource) Trying to release resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::532::ResourceManager::(releaseResource) Released resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' (0 active users) Thread-4737::DEBUG::2011-01-24 16:14:18,127::resourceManager::537::ResourceManager::(releaseResource) Resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc' is free, finding out if anyone is waiting for it. Thread-4737::DEBUG::2011-01-24 16:14:18,128::resourceManager::544::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.5264fc3d-c4cb-4e86-be4b-144a905a5fdc', Clearing records. Thread-4737::ERROR::2011-01-24 16:14:18,128::dispatcher::104::Storage.Dispatcher.Protect::(run) {'status': {'message': "Volume does not exist: ('fec97928-2f0e-4873-9b72-30fa228fffd0',)", 'code': 201}, 'args': [('fec97928-2f0e-4873-9b72-30fa228fffd0',)]} Thread-4737::DEBUG::2011-01-24 16:14:18,129::clientIF::45::vds::(wrapper) return destroy with {'status': {'message': 'Machine destroyed', 'code': 0}} Version-Release number of selected component (if applicable): -vdsm-4.9-43.1 -libvirt-0.8.7-3 -Rhevm 2.3 -rhel 6.1 based hosts repro steps: 1) Create 30 VMs from pool. 2) Migrate all of them together from one host to another.