Bug 1128631
| Field | Value |
|---|---|
| Summary | MergeVDSCommand fails when performing live snapshot deletion |
| Product | [Retired] oVirt |
| Component | ovirt-engine-core |
| Version | 3.5 |
| Status | CLOSED CURRENTRELEASE |
| Severity | urgent |
| Priority | urgent |
| Reporter | Raz Tamir <ratamir> |
| Assignee | Adam Litke <alitke> |
| QA Contact | Ori Gofen <ogofen> |
| CC | acanan, alitke, amureini, ecohen, gklein, gpadgett, iheim, ogofen, ratamir, rbalakri, yeylon |
| Target Milestone | --- |
| Target Release | 3.5.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | storage |
| Fixed In Version | ovirt-engine-3.5.0_rc1.1 |
| Doc Type | Bug Fix |
| Story Points | --- |
| Last Closed | 2014-10-17 12:45:01 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Category | --- |
| oVirt Team | Storage |
| Cloudforms Team | --- |
Please attach VDSM's log too.

Created attachment 925712
vdsm log

From the attached log I see the following important entries. Live merge was called twice in rapid succession by engine. Once called, vdsm properly handled both calls. The host does not have a capable version of libvirt so vdsm returns an error code. This smells like some sort of engine side race condition to me. Please attach the engine.log as well.

Thread-65::DEBUG::2014-08-11 11:13:44,187::BindingXMLRPC::1127::vds::(wrapper) client [10.35.161.54]::call merge with ('42e19c61-73f6-4b8b-8478-07c0d0b6265a', {'domainID': 'ab1a30d2-6b93-4302-a028-a1f506f3b1da', 'volumeID': 'fd306b46-52b4-4715-a73f-a946d996adbf', 'poolID': 'f603339e-c4aa-474c-bb83-df768af662c8', 'imageID': '969ec9dc-a95e-4129-97f1-8465a1b79804'}, 'df6bc58a-105c-4964-9a56-caa268ebce5d', 'fd306b46-52b4-4715-a73f-a946d996adbf', '0', '0a93e33a-a4cc-4a0f-884f-233e85050d4b') {} flowID [80d09bd]
Thread-58::DEBUG::2014-08-11 11:13:44,212::BindingXMLRPC::1127::vds::(wrapper) client [10.35.161.54]::call merge with ('42e19c61-73f6-4b8b-8478-07c0d0b6265a', {'domainID': '80f91f5c-f5cc-4ae2-953e-b2ef2f812a7e', 'volumeID': 'fd890dce-4d33-48f4-bb51-4f05ece61825', 'poolID': 'f603339e-c4aa-474c-bb83-df768af662c8', 'imageID': '8773e3b2-fe5a-4388-a8e0-a8f9d5217d67'}, 'f30d23dc-4ac5-45ec-b39d-64fb35f4772c', 'fd890dce-4d33-48f4-bb51-4f05ece61825', '0', 'ea9ef3dd-ce79-4cc1-9396-2d6ba3c92f32') {} flowID [37537b98]
Thread-65::DEBUG::2014-08-11 11:13:44,342::vm::5566::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Starting merge with jobUUID='0a93e33a-a4cc-4a0f-884f-233e85050d4b'
Thread-65::ERROR::2014-08-11 11:13:44,342::vm::5575::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Libvirt missing VIR_DOMAIN_BLOCK_COMMIT_RELATIVE. Unable to perform live merge.
Thread-65::DEBUG::2014-08-11 11:13:44,343::BindingXMLRPC::1134::vds::(wrapper) return merge with {'status': {'message': 'Merge failed', 'code': 52}}
Thread-58::DEBUG::2014-08-11 11:13:44,398::vm::5566::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Starting merge with jobUUID='ea9ef3dd-ce79-4cc1-9396-2d6ba3c92f32'
Thread-58::ERROR::2014-08-11 11:13:44,399::vm::5575::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Libvirt missing VIR_DOMAIN_BLOCK_COMMIT_RELATIVE. Unable to perform live merge.
Thread-58::DEBUG::2014-08-11 11:13:44,399::BindingXMLRPC::1134::vds::(wrapper) return merge with {'status': {'message': 'Merge failed', 'code': 52}}

This looks just like what I experienced with builds not containing a commit [1] merged on August 7, which fixes the locked snapshot issue. Without it, Live Merge attempts won't converge, and will leave the disks illegal and snapshots locked. For an error case like this (libvirt not capable), it would leave things in an inoperable state.

Can you retry with a build containing this commit?

[1] 209ec823a03dd5838eed3d711fd821d2a1aba9dd core: Live Merge command hangs

Hi Adam,
engine.log attached

(In reply to Greg Padgett from comment #4)
> This looks just like what I experienced with builds not containing a commit
> [1] merged on August 7, which fixes the locked snapshot issue. Without it,
> Live Merge attempts won't converge, and will leave the disks illegal and
> snapshots locked. For an error case like this (libvirt not capable), it
> would leave things in an inoperable state.
>
> Can you retry with a build containing this commit?
>
> [1] 209ec823a03dd5838eed3d711fd821d2a1aba9dd core: Live Merge command hangs

Moving to MODIFIED based on this comment - there is no such build publicly available yet.
Raz - do you have the resources to give this a quick pre-integ run?

Hi Allon,
This operation is blocked right now.
Is this the correct behaviour?
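
For anyone triaging a similar vdsm log, the analysis quoted earlier in this thread (two merge calls, each ending in the same libvirt error) can be reproduced with a small script. This is only a sketch: the field layout in the regular expression is inferred from the excerpts in this bug rather than from a vdsm specification, and `parse_line` and `errors_by_thread` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Field layout inferred from the vdsm excerpts quoted in this bug (an assumption):
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
VDSM_LINE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::(?P<logger>[^:]+)::"
    r"\((?P<function>[^)]+)\) (?P<message>.*)$"
)

# Three lines copied from the vdsm log excerpt in this thread.
SAMPLE = [
    "Thread-65::ERROR::2014-08-11 11:13:44,342::vm::5575::vm.Vm::(merge) "
    "vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Libvirt missing "
    "VIR_DOMAIN_BLOCK_COMMIT_RELATIVE. Unable to perform live merge.",
    "Thread-65::DEBUG::2014-08-11 11:13:44,343::BindingXMLRPC::1134::vds::(wrapper) "
    "return merge with {'status': {'message': 'Merge failed', 'code': 52}}",
    "Thread-58::ERROR::2014-08-11 11:13:44,399::vm::5575::vm.Vm::(merge) "
    "vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Libvirt missing "
    "VIR_DOMAIN_BLOCK_COMMIT_RELATIVE. Unable to perform live merge.",
]


def parse_line(line):
    """Split one vdsm log line into named fields, or return None if it does not match."""
    match = VDSM_LINE.match(line.strip())
    return match.groupdict() if match else None


def errors_by_thread(lines):
    """Group ERROR messages by the thread that logged them."""
    grouped = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record and record["level"] == "ERROR":
            grouped[record["thread"]].append(record["message"])
    return dict(grouped)


if __name__ == "__main__":
    for thread, messages in errors_by_thread(SAMPLE).items():
        print(thread, len(messages), "error(s)")
```

Grouping by thread makes it easy to see that each merge call failed independently with the same capability error.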

(In reply to ratamir from comment #7)
> Hi Allon,
> This operation is blocked right now.
> Is this the correct behaviour ?

Depends on the host - can you share the versions of qemu/libvirt/vdsm please?

- vdsm-4.16.2-1.gite8cba75.el6.x86_64
- qemu-img-rhev-0.12.1.2-2.415.el6_5.14.x86_64
- qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64
- qemu-kvm-rhev-tools-0.12.1.2-2.415.el6_5.14.x86_64
- libvirt-0.10.2-29.el6_5.10.x86_64

These versions of libvirt and qemu lack support for live merge so in this case it's working as designed. See http://www.ovirt.org/Features/Live_Merge#IMPORTANT:_Special_environment_setup for information on testing the feature with Fedora 20 hosts.

oVirt 3.5 has been released and should include the fix for this issue.
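
The vdsm error in this bug ("Libvirt missing VIR_DOMAIN_BLOCK_COMMIT_RELATIVE") suggests that live merge depends on the libvirt Python binding exposing that constant. A host could be checked ahead of time along the following lines. This is a sketch under that assumption, not vdsm's actual code: `supports_live_merge` is a hypothetical helper, and the two stand-in objects only simulate bindings with and without the flag.

```python
import types

# Name of the flag taken from the vdsm error message in this bug.
REQUIRED_FLAG = "VIR_DOMAIN_BLOCK_COMMIT_RELATIVE"


def supports_live_merge(libvirt_module):
    """Return True if the given libvirt binding exposes the relative block commit flag."""
    return hasattr(libvirt_module, REQUIRED_FLAG)


# Stand-ins for an old and a new libvirt binding, so the sketch runs
# without libvirt installed. The attribute value is a placeholder only.
old_binding = types.SimpleNamespace()
new_binding = types.SimpleNamespace(VIR_DOMAIN_BLOCK_COMMIT_RELATIVE=object())

if __name__ == "__main__":
    print("old binding:", supports_live_merge(old_binding))
    print("new binding:", supports_live_merge(new_binding))
```

On a real host, the same function can be called with the imported `libvirt` module instead of the stand-ins. Note that this only checks the Python binding; the installed qemu version also matters, as discussed in the comment on host package versions.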

Created attachment 925650
engine log

Description of problem:
When performing live snapshot deletion, ERROR messages are shown in the engine log (log attached). The snapshot remains in a locked state and the disks in that snapshot are in illegal status.

2014-08-11 11:14:04,248 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskLiveCommand] (DefaultQuartzScheduler_Worker-15) [cff2e7] Merging of snapshot 095be0b6-d82d-4138-b91b-a1e49abe40be images f30d23dc-4ac5-45ec-b39d-64fb35f4772c..fd890dce-4d33-48f4-bb51-4f05ece61825 failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.

Version-Release number of selected component (if applicable):
ovirt-engine-3.5.0-0.0.master.20140804172041.git23b558e.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Live delete snapshot

Actual results:
explained above

Expected results:

Additional info: