Bug 1128631 - MergeVDSCommand fails when performing live snapshot deletion
Summary: MergeVDSCommand fails when performing live snapshot deletion
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-engine-core
Version: 3.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Adam Litke
QA Contact: Ori Gofen
URL:
Whiteboard: storage
Depends On:
Blocks:
Reported: 2014-08-11 09:03 UTC by Raz Tamir
Modified: 2016-02-10 19:43 UTC

Fixed In Version: ovirt-engine-3.5.0_rc1.1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-17 12:45:01 UTC
oVirt Team: Storage
Embargoed:


Attachments
engine log (869.61 KB, text/plain)
2014-08-11 09:03 UTC, Raz Tamir
vdsm log (786.62 KB, application/x-xz)
2014-08-11 12:03 UTC, Raz Tamir


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 31170 0 None None None Never

Description Raz Tamir 2014-08-11 09:03:12 UTC
Created attachment 925650 [details]
engine log

Description of problem:
When performing live snapshot deletion, ERROR messages appear in the engine log (attached).
The snapshot remains locked, and the disks in that snapshot are marked illegal.

** 2014-08-11 11:14:04,248 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskLiveCommand] (DefaultQuartzScheduler_Worker-15) [cff2e7] Merging of snapshot 095be0b6-d82d-4138-b91b-a1e49abe40be images f30d23dc-4ac5-45ec-b39d-64fb35f4772c..fd890dce-4d33-48f4-bb51-4f05ece61825 failed. Images have been marked illegal and can no longer be previewed or reverted to. Please retry Live Merge on the snapshot to complete the operation.


Version-Release number of selected component (if applicable):
ovirt-engine-3.5.0-0.0.master.20140804172041.git23b558e.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Delete a snapshot while the VM is running (live snapshot deletion).

Actual results:
See the description above: the merge fails, the snapshot stays locked, and its disks are marked illegal.

Expected results:


Additional info:

Comment 1 Allon Mureinik 2014-08-11 11:22:26 UTC
Please attach VDSM's log too

Comment 2 Raz Tamir 2014-08-11 12:03:59 UTC
Created attachment 925712 [details]
vdsm log

Comment 3 Adam Litke 2014-08-11 19:34:42 UTC
From the attached log I see the following important entries.  Live merge was called twice in rapid succession by engine.  Once called, vdsm properly handled both calls.  The host does not have a capable version of libvirt so vdsm returns an error code.  This smells like some sort of engine side race condition to me.  Please attach the engine.log as well.


Thread-65::DEBUG::2014-08-11 11:13:44,187::BindingXMLRPC::1127::vds::(wrapper) client [10.35.161.54]::call merge with ('42e19c61-73f6-4b8b-8478-07c0d0b6265a', {'domainID': 'ab1a30d2-6b93-4302-a028-a1f506f3b1da', 'volumeID': 'fd306b46-52b4-4715-a73f-a946d996adbf', 'poolID': 'f603339e-c4aa-474c-bb83-df768af662c8', 'imageID': '969ec9dc-a95e-4129-97f1-8465a1b79804'}, 'df6bc58a-105c-4964-9a56-caa268ebce5d', 'fd306b46-52b4-4715-a73f-a946d996adbf', '0', '0a93e33a-a4cc-4a0f-884f-233e85050d4b') {} flowID [80d09bd]


Thread-58::DEBUG::2014-08-11 11:13:44,212::BindingXMLRPC::1127::vds::(wrapper) client [10.35.161.54]::call merge with ('42e19c61-73f6-4b8b-8478-07c0d0b6265a', {'domainID': '80f91f5c-f5cc-4ae2-953e-b2ef2f812a7e', 'volumeID': 'fd890dce-4d33-48f4-bb51-4f05ece61825', 'poolID': 'f603339e-c4aa-474c-bb83-df768af662c8', 'imageID': '8773e3b2-fe5a-4388-a8e0-a8f9d5217d67'}, 'f30d23dc-4ac5-45ec-b39d-64fb35f4772c', 'fd890dce-4d33-48f4-bb51-4f05ece61825', '0', 'ea9ef3dd-ce79-4cc1-9396-2d6ba3c92f32') {} flowID [37537b98]


Thread-65::DEBUG::2014-08-11 11:13:44,342::vm::5566::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Starting merge with jobUUID='0a93e33a-a4cc-4a0f-884f-233e85050d4b'
Thread-65::ERROR::2014-08-11 11:13:44,342::vm::5575::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Libvirt missing VIR_DOMAIN_BLOCK_COMMIT_RELATIVE. Unable to perform live merge.
Thread-65::DEBUG::2014-08-11 11:13:44,343::BindingXMLRPC::1134::vds::(wrapper) return merge with {'status': {'message': 'Merge failed', 'code': 52}}


Thread-58::DEBUG::2014-08-11 11:13:44,398::vm::5566::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Starting merge with jobUUID='ea9ef3dd-ce79-4cc1-9396-2d6ba3c92f32'
Thread-58::ERROR::2014-08-11 11:13:44,399::vm::5575::vm.Vm::(merge) vmId=`42e19c61-73f6-4b8b-8478-07c0d0b6265a`::Libvirt missing VIR_DOMAIN_BLOCK_COMMIT_RELATIVE. Unable to perform live merge.
Thread-58::DEBUG::2014-08-11 11:13:44,399::BindingXMLRPC::1134::vds::(wrapper) return merge with {'status': {'message': 'Merge failed', 'code': 52}}
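
To line up each merge call with its outcome, entries like the ones above can be grouped by thread name. The following is only a rough sketch written for this report, not part of vdsm: the function name summarize_merges and the regular expressions are assumptions derived from the log lines quoted here, and other vdsm versions may format their logs differently.

```python
import re
from collections import defaultdict

# Log prefix as it appears in the excerpts: "<thread>::<LEVEL>::<timestamp>::"
PREFIX = re.compile(
    r"^(?P<thread>Thread-\d+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
)
JOB = re.compile(r"Starting merge with jobUUID='(?P<job>[0-9a-f-]+)'")
RESULT = re.compile(
    r"return merge with \{'status': \{'message': '(?P<msg>[^']*)', "
    r"'code': (?P<code>\d+)\}\}"
)


def summarize_merges(lines):
    """Collect job UUID, error text and return status per thread (hypothetical helper)."""
    merges = defaultdict(dict)
    for line in lines:
        head = PREFIX.match(line)
        if not head:
            continue  # skip anything that is not a vdsm log entry
        thread = head.group("thread")
        job = JOB.search(line)
        if job:
            merges[thread]["job"] = job.group("job")
        # Only keep errors raised from the merge method itself
        if head.group("level") == "ERROR" and "::(merge)" in line:
            merges[thread]["error"] = line.rsplit("::", 1)[-1].strip()
        result = RESULT.search(line)
        if result:
            merges[thread]["message"] = result.group("msg")
            merges[thread]["code"] = int(result.group("code"))
    return dict(merges)
```

Fed the Thread-65 and Thread-58 entries above, this should report one job UUID, the "Libvirt missing VIR_DOMAIN_BLOCK_COMMIT_RELATIVE" error and status code 52 for each thread, which makes it easier to see that both calls failed for the same reason.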

Comment 4 Greg Padgett 2014-08-11 20:09:07 UTC
This looks just like what I experienced with builds not containing a commit [1] merged on August 7, which fixes the locked snapshot issue.  Without it, Live Merge attempts won't converge, and will leave the disks illegal and snapshots locked.  For an error case like this (libvirt not capable), it would leave things in an inoperable state.

Can you retry with a build containing this commit?

[1] 209ec823a03dd5838eed3d711fd821d2a1aba9dd core: Live Merge command hangs

Comment 5 Raz Tamir 2014-08-12 07:03:09 UTC
Hi Adam,
engine.log attached

Comment 6 Allon Mureinik 2014-08-12 07:26:18 UTC
(In reply to Greg Padgett from comment #4)
> This looks just like what I experienced with builds not containing a commit
> [1] merged on August 7, which fixes the locked snapshot issue.  Without it,
> Live Merge attempts won't converge, and will leave the disks illegal and
> snapshots locked.  For an error case like this (libvirt not capable), it
> would leave things in an inoperable state.
> 
> Can you retry with a build containing this commit?
> 
> [1] 209ec823a03dd5838eed3d711fd821d2a1aba9dd core: Live Merge command hangs
Moving to MODIFIED based on this comment - there is no such build publicly available yet.

Raz - do you have the resources to give this a quick pre-integ run?

Comment 7 Raz Tamir 2014-08-24 14:39:59 UTC
Hi Allon,
This operation is blocked right now.
Is this the correct behaviour?

Comment 8 Allon Mureinik 2014-08-24 19:47:37 UTC
(In reply to ratamir from comment #7)
> Hi Allon,
> This operation is blocked right now.
> Is this the correct behaviour?
Depends on the host - can you share the versions of qemu/libvirt/vdsm please?

Comment 9 Raz Tamir 2014-08-25 06:34:19 UTC
- vdsm-4.16.2-1.gite8cba75.el6.x86_64

- qemu-img-rhev-0.12.1.2-2.415.el6_5.14.x86_64
- qemu-kvm-rhev-0.12.1.2-2.415.el6_5.14.x86_64
- qemu-kvm-rhev-tools-0.12.1.2-2.415.el6_5.14.x86_64

- libvirt-0.10.2-29.el6_5.10.x86_64

Comment 10 Adam Litke 2014-08-25 15:32:24 UTC
These versions of libvirt and qemu lack support for live merge so in this case it's working as designed.  See http://www.ovirt.org/Features/Live_Merge#IMPORTANT:_Special_environment_setup for information on testing the feature with Fedora 20 hosts.
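
A host can be probed for this capability before testing. The vdsm error in comment 3 names the missing libvirt constant VIR_DOMAIN_BLOCK_COMMIT_RELATIVE, so a minimal sketch is to check whether the installed libvirt Python bindings expose it. The helper name supports_live_merge is an assumption made for illustration; this is not vdsm's actual code, and it checks only the libvirt bindings, not qemu.

```python
def supports_live_merge(libvirt_module):
    """Return True if the given libvirt binding exposes the relative block-commit flag.

    Hypothetical helper: it takes the module as an argument so that it can be
    exercised without libvirt installed.
    """
    return hasattr(libvirt_module, "VIR_DOMAIN_BLOCK_COMMIT_RELATIVE")


if __name__ == "__main__":
    try:
        import libvirt  # libvirt-python bindings; may be absent on a workstation
    except ImportError:
        print("libvirt Python bindings are not installed")
    else:
        print("live merge capable:", supports_live_merge(libvirt))
```

Running this on the hypervisor host rather than on the engine machine is the useful case, since the check in the log above happens on the vdsm side.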

Comment 11 Sandro Bonazzola 2014-10-17 12:45:01 UTC
oVirt 3.5 has been released and should include the fix for this issue.

