Description of problem:
In a customer case, qemu appears to have died unexpectedly after a (successful) outgoing migration, before RHEV/vdsm could clean up the "stub" qemu instance. vdsm then gets an error when it attempts to remove the "stub" VM, and a "Down" entry for the stub VM remains in vdsm's list indefinitely, preventing further migrations.

Version-Release number of selected component (if applicable):
vdsm-4.10.2-24.1.el6ev.x86_64
libvirt-0.10.2-18.el6_4.9.x86_64
qemu-kvm-rhev-0.12.1.2-2.355.el6_4.7.x86_64

How reproducible:
Unknown.

Steps to Reproduce:
These steps aren't tested, but should work in theory.
1. Modify qemu to abort() after completing an outgoing migration.
2. With the modified qemu on the source hypervisor, migrate a VM in RHEV.

Actual results:
a) RHEV won't permit the VM to be migrated again.
b) On the migration source host, an entry still shows in vdsm for the VM, e.g.:

# vdsClient -s 0 getAllVmStates
[...]
XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
        Status = Down
        hash = 7231255206245563831
        exitMessage = Migration succeeded
        timeOffset = 0
        exitCode = 0

Expected results:
An error is still logged (the qemu death is a genuine problem), but the stub VM entry is cleaned up and further migrations are permitted.

Additional info:
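To illustrate the expected behaviour, here is a minimal sketch (not vdsm's actual code) of how the source-side teardown could tolerate a qemu that has already exited, using the libvirt Python bindings. The function name destroy_stub is an illustrative assumption; the API calls (lookupByUUIDString, destroy, get_error_code, VIR_ERR_NO_DOMAIN) are real libvirt-python interfaces.

import libvirt

def destroy_stub(conn, vm_uuid):
    """Tear down the source-side stub domain after a successful
    outgoing migration, treating an already-gone domain as success."""
    try:
        dom = conn.lookupByUUIDString(vm_uuid)
        dom.destroy()
    except libvirt.libvirtError as e:
        if e.get_error_code() == libvirt.VIR_ERR_NO_DOMAIN:
            # qemu died and libvirt already reaped the domain; there is
            # nothing left to remove, so log the error and treat the
            # cleanup as done rather than leaving the VM stuck in Down.
            return
        raise

The key point is that VIR_ERR_NO_DOMAIN during post-migration teardown should be logged but not block removal of the stub entry.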
Please attach the vdsm, libvirt and qemu logs.
Created attachment 879723 [details]
VDSM log

Here's the vdsm.log file; the interesting events happen around 2014-03-19 13:51:12,329.
Created attachment 879724 [details]
libvirtd log

This is the libvirtd log. (The rotation is the one that matches the interesting events in vdsm.log.)
Created attachment 879725 [details]
qemu log for the VM in question

Attaching qemu log. Very little here, unfortunately :(
Works for me. I've attached the other bug to my case.

*** This bug has been marked as a duplicate of bug 985770 ***