Bug 1013628 - migration: generate domain security label before migration actually starts
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-09-30 13:22 UTC by David Jaša
Modified: 2013-10-03 12:23 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-10-03 12:23:25 UTC
Target Upstream Version:



Description David Jaša 2013-09-30 13:22:32 UTC
Description of problem:
On a RHEV testing setup, VMs were killed after an apparently successful migration because "Generating domain security label" failed (logs below). If possible, checks like this should be done _before_ the actual migration takes place, to prevent the domain from being killed on the destination.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-26.el6+bz1009886.x86_64

How reproducible:
I couldn't reproduce this on a different configuration or with different versions, and I no longer have the host in the configuration in question, but IMO the bug is severe enough to deserve attention nevertheless.

Steps to Reproduce:
1. migrate a domain

Actual results:
domain is killed after migration because "generating domain security label" fails with: "unsupported configuration: Unable to find security driver for label selinux"

Expected results:
migration fails, domain continues running on source host OR domain migrates successfully

Additional info:
log excerpt (full log will be attached):
2013-09-27 11:41:58.047+0000: 8154: debug : qemuDomainObjBeginJobInternal:813 : Starting async job: migration in
2013-09-27 11:41:58.047+0000: 8154: debug : qemuDomainObjSetJobPhase:689 : Setting 'migration in' phase to 'prepare'
2013-09-27 11:41:58.047+0000: 8154: debug : qemuProcessStart:3597 : Beginning VM startup process
2013-09-27 11:41:58.047+0000: 8154: debug : qemuProcessStart:3609 : Setting current domain def as transient
2013-09-27 11:41:58.047+0000: 8154: debug : qemuProcessStart:3635 : Preparing host devices
2013-09-27 11:41:58.047+0000: 8154: debug : qemuProcessStart:3639 : Preparing chr devices
2013-09-27 11:41:58.047+0000: 8154: debug : qemuProcessStart:3648 : Generating domain security label (if required)

These lines (above and below)

2013-09-27 11:41:58.047+0000: 8154: error : virSecurityManagerGenLabel:376 : unsupported configuration: Unable to find security driver for label selinux
2013-09-27 11:41:58.051+0000: 8154: debug : qemuProcessStop:4244 : Shutting down VM 'wxp33-dj' pid=0 flags=3
2013-09-27 11:41:58.052+0000: 8154: debug : qemuProcessKill:4142 : vm=wxp33-dj pid=0 flags=5
2013-09-27 11:41:58.052+0000: 8154: debug : qemuDomainCleanupRun:1995 : driver=0x7f8af0008df0, vm=wxp33-dj
2013-09-27 11:41:58.052+0000: 8154: debug : qemuProcessAutoDestroyRemove:4725 : vm=wxp33-dj
2013-09-27 11:41:58.052+0000: 8154: debug : qemuDriverCloseCallbackUnset:661 : vm=wxp33-dj, uuid=707f6abe-8bf7-4211-9838-0265d73a8bc0, cb=0x4b69a0
2013-09-27 11:41:58.052+0000: 8154: error : qemuRemoveCgroup:752 : internal error Unable to find cgroup for wxp33-dj
2013-09-27 11:41:58.052+0000: 8154: warning : qemuProcessStop:4403 : Failed to remove cgroup for wxp33-dj
2013-09-27 11:41:58.053+0000: 8154: debug : qemuDomainObjEndAsyncJob:955 : Stopping async job: migration in
2013-09-27 11:41:58.055+0000: 8149: debug : virConnectClose:1449 : conn=0x7f8aec001110
2013-09-27 11:41:58.056+0000: 8149: debug : qemuDriverCloseCallbackRunAll:749 : conn=0x7f8aec001110

Comment 2 Jiri Denemark 2013-09-30 15:25:40 UTC
Do you actually see anything bad happening? The logs show that everything went as expected. The security label is generated while starting the new qemu process that will listen for the incoming migration, that is, as soon as the destination host knows a domain is going to be migrated in and before the migration really starts.
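
As a rough illustration of the ordering described above, the sketch below models the destination-side Prepare step running before any guest data moves. It is a toy model, not libvirt code: the phase names follow libvirt's v3 migration protocol (Begin, Prepare, Perform, Finish, Confirm), while the `Host` class, the `prepare` and `migrate` functions, and the `has_selinux_driver` flag are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


class PrepareError(Exception):
    """Raised when the destination cannot set up the incoming domain."""


@dataclass
class Host:
    name: str
    # Hypothetical stand-in for "a security driver for this label exists".
    has_selinux_driver: bool = True
    running: set = field(default_factory=set)


def prepare(dest, domain, seclabel_model):
    # Destination side: set up the qemu that will listen for the migration.
    # In this model, label generation happens here, before any transfer.
    if seclabel_model == "selinux" and not dest.has_selinux_driver:
        raise PrepareError(
            f"Unable to find security driver for label {seclabel_model}")


def migrate(src, dest, domain, seclabel_model="selinux"):
    """Return (phases reached, success flag) for one migration attempt."""
    phases = ["begin"]
    try:
        phases.append("prepare")
        prepare(dest, domain, seclabel_model)
    except PrepareError:
        # Prepare failed: nothing was transferred, the source keeps running.
        return phases, False
    phases += ["perform", "finish", "confirm"]
    src.running.discard(domain)
    dest.running.add(domain)
    return phases, True


if __name__ == "__main__":
    src = Host("source", running={"wxp33-dj"})
    dst = Host("destination", has_selinux_driver=False)
    print(migrate(src, dst, "wxp33-dj"), src.running)
```

In this model a label failure stops the attempt at the prepare phase and the domain stays in `src.running`, which corresponds to the "migration fails, domain continues running on source host" outcome listed under expected results.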

Comment 3 David Jaša 2013-09-30 15:33:20 UTC
Yes, I did. The VM migrated fully and was then killed. Since the log ended around this excerpt, after the migration, I assumed this was the error that led to the subsequent "VM shutdown".

Comment 4 Jiri Denemark 2013-10-03 12:20:06 UTC
So, apart from several failed migrations caused by "unsupported configuration: Unable to find security driver for label selinux", there are only two other cases where the wxp33-dj domain was shut down:

2013-09-27 12:51:41.208+0000: 23657: debug : virDomainDestroyFlags:2238 : dom=0x7f1228000cf0, (VM: name=wxp33-dj, uuid=707f6abe-8bf7-4211-9838-0265d73a8bc0), flags=1
2013-09-27 12:51:41.208+0000: 23657: debug : qemuProcessKill:4142 : vm=wxp33-dj pid=27462 flags=0
2013-09-27 12:51:41.208+0000: 23435: debug : qemuMonitorIOProcess:354 : QEMU_MONITOR_IO_PROCESS: mon=0x7f123c0059c0 buf={"timestamp": {"seconds": 1380286301, "microseconds":
 len=85 
2013-09-27 12:51:41.208+0000: 23435: debug : qemuMonitorEmitShutdown:988 : mon=0x7f123c0059c0
2013-09-27 12:51:41.208+0000: 23435: debug : qemuProcessHandleShutdown:658 : vm=0x7f123c00d390
2013-09-27 12:51:41.408+0000: 23657: debug : qemuDomainObjBeginJobInternal:808 : Starting job: destroy (async=none)
2013-09-27 12:51:41.429+0000: 23657: debug : qemuProcessStop:4244 : Shutting down VM 'wxp33-dj' pid=27462 flags=0

This was initiated by vdsm calling virDomainDestroyFlags():

Thread-167::DEBUG::2013-09-27 14:51:41,207::BindingXMLRPC::974::vds::(wrapper) client [10.34.73.33]::call vmDestroy with ('707f6abe-8bf7-4211-9838-0265d73a8bc0',) {}
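
The vdsm and libvirt entries above can be lined up by converting both timestamps to UTC. The sketch below does that with the values quoted in this comment. It assumes the vdsm host logs local time at UTC+2 (CEST); the report does not state the timezone, so that offset is an assumption inferred from the two-hour difference between the logs.

```python
from datetime import datetime, timedelta, timezone

# Assumption: vdsm logs local time at UTC+2; the bug report does not say so.
VDSM_TZ = timezone(timedelta(hours=2))

# vdsm: "call vmDestroy" entry.
vdsm_call = datetime.strptime(
    "2013-09-27 14:51:41,207", "%Y-%m-%d %H:%M:%S,%f"
).replace(tzinfo=VDSM_TZ)

# libvirt: virDomainDestroyFlags entry (already carries a +0000 offset).
libvirt_call = datetime.strptime(
    "2013-09-27 12:51:41.208+0000", "%Y-%m-%d %H:%M:%S.%f%z"
)

# qemu monitor event: the "seconds" field from the (truncated) JSON buffer.
qemu_event = datetime.fromtimestamp(1380286301, tz=timezone.utc)

# Gap between the vdsm call and the libvirt API entry point.
delta = libvirt_call - vdsm_call

if __name__ == "__main__":
    print("vdsm call (UTC):   ", vdsm_call.astimezone(timezone.utc))
    print("libvirt call (UTC):", libvirt_call)
    print("qemu event (UTC):  ", qemu_event)
    print("vdsm -> libvirt gap:", delta)
```

Under that timezone assumption, the vmDestroy call precedes the virDomainDestroyFlags entry by one millisecond, and the qemu shutdown event falls within the same second.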


The other case is

2013-09-27 13:36:29.978+0000: 30264: debug : virDomainMigratePerform3:6251 : dom=0x7f27a0011fa0, (VM: name=wxp33-dj, uuid=707f6abe-8bf7-4211-9838-0265d73a8bc0), xmlin=(null) cookiein=(nil), cookieinlen=0, cookieout=0x7f27b47ddaf0, cookieoutlen=0x7f27b47ddafc, dconnuri=qemu+tls://10.34.73.74/system, uri=tcp://10.34.73.74, flags=3, dname=(null), bandwidth=32
...
2013-09-27 13:36:41.541+0000: 30264: debug : qemuProcessStop:4244 : Shutting down VM 'wxp33-dj' pid=770 flags=1
...
2013-09-27 13:36:41.974+0000: 30264: debug : qemuDomainObjEndAsyncJob:955 : Stopping async job: migration out

which is the end of a successful migration.
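
The shutdown cases discussed in this report can be pulled out of a libvirtd debug log mechanically. The following is a minimal sketch, assuming log lines keep the `timestamp: thread: level : function:line : message` layout seen in the excerpts; `stop_events` and both regular expressions are hypothetical helpers written for this example, not part of libvirt. The sample lines are copied from this report. Reading `pid=0` as "no qemu process had been started yet" is an interpretation of the excerpt from the failed incoming migration, not something the log states.

```python
import re

# Layout assumed from the excerpts: "ts: tid: level : func:line : message".
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+): (?P<tid>\d+): (?P<level>\w+) : "
    r"(?P<func>\w+):(?P<line>\d+) : (?P<msg>.*)$")
STOP_RE = re.compile(
    r"Shutting down VM '(?P<vm>[^']+)' pid=(?P<pid>\d+) flags=(?P<flags>\d+)")


def stop_events(lines):
    """Return (timestamp, vm name, pid) for each qemuProcessStop shutdown line."""
    events = []
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m or m["func"] != "qemuProcessStop":
            continue
        s = STOP_RE.search(m["msg"])
        if s:  # skips other qemuProcessStop messages, e.g. cgroup warnings
            events.append((m["ts"], s["vm"], int(s["pid"])))
    return events


SAMPLE = [
    "2013-09-27 11:41:58.051+0000: 8154: debug : qemuProcessStop:4244 : Shutting down VM 'wxp33-dj' pid=0 flags=3",
    "2013-09-27 11:41:58.052+0000: 8154: warning : qemuProcessStop:4403 : Failed to remove cgroup for wxp33-dj",
    "2013-09-27 12:51:41.429+0000: 23657: debug : qemuProcessStop:4244 : Shutting down VM 'wxp33-dj' pid=27462 flags=0",
    "2013-09-27 13:36:41.541+0000: 30264: debug : qemuProcessStop:4244 : Shutting down VM 'wxp33-dj' pid=770 flags=1",
]

if __name__ == "__main__":
    for ts, vm, pid in stop_events(SAMPLE):
        kind = "no qemu process (pid=0)" if pid == 0 else f"qemu pid {pid}"
        print(ts, vm, kind)
```

Filtering on a non-zero pid separates the two shutdowns of a running qemu process (the vdsm-initiated destroy and the end of the outgoing migration) from the cleanup after the failed incoming migration.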

Comment 5 Jiri Denemark 2013-10-03 12:23:25 UTC
The first case happened shortly after a successful migration from another host, so that is the failure you reported in this bug (confirmed via IRC). However, it was vdsm, acting on behalf of the management layer, that killed the domain. Given all that, we (David and I) agreed that this is not a bug in libvirt.

