Bug 1386359 - Post-copy migration of non-shared storage - active mirror block job (qemu)
Summary: Post-copy migration of non-shared storage - active mirror block job (qemu)
Keywords: FutureFeature
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Hanna Czenczek
QA Contact: aihua liang
URL:
Whiteboard:
Depends On:
Blocks: 1324566 1644988
 
Reported: 2016-10-18 18:12 UTC by Ademar Reis
Modified: 2019-02-01 17:29 UTC
CC List: 18 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 1324566
Clones: 1644988
Environment:
Last Closed: 2019-02-01 17:29:06 UTC
Target Upstream Version:


Attachments

Description Ademar Reis 2016-10-18 18:12:20 UTC
+++ This bug was initially created as a clone of Bug #1324566 +++

Description of problem:

Post-copy migration is supposed to always converge, but when migrating a domain with non-shared storage, post-copy migration starts only once all disks have been copied to the destination host. However, storage migration does not use a post-copy approach, which means migrating a domain may never finish even though post-copy was requested.

Both storage and memory need to be migrated in a post-copy way to ensure migration always converges.
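For context, a rough sketch of how the two halves of such a migration are driven over QMP today (an illustration, not what libvirt emits verbatim): RAM migration can be flipped to post-copy via the postcopy-ram capability and migrate-start-postcopy, while non-shared disks are carried by a separate drive-mirror job into an NBD export on the destination, and that job has no post-copy mode. The monitor socket paths, host name, port and drive name below are placeholders, and the waits for BLOCK_JOB_READY and migration progress are omitted.

import json
import socket

def qmp(sock_path, *commands):
    # Minimal synchronous QMP helper: connect to the monitor socket,
    # negotiate capabilities, run each command and collect the replies.
    s = socket.socket(socket.AF_UNIX)
    s.connect(sock_path)
    f = s.makefile("rw")
    json.loads(f.readline())                               # greeting banner
    f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    f.flush()
    json.loads(f.readline())
    replies = []
    for cmd in commands:
        f.write(json.dumps(cmd) + "\n")
        f.flush()
        while True:
            msg = json.loads(f.readline())
            if "return" in msg or "error" in msg:          # skip async events
                replies.append(msg)
                break
    return replies

SRC_QMP = "/var/lib/libvirt/qemu/source-vm.monitor"        # placeholder paths
DST_QMP = "/var/lib/libvirt/qemu/dest-vm.monitor"

# Destination: export the pre-created disk over NBD as the mirror target.
qmp(DST_QMP,
    {"execute": "nbd-server-start",
     "arguments": {"addr": {"type": "inet",
                            "data": {"host": "0.0.0.0", "port": "10809"}}}},
    {"execute": "nbd-server-add",
     "arguments": {"device": "drive-virtio-disk0", "writable": True}})

# Source: mirror the disk into that export (pre-copy only), then migrate RAM
# with the post-copy capability.  Only the RAM phase can switch to post-copy;
# the drive-mirror job has to chase dirty blocks until it converges by itself.
qmp(SRC_QMP,
    {"execute": "drive-mirror",
     "arguments": {"device": "drive-virtio-disk0", "sync": "full",
                   "target": "nbd:dest-host:10809:exportname=drive-virtio-disk0",
                   "format": "raw", "mode": "existing"}},
    {"execute": "migrate-set-capabilities",
     "arguments": {"capabilities": [{"capability": "postcopy-ram",
                                     "state": True}]}},
    {"execute": "migrate", "arguments": {"uri": "tcp:dest-host:49152"}},
    {"execute": "migrate-start-postcopy"})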

Version-Release number of selected component (if applicable):

libvirt-1.3.3-1.el7

--- Additional comment from Jiri Denemark on 2016-04-06 12:52:03 BRT ---

Paolo, you seemed to have an idea about which block jobs libvirt should use to implement post-copy storage migration. Could you describe your idea in detail?

--- Additional comment from Paolo Bonzini on 2016-04-07 10:23:34 BRT ---

Sure! It's the opposite of the current NBD flow, which runs the NBD server on the destination and drive-mirror on the source.

Here, the NBD server runs on the source, the qcow2 image on the destination is created with the NBD server as its backing file, and block-stream is used on the destination to do post-copy migration. Unfortunately you cannot switch from pre-copy to post-copy; you have to start the post-copy phase before doing "cont" on the destination, with no previous copy.

I think this makes it less desirable than it is for RAM.
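A rough sketch of what this flow could look like in terms of QMP commands and qemu-img, reusing the qmp() helper and monitor paths from the sketch in the description; the host names, port, drive name and overlay path are placeholders, and waiting for the source vCPUs to stop or for the block job to finish is omitted.

import subprocess

# Source (vCPUs already stopped): export the disk read-only over NBD.
qmp(SRC_QMP,
    {"execute": "nbd-server-start",
     "arguments": {"addr": {"type": "inet",
                            "data": {"host": "0.0.0.0", "port": "10809"}}}},
    {"execute": "nbd-server-add",
     "arguments": {"device": "drive-virtio-disk0", "writable": False}})

# Destination: create a qcow2 overlay whose backing file is the source's NBD
# export, so reads of clusters that have not been copied yet are satisfied
# over the network.  The destination QEMU is then started with this overlay.
subprocess.check_call(
    ["qemu-img", "create", "-f", "qcow2",
     "-o", "backing_file=nbd:source-host:10809:exportname=drive-virtio-disk0,"
           "backing_fmt=raw",
     "/var/lib/libvirt/images/dest-disk0.qcow2"])

# Destination: block-stream pulls everything from the backing (NBD) layer into
# the local overlay and drops the backing link when done; the guest is resumed
# only after the job has been started.  There is no pre-copy pass at all.
qmp(DST_QMP,
    {"execute": "block-stream",
     "arguments": {"device": "drive-virtio-disk0"}},
    {"execute": "cont"})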

--- Additional comment from Dr. David Alan Gilbert on 2016-04-07 10:52:00 BRT ---

(In reply to Paolo Bonzini from comment #3)
> Sure! It's the opposite of the current NBD flow, which runs the NBD server
> on the destination and drive-mirror on the source.
> 
> Here, the NBD server runs on the source, the qcow2 image on the destination
> is created with the NBD server as its backing file, and block-stream is used
> on the destination to do post-copy migration. Unfortunately you cannot
> switch from pre-copy to post-copy; you have to start the post-copy phase
> before doing "cont" on the destination, with no previous copy.
> 
> I think this makes it less desirable than it is for RAM.

With this I don't understand how you know when you can start running the destination; I also don't understand why you can't do a pre-copy phase as for RAM.

Anyway, isn't the easier story here just to use the existing block write throttling to set the block write bandwidth lower than the network bandwidth? Unlike RAM, we've already got a throttle that should be able to limit the dirtying rate to whatever we need.
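For reference, that throttle is reachable over QMP via block_set_io_throttle; a minimal sketch, reusing the qmp() helper from the description, with a made-up 30 MB/s limit and drive name:

# Cap guest writes on the mirrored drive below the migration link's bandwidth
# so the drive-mirror job can always catch up.  All six throttle fields are
# mandatory in this command; 0 means "no limit".
qmp(SRC_QMP,
    {"execute": "block_set_io_throttle",
     "arguments": {"device": "drive-virtio-disk0",
                   "bps": 0,
                   "bps_rd": 0,
                   "bps_wr": 30 * 1024 * 1024,
                   "iops": 0,
                   "iops_rd": 0,
                   "iops_wr": 0}})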

--- Additional comment from Jiri Denemark on 2016-04-07 10:56:26 BRT ---

Apparently (confirmed with Paolo on IRC), we'd have to start block-stream jobs after stopping vCPUs on the source and before starting the destination.

--- Additional comment from Jiri Denemark on 2016-04-07 10:57:59 BRT ---

That said, I think throttling disk I/O is really a better idea.

What do the OpenStack guys think about it, Daniel?

--- Additional comment from Daniel Berrange on 2016-04-07 11:21:46 BRT ---

The desirable thing about post-copy is that it guarantees completion in a finite amount of time, without having to do any kind of calculations wrt the guest dirtying rate vs constantly changing network bandwidth availability. If you use pre-copy with bandwidth throttling, you have the problem of figuring out what level of throttling you need to apply in order to ensure the guest completes in a finite, predictable time. This is pretty non-trivial unless you are super conservative and apply a very strict bandwidth limit, which in turn has the problem that you're probably slowing the guest down more than it actually needs to be. This is the prime reason OpenStack is much more enthusiastic about using post-copy than about throttling guest CPUs with cgroups or the auto-converge feature of QEMU.


So from the POV of ease of management, having disks able to support post-copy in the same way as RAM is really very desirable for OpenStack.

--- Additional comment from Daniel Berrange on 2016-04-07 11:39:02 BRT ---

I'm thinking about the proposals on qemu-devel wrt extending the NBD server to support live backups, and how the NBD server would expose a fake block allocation bitmap to represent dirty blocks. I think that functionality could be usable as the foundation for doing combined switchable pre+post-copy for disk too.

First, have the NBD server always run on the source host. During the initial pre-copy phase, the NBD client on the target host would do a one-pass copy of the whole dataset. Thereafter it would loop, querying the fake "block allocation bitmap" from the source to get the list of blocks which have been dirtied since the initial copy, and copying those.

When switching to post-copy, it would continue to fetch all remaining dirty blocks, but at any point it can make a request to fetch a specific block being accessed right now by the VM.
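None of this exists in QEMU or the NBD protocol today, so the following is only a sketch of the destination-side loop being proposed; read_block, write_block, query_dirty_blocks, postcopy_requested and demand_fetches (e.g. a queue.Queue fed by the VM's block layer) are hypothetical stand-ins for the NBD operations described above.

def pull_disk(read_block, write_block, query_dirty_blocks, total_blocks,
              postcopy_requested, demand_fetches):
    # Initial pre-copy pass over the whole dataset.
    for blk in range(total_blocks):
        write_block(blk, read_block(blk))

    # Pre-copy loop: re-copy whatever was dirtied since the previous pass.
    while not postcopy_requested():
        for blk in query_dirty_blocks():
            write_block(blk, read_block(blk))

    # Post-copy: the VM now runs on the destination.  Blocks it touches are
    # queued on demand_fetches and pulled first; the remaining dirty blocks
    # are fetched in the background until nothing is left.
    remaining = set(query_dirty_blocks())
    while remaining:
        if not demand_fetches.empty():
            blk = demand_fetches.get()
            if blk not in remaining:
                continue                     # already copied
        else:
            blk = next(iter(remaining))
        write_block(blk, read_block(blk))
        remaining.discard(blk)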

--- Additional comment from Jiri Denemark on 2016-04-07 11:52:19 BRT ---

So it seems you agree that implementing storage migration using the currently available block-stream job is not something you'd want from libvirt. In that case we need to clone this BZ against QEMU to request the new functionality.

--- Additional comment from Daniel Berrange on 2016-04-07 15:12:31 BRT ---

It would be nice if we could implement it using existing functionality, but from what Paolo describes it doesn't sound like that is possible: you can do pre-copy or post-copy, but you can't switch from pre to post on the fly, which is what we'd need to be able to do to match the RAM handling. So it seems we likely need new QEMU functionality.

Comment 1 xianwang 2017-03-13 03:20:10 UTC
Hi, Kevin,
As shown, this bug is about post-copy migration and storage VM migration. Because it has the "FutureFeature" keyword, if we need to add test cases for it, which test plan should own this case: "migration" or "storage vm migration"?

