Bug 1324566 - Post-copy migration of non-shared storage (libvirt)
Summary: Post-copy migration of non-shared storage (libvirt)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Virtualization Maintenance
QA Contact: Fangge Jin
URL:
Whiteboard:
Depends On: 1386359 1644988
Blocks:
 
Reported: 2016-04-06 15:48 UTC by Jiri Denemark
Modified: 2021-10-05 08:33 UTC (History)
13 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
: 1386359
Environment:
Last Closed: 2021-06-15 07:31:00 UTC
Type: Feature Request
Target Upstream Version:


Attachments

Description Jiri Denemark 2016-04-06 15:48:47 UTC
Description of problem:

Post-copy migration is supposed to always converge. However, when migrating a domain with non-shared storage, the post-copy phase starts only once all disks have been migrated to the destination host, and storage migration itself does not use the post-copy approach. As a result, migrating a domain may never finish even though post-copy was requested.

Both storage and memory need to be migrated in a post-copy way to ensure migration always converges.
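For illustration, this is the kind of invocation affected (the domain name and destination URI are placeholders). The memory phase can later be flipped to post-copy, but the non-shared storage phase remains pure pre-copy:

```shell
# Live migration with full copy of non-shared storage, post-copy enabled.
# "demo" and the destination URI are placeholders.
virsh migrate --live --postcopy --copy-storage-all \
    demo qemu+ssh://dst.example.com/system

# In another shell, switch the memory phase to post-copy once migration
# is running. Storage (drive-mirror) is unaffected and stays pre-copy.
virsh migrate-postcopy demo
```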

Version-Release number of selected component (if applicable):

libvirt-1.3.3-1.el7

Comment 1 Jiri Denemark 2016-04-06 15:52:03 UTC
Paolo, you seemed to have an idea what block jobs should be used by libvirt to implement post-copy storage migration. Could you describe your idea in detail?

Comment 3 Paolo Bonzini 2016-04-07 13:23:34 UTC
Sure! It's the opposite of the current NBD flow, which runs the NBD server on the destination and drive-mirror on the source.

Here, the NBD server runs on the source, the qcow2 image on the destination is created with the NBD export as its backing file, and block-stream is used on the destination to do post-copy migration.  Unfortunately you cannot switch from pre-copy to post-copy; you have to start the post-copy phase before doing "cont" on the destination, with no previous copy.

I think this makes it less desirable than post-copy for RAM.
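A rough sketch of that flow, assuming placeholder host names and export names (the QMP commands shown do exist in QEMU, but the exact wiring here is illustrative):

```shell
# Source: export the disk over NBD (QMP commands on the source QEMU):
#   { "execute": "nbd-server-start",
#     "arguments": { "addr": { "type": "inet",
#       "data": { "host": "0.0.0.0", "port": "10809" } } } }
#   { "execute": "nbd-server-add", "arguments": { "device": "disk0" } }

# Destination: create a qcow2 overlay whose backing file is the NBD export.
qemu-img create -f qcow2 \
    -b nbd://src.example.com:10809/disk0 -F raw dst-disk0.qcow2

# Destination: after "cont", pull all data from the backing chain in the
# background, dropping the NBD backing file once it finishes:
#   { "execute": "block-stream", "arguments": { "device": "disk0" } }
```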

Comment 4 Dr. David Alan Gilbert 2016-04-07 13:52:00 UTC
(In reply to Paolo Bonzini from comment #3)
> Sure! It's the opposite of the current NBD flow, which runs the NBD server
> on the destination and drive-mirror on the source.
> 
> Here, the NBD server runs on the source, the qcow2 image is created with the
> NBD server as the backing file, and block-stream is used on the destination
> to do post-copy migration.  Unfortunately you cannot switch from pre-copy to
> postcopy; you have to start the postcopy phase before doing "cont" on the
> destination, with no previous copy.
> 
> I think this makes it less desirable than for RAM.

With this scheme I don't understand how you know when you can start running the destination; I also don't understand the interaction - why can't you do a pre-copy phase first, as with RAM?

Anyway, isn't the easier story here just to use the existing block write throttling to set a block write bandwidth lower than the network bandwidth? Unlike RAM, we've already got a throttle that should be able to limit writes to whatever rate we need.
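For example, using libvirt's existing block I/O tuning (domain and disk names are placeholders, and the 50 MB/s cap is purely illustrative - it just needs to be below the available migration bandwidth):

```shell
# Cap guest writes to vda at ~50 MB/s so that drive-mirror, given more
# network bandwidth than that, is guaranteed to catch up.
virsh blkdeviotune demo vda --write-bytes-sec 52428800 --live

# Clear the throttle again once migration completes (0 = unlimited).
virsh blkdeviotune demo vda --write-bytes-sec 0 --live
```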

Comment 5 Jiri Denemark 2016-04-07 13:56:26 UTC
Apparently (confirmed with Paolo on IRC), we'd have to start block-stream jobs after stopping vCPUs on the source and before starting the destination.

Comment 6 Jiri Denemark 2016-04-07 13:57:59 UTC
That said, I think throttling disk I/O is really a better idea.

What do the OpenStack guys think about it, Daniel?

Comment 7 Daniel Berrangé 2016-04-07 14:21:46 UTC
The desirable thing about post-copy is that it guarantees completion in a finite amount of time without having to do any kind of calculations wrt guest dirtying rate vs constantly changing network bandwidth availability. If you use pre-copy with bandwidth throttling, you have the problem of figuring out what level of throttling to apply in order to ensure the migration completes in a finite, predictable time. This is pretty non-trivial unless you are super conservative and apply a very strict bandwidth limit, which in turn means you're probably slowing the guest down more than it actually needs to be. This is the prime reason OpenStack is much more enthusiastic about post-copy than about throttling guest CPUs with cgroups or QEMU's auto-converge feature.


So from the POV of ease of management, having disks able to support post-copy in the same way as RAM is really very desirable for OpenStack.

Comment 8 Daniel Berrangé 2016-04-07 14:39:02 UTC
I'm thinking about the proposals on qemu-devel wrt extending the NBD server to support live backups, and how the NBD server would expose a fake block allocation bitmap to represent dirty blocks. I think that functionality could be usable as the foundation for doing combined switchable pre+post-copy for disk too.

First, have the NBD server always run on the source host. During the initial pre-copy phase, the NBD client on the target host would do one full pass copying the whole dataset. Thereafter it would loop querying the fake "block allocation bitmap" from the source to get the list of blocks which have been dirtied since the initial copy, and copy those.

When switching to post-copy, it would continue to fetch all remaining dirty blocks, but at any point it could also request a specific block being accessed right now by the VM.
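A toy model of the pre-copy part of that scheme (all names here are invented for illustration - this is not a QEMU or libvirt API), showing the initial full pass followed by repeated dirty-bitmap queries converging at the post-copy switchover point:

```python
class Source:
    """Source-side disk with a fake dirty-block 'allocation bitmap'."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self._dirty = set()

    def write(self, i, value):
        """A guest write on the source: updates data and dirties the block."""
        self.blocks[i] = value
        self._dirty.add(i)

    def take_dirty(self):
        """Return blocks dirtied since the last query, then reset the bitmap."""
        dirty, self._dirty = sorted(self._dirty), set()
        return dirty


def migrate_disk(source, guest_writes=()):
    """One full pre-copy pass, then dirty-block loops until convergence."""
    dest = [None] * len(source.blocks)
    for i in range(len(source.blocks)):      # initial full-dataset pass
        dest[i] = source.blocks[i]
    for i, value in guest_writes:            # guest keeps writing meanwhile
        source.write(i, value)
    while True:                              # converging dirty-block loop
        dirty = source.take_dirty()
        if not dirty:
            break                            # post-copy switchover point
        for i in dirty:
            dest[i] = source.blocks[i]
    return dest
```

In the real scheme the loop would not be guaranteed to empty the bitmap; the point of the post-copy switchover is that any still-dirty blocks can be fetched on demand when the VM faults on them.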

Comment 9 Jiri Denemark 2016-04-07 14:52:19 UTC
So it seems you agree that implementing storage migration using the currently available block-stream job is not something you'd want from libvirt. In that case we need to clone this BZ for QEMU requesting the new functionality.

Comment 10 Daniel Berrangé 2016-04-07 18:12:31 UTC
It would be nice if we could implement it using existing functionality, but from what Paolo describes it doesn't sound like that is possible - you can do pre-copy, or post-copy, but you can't switch from pre to post on the fly, which is what we'd need to be able to do to match the RAM handling. So it seems we likely need new QEMU functionality.

Comment 13 Dr. David Alan Gilbert 2020-05-05 10:01:43 UTC
Given that 1644988 - the QEMU active-mirror block job code - is now done, it would be good to get this one going.

Comment 16 RHEL Program Management 2020-12-15 07:40:50 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 19 RHEL Program Management 2021-06-15 07:31:00 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

