Description of problem:
Migration consists of copying guest RAM to the new host. After the first copy, the migration code loops back to the beginning and re-copies all pages modified since the previous pass. The hope is that the list of pages which must be copied shrinks with each pass over the memory range. Guests with a large number of memory writes can delay the migration, and even stall it indefinitely, if memory is written faster than it is copied. Currently, vdsm does not differentiate between a migration that is not progressing due to high guest memory changes and one that is not progressing due to a hang in the memory copy.

Version-Release number of selected component (if applicable):
vdsm-4.10.2-1.6.el6.x86_64

Expected results:
Print a warning message when migration is not progressing due to high guest memory changes.
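To illustrate the convergence problem described above, here is a minimal sketch of the pre-copy loop. It is a toy model, not vdsm or QEMU code: the function name, the parameters and the constant page rates are all assumptions made for illustration.

```python
def simulate_precopy(total_pages, copy_rate, dirty_rate, max_passes=30):
    """Return the number of pages left to copy after each pass.

    Toy model (assumption): the guest dirties pages at a constant
    dirty_rate while the host copies them at a constant copy_rate,
    both in pages per second.
    """
    remaining = total_pages
    history = []
    for _ in range(max_passes):
        # A pass takes remaining / copy_rate seconds, and the guest
        # dirties dirty_rate pages per second meanwhile. Integer math
        # keeps the result exact. The guest cannot dirty more pages
        # than it has, hence the cap at total_pages.
        remaining = min(total_pages, remaining * dirty_rate // copy_rate)
        history.append(remaining)
        if remaining == 0:
            break  # converged: nothing left to re-copy
    return history


if __name__ == "__main__":
    # Copy faster than the guest writes: the dirty set shrinks each pass.
    print(simulate_precopy(1000, copy_rate=100, dirty_rate=50))
    # Guest writes faster than the copy: the dirty set never shrinks.
    print(simulate_precopy(1000, copy_rate=100, dirty_rate=200)[:5])
```

In the second call the remaining count stays pinned at the full guest size, which is the "stuck" case this report is about.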
Proposed patch: http://gerrit.ovirt.org/12557
Why is the current state not enough? The log already has Migration Progress: %s seconds elapsed, %s%% of data processed ... and the % of data processed can be seen jumping or stalling.
It is not enough. Migration can also stall because of network load, interruptions on the destination host, and so forth. In these cases, dataRemaining will be equal to smallest_dataRemaining. We should be able to identify when migration is not progressing due to high memory changes inside the guest (dataRemaining > smallest_dataRemaining). Btw, after some time trying to migrate in this scenario, the progress used to go over 100% of data processed, but ffdc10c0 changed the progress to always be less than 100%, so now the situation cannot be identified at all.
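The distinction drawn in this comment can be sketched as follows. This is only an illustration of the comparison between dataRemaining and smallest_dataRemaining, not the code from the proposed patch; the class name, method name and returned labels are hypothetical.

```python
class MigrationProgressTracker:
    """Classify each progress sample by comparing the current
    dataRemaining with the smallest value seen so far.

    Hypothetical helper: names and return values are assumptions.
    """

    def __init__(self):
        self.smallest_data_remaining = None

    def update(self, data_remaining):
        smallest = self.smallest_data_remaining
        if smallest is None or data_remaining < smallest:
            # New low-water mark: the copy is making progress.
            self.smallest_data_remaining = data_remaining
            return "progressing"
        if data_remaining > smallest:
            # More data left than at the best point so far: the guest
            # is dirtying memory faster than it is being copied.
            return "guest-dirtying"
        # Unchanged since the best point: the copy itself is stalled
        # (network load, destination host interruption, ...).
        return "stalled"


if __name__ == "__main__":
    tracker = MigrationProgressTracker()
    for sample in (1000, 800, 800, 900, 700):
        print(sample, tracker.update(sample))
```

A caller could log a warning whenever the result is "guest-dirtying", which is the case the expected results ask for.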
Merged as 95789edc988072202787729c3ff4e99ec95afcb6
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0886.html