Bug 1506494
| Summary: | Use pause-before-switchover migration capability | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jiri Denemark <jdenemar> |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | zhe peng <zpeng> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.5 | CC: | chayang, dgilbert, fjin, juzhang, knoel, michen, qizhu, qzhang, rbalakri, virt-bugs, virt-maint, xfu, xianwang, yafu, yalzhang |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-3.9.0-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1497120 | Environment: | |
| Last Closed: | 2018-04-10 10:59:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1497120 | ||
| Bug Blocks: | |||
|
Description
Jiri Denemark
2017-10-26 08:28:41 UTC
This is now implemented upstream by
commit b5ec33b832d8e3703cddc6598314de24344999c3
Refs: v3.8.0-243-gb5ec33b83
Author: Jiri Denemark <jdenemar>
AuthorDate: Fri Oct 20 09:17:09 2017 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Thu Oct 26 10:36:02 2017 +0200
qemu: Add support for migrate-continue QMP command
Signed-off-by: Jiri Denemark <jdenemar>
commit 6addde24bee54f2502a942b6cc4a5e61f5e6ba4a
Refs: v3.8.0-244-g6addde24b
Author: Jiri Denemark <jdenemar>
AuthorDate: Fri Oct 20 10:11:32 2017 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Thu Oct 26 10:36:02 2017 +0200
qemu: Add pause-before-switchover migration capability
This new capability enables a pause before device state serialization so
that we can finish all block jobs without racing with the end of the
migration. The pause is indicated by "pre-switchover" state. Once we're
done QEMU enters "device" migration state.
This patch just defines the new capability and QEMU migration states and
their mapping to our job states.
Signed-off-by: Jiri Denemark <jdenemar>
commit 32c29f10db3f7e847508564d8d91848f41f43abb
Refs: v3.8.0-245-g32c29f10d
Author: Jiri Denemark <jdenemar>
AuthorDate: Fri Oct 20 10:12:21 2017 +0200
Commit: Jiri Denemark <jdenemar>
CommitDate: Thu Oct 26 10:36:02 2017 +0200
qemu: Enabled pause-before-switchover migration capability
QEMU identified a race condition between the device state serialization
and the end of storage migration. Both QEMU and libvirt needs to be
updated to fix this.
Our migration work flow is modified so that after starting the migration
we to wait for QEMU to enter "pre-switchover", "postcopy-active", or
"completed" state. Once there, we cancel all block jobs as usual. But if
QEMU is in "pre-switchover", we need to resume the migration afterwards
and wait again for the real end (either "postcopy-active" or
"completed" state).
Old QEMU will just enter either "postcopy-active" or "completed"
directly, which is still correctly handled even by new libvirt. The
"pre-switchover" state will only be entered if QEMU supports it and the
pause-before-switchover capability was enabled. Thus all combinations of
libvirt and QEMU will work, but only new QEMU with new libvirt will
avoid the race condition.
Signed-off-by: Jiri Denemark <jdenemar>
I can't reproduce this bug.
test with build:
libvirt-3.9.0-5.el7.x86_64
qemu-kvm-rhev-2.10.0-9.el7.x86_64
check the work flow:
The precopy flow is:
active->pre-switchover->device->completed
The postcopy flow is:
active->pre-switchover->postcopy-active->completed
s1: pre-copy migration
# virsh migrate --live rhel7.3 qemu+ssh://$target_host/system --verbose
Migration: [100 %]
check source libvirtd.log
2017-12-20 08:10:51.870+0000: 1712: info : qemuMonitorIOProcess:439 : QEMU_MONITOR_IO_PROCESS: mon=0x7fe65c000fb0 buf={"return": [{"state": false, "capability": "xbzrle"}, {"state": false, "capab ility": "rdma-pin-all"}, {"state": false, "capability": "auto-converge"}, {"state": false, "capability": "zero-blocks"}, {"state": false, "capability": "compress"}, {"state": true, "capability": "events"}, {"state": false, "capability": "postcopy-ram"}, {"state": false, "capability": "x-colo"}, {"state": false, "capability": "release-ram"}, {"state": false, "capability": "return-path"}, {"state": false, "capability": "pause-before-switchover"}], "id": "libvirt-2"}
-----
can see default capability "pause-before-switchover" is disable
-----
2017-12-20 08:10:51.870+0000: 1782: debug : qemuDomainCheckMigrationCapabilities:10741 : Found migration capability: 'pause-before-switchover'
-----
can found "pause-before-switchover"
-----
.....
2017-12-20 08:11:47.010+0000: 1714: debug : qemuMonitorSetMigrationCapability:3962 : capability=pause-before-switchover, state=1
2017-12-20 08:11:47.010+0000: 1714: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe65c000fb0 msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]},"id":"libvirt-20"}
.....
2017-12-20 08:11:53.711+0000: 1712: debug : qemuMonitorJSONIOProcessLine:193 : Line [{"timestamp": {"seconds": 1513757513, "microseconds": 711606}, "event": "MIGRATION", "data": {"status": "pre-switchover"}}]
2017-12-20 08:11:53.711+0000: 1712: debug : virJSONValueToString:1914 : result={"status":"pre-switchover"}
2017-12-20 08:11:53.711+0000: 1712: debug : qemuProcessHandleMigrationStatus:1644 : Migration of domain 0x7fe66437e190 rhel7.3 changed state to pre-switchover
-----
change to pre-switchover
-----
2017-12-20 08:11:53.712+0000: 1712: info : qemuMonitorIOProcess:439 : QEMU_MONITOR_IO_PROCESS: mon=0x7fe65c000fb0 buf={"return": {"expected-downtime": 300, "status": "pre-switchover", "setup-time ": 7, "total-time": 6691, "ram": {"total": 1183662080, "postcopy-requests": 0, "dirty-sync-count": 2, "page-size": 4096, "remaining": 8675328, "mbps": 939.060971, "transferred": 788080284, "duplicate": 97166, "dirty-pages-rate": 43584, "skipped": 0, "normal-bytes": 785670144, "normal": 191814}}, "id": "libvirt-38"}
2017-12-20 08:11:53.712+0000: 1714: debug : qemuMonitorJSONCommandWithFd:298 : Send command '{"execute":"migrate-continue","arguments":{"state":"pre-switchover"},"id":"libvirt-39"}' for write with FD -1
2017-12-20 08:11:53.712+0000: 1714: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe65c000fb0 msg={"execute":"migrate-continue","arguments":{"state":"pre-switchover"},"id":"libvirt-39"}
s2: post-copy migration
# virsh migrate --live rhel7.3 qemu+ssh://$target_host/system --postcopy --postcopy-after-precopy --verbose
Migration: [100 %]
2017-12-20 08:56:54.821+0000: 1716: debug : qemuMonitorSetMigrationCapability:3962 : capability=postcopy-ram, state=1
4175 2017-12-20 08:56:54.821+0000: 1716: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c000f740 msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"postcopy-ram","state":true}]},"id":"libvirt-27"}
2017-12-20 08:56:54.821+0000: 1712: info : qemuMonitorIOWrite:544 : QEMU_MONITOR_IO_WRITE: mon=0x7fe6c000f740 buf={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability": "postcopy-ram","state":true}]},"id":"libvirt-27"}
4320 2017-12-20 08:56:54.822+0000: 1716: debug : qemuMonitorSetMigrationCapability:3962 : capability=pause-before-switchover, state=1
2017-12-20 08:56:54.822+0000: 1716: debug : qemuMonitorJSONCommandWithFd:298 : Send command '{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]},"id":"libvirt-28"}' for write with FD -1
2017-12-20 08:56:54.822+0000: 1716: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c000f740 msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]},"id":"libvirt-28"}
2017-12-20 08:56:54.823+0000: 1712: info : qemuMonitorIOWrite:544 : QEMU_MONITOR_IO_WRITE: mon=0x7fe6c000f740 buf={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]},"id":"libvirt-28"}
2017-12-20 08:57:01.400+0000: 1714: debug : qemuDomainMigrateStartPostCopy:13690 : Starting post-copy
2017-12-20 08:57:01.400+0000: 1714: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c000f740 msg={"execute":"migrate-start-postcopy","id":"libvirt-46"}
2017-12-20 08:57:01.457+0000: 1712: info : qemuMonitorIOProcess:439 : QEMU_MONITOR_IO_PROCESS: mon=0x7fe6c000f740 buf={"timestamp": {"seconds": 1513760221, "microseconds": 457307}, "event": "MIGRATION", "data": {"status": "pre-switchover"}}
-----
start postcopy, then hit the pre-switchover
-----
2017-12-20 08:57:01.457+0000: 1712: debug : virJSONValueToString:1914 : result={"status":"pre-switchover"}
2017-12-20 08:57:01.457+0000: 1712: debug : qemuMonitorEmitMigrationStatus:1599 : mon=0x7fe6c000f740, status=pre-switchover
2017-12-20 08:57:01.457+0000: 1712: debug : qemuProcessHandleMigrationStatus:1644 : Migration of domain 0x7fe66437e190 rhel7.3 changed state to pre-switchover
2017-12-20 08:57:01.457+0000: 1716: debug : qemuMigrationCompleted:1534 : Migration paused before switchover
2017-12-20 08:57:01.458+0000: 1712: info : qemuMonitorIOProcess:439 : QEMU_MONITOR_IO_PROCESS: mon=0x7fe6c000f740 buf={"return": {"expected-downtime": 1567, "status": "pre-switchover", "setup-time": 7, "total-time": 6625, "ram": {"total": 1183662080, "postcopy-requests": 0, "dirty-sync-count": 2, "page-size": 4096, "remaining": 68734976, "mbps": 979.37469, "transferred": 780253359, "duplicate": 100746, "dirty-pages-rate": 46863, "skipped": 0, "normal-bytes": 777826304, "normal": 189899}}, "id": "libvirt-47"}
2017-12-20 08:57:01.458+0000: 1716: debug : qemuMonitorMigrateContinue:4277 : status=pre-switchover
2017-12-20 08:57:01.458+0000: 1716: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c000f740 msg={"execute":"migrate-continue","arguments":{"state":"pre-switchover"},"id":"libvirt-48"}
2017-12-20 08:57:01.458+0000: 1712: info : qemuMonitorIOWrite:544 : QEMU_MONITOR_IO_WRITE: mon=0x7fe6c000f740 buf={"execute":"migrate-continue","arguments":{"state":"pre-switchover"},"id":"libvirt-48"}
2017-12-20 08:57:01.459+0000: 1712: debug : qemuProcessHandleMigrationStatus:1644 : Migration of domain 0x7fe66437e190 rhel7.3 changed state to postcopy-active
2017-12-20 08:57:01.459+0000: 1716: debug : qemuMigrationCompleted:1544 : Migration switched to post-copy
s3: storage migration
# virsh migrate --live rhel7.3 qemu+ssh://$target_host/system --copy-storage-all --verbose
Migration: [100 %]
check source libvirtd log
2017-12-20 09:32:32.853+0000: 1715: debug : qemuMonitorDriveMirror:3344 : device=drive-virtio-disk0, file=nbd:$target_host:49153:exportname=drive-virtio-disk0, format=raw, bandwidth=9223372036853727232, granularity=0, buf_size=0, flags=0x2
2017-12-20 09:32:32.853+0000: 1715: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c8011460 msg={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:$target_host:49153:exportname=drive-virtio-disk0","speed":9223372036853727232,"sync":"full","mode":"existing","format":"raw"},"id":"libvirt-42"}
2017-12-20 09:32:32.853+0000: 1712: info : qemuMonitorIOWrite:544 : QEMU_MONITOR_IO_WRITE: mon=0x7fe6c8011460 buf={"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"nbd:$target_host:49153:exportname=drive-virtio-disk0","speed":9223372036853727232,"sync":"full","mode":"existing","format":"raw"},"id":"libvirt-42"}
2017-12-20 09:33:55.225+0000: 1715: debug : qemuMonitorSetMigrationCapability:3962 : capability=pause-before-switchover, state=1
2017-12-20 09:33:55.225+0000: 1715: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c8011460 msg={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]},"id":"libvirt-213"}
2017-12-20 09:33:55.225+0000: 1712: info : qemuMonitorIOWrite:544 : QEMU_MONITOR_IO_WRITE: mon=0x7fe6c8011460 buf={"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]},"id":"libvirt-213"}
2017-12-20 09:34:01.965+0000: 1712: debug : qemuProcessHandleMigrationStatus:1644 : Migration of domain 0x7fe66437e190 rhel7.3 changed state to pre-switchover
2017-12-20 09:34:01.965+0000: 1715: debug : qemuMigrationCompleted:1534 : Migration paused before switchover
2017-12-20 09:34:01.966+0000: 1712: info : qemuMonitorIOProcess:439 : QEMU_MONITOR_IO_PROCESS: mon=0x7fe6c8011460 buf={"return": {"expected-downtime": 300, "status": "pre-switchover", "setup-time": 7, "total-time": 6730, "ram": {"total": 1183662080, "postcopy-requests": 0, "dirty-sync-count": 2, "page-size": 4096, "remaining": 29597696, "mbps": 942.506621, "transferred": 792048355, "dupl icate": 96197, "dirty-pages-rate": 44124, "skipped": 0, "normal-bytes": 789639168, "normal": 192783}}, "id": "libvirt-232"}
2017-12-20 09:34:01.967+0000: 1712: info : qemuMonitorIOWrite:544 : QEMU_MONITOR_IO_WRITE: mon=0x7fe6c8011460 buf={"execute":"block-job-cancel","arguments":{"device":"drive-virtio-disk0"},"id":"libvirt-233"}
2017-12-20 09:34:01.985+0000: 1715: info : qemuMonitorSend:1064 : QEMU_MONITOR_SEND_MSG: mon=0x7fe6c8011460 msg={"execute":"migrate-continue","arguments":{"state":"pre-switchover"},"id":"libvirt-236"}
2017-12-20 09:34:02.249+0000: 1712: info : qemuMonitorIOProcess:439 : QEMU_MONITOR_IO_PROCESS: mon=0x7fe6c8011460 buf={"timestamp": {"seconds": 1513762442, "microseconds": 248806}, "event": "MIGRATION", "data": {"status": "completed"}}
Do some free testing, no issues found, move to verified. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:0704 |