Bug 1074913
| Summary: | migration can not finish with 1024k 'remaining ram' left after hotunplug 4 nics |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Reporter: | mazhang <mazhang> |
| Component: | qemu-kvm |
| Assignee: | Dr. David Alan Gilbert <dgilbert> |
| Status: | CLOSED ERRATA |
| QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high |
| Docs Contact: | |
| Priority: | medium |
| Version: | 7.0 |
| CC: | dgilbert, hhuang, huding, jherrman, juli, juzhang, knoel, lmiksik, mazhang, michen, qiguo, qzhang, rbalakri, tdosek, virt-maint |
| Target Milestone: | rc |
| Keywords: | ZStream |
| Target Release: | --- |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Whiteboard: | |
| Fixed In Version: | qemu-kvm-1.5.3-63.el7 |
| Doc Type: | Bug Fix |
| Doc Text: | Previously, the QEMU migration code did not account for the gaps caused by hot unplugged devices and thus expected more memory to be transferred during migrations. As a consequence, a guest migration failed to complete after multiple devices were hot unplugged. In addition, the migration info text afterwards displayed erroneous values for the "remaining ram" item. With this update, QEMU calculates memory after a device has been unplugged correctly, and any subsequent guest migrations proceed as expected. |
| Story Points: | --- |
| Clone Of: | |
| : | 1110189 |
| Environment: | |
| Last Closed: | 2015-03-05 08:04:55 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| CRM: | |
| Verified Versions: | |
| Category: | --- |
| oVirt Team: | --- |
| RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- |
| Target Upstream Version: | |
| Embargoed: | |
| Bug Depends On: | |
| Bug Blocks: | 1076185, 1110189 |
| Attachments: | |
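The Doc Text above attributes the failure to gaps left by hot-unplugged devices being counted as memory that still has to be transferred. The following is a toy model of that accounting error, not QEMU code: the block layout, the function names, and the idea of representing RAM blocks as (offset, length) pairs are all assumptions made for illustration. The 4 kbyte page size is the one implied by the "normal" and "normal bytes" figures quoted later in this report.

```python
PAGE_KB = 4  # page size implied by "normal bytes" / "normal" in this report


def naive_pages(blocks):
    """Count every page up to the highest block end, gaps included.

    This mimics the faulty expectation: pages in gaps are counted
    even though no block exists that could ever send them.
    """
    return max(offset + length for offset, length in blocks) // PAGE_KB


def used_pages(blocks):
    """Count only pages that belong to a block that still exists."""
    return sum(length for _, length in blocks) // PAGE_KB


# Hypothetical layout in kbytes: (offset_kb, length_kb).
layout = [(0, 2048), (2048, 256), (2304, 256), (2560, 512)]

# Unplug the two 256 kbyte blocks; the last block keeps its offset,
# so a 512 kbyte gap remains in the middle of the address range.
after_unplug = [b for b in layout if b[0] not in (2048, 2304)]

# Pages the naive count expects but that can never be transferred.
stuck_kb = (naive_pages(after_unplug) - used_pages(after_unplug)) * PAGE_KB
print(stuck_kb)
```

In this model the two counts agree before any unplug and diverge by exactly the size of the gap afterwards, which is the shape of the symptom reported here: a fixed, non-zero "remaining ram" that never drains.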
Description
mazhang
2014-03-11 08:51:41 UTC
Please specify the QEMU command line on the destination. Does it include the 4 NICs or not?
What happens with 1 NIC unplugged?
Can you re-explain this comment? It is not clear:
> Additional info:
> Less 4 nics hotunpluged not hit this problem.
> RHEL7 guest works well.
Does this mean migrating with the 4 NICs (not unplugged) works?
Thanks.
1. Migration works well with 1, 2, or 3 NICs unplugged.
2. Migration of a RHEL7 guest with 4 NICs unplugged works well.

(In reply to Karen Noel from comment #1)
> Please specify the QEMU command line on the destination. Does it include the
> 4 NICs or not?

Destination command line:

/usr/libexec/qemu-kvm \
    -M pc \
    -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff \
    -m 2G \
    -smp 4,sockets=2,cores=2,threads=1,maxcpus=16 \
    -enable-kvm \
    -name win8 \
    -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
    -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
    -k en-us \
    -rtc base=localtime,clock=host,driftfix=slew \
    -nodefaults \
    -monitor stdio \
    -qmp tcp:localhost:6666,server,nowait \
    -boot menu=on \
    -bios /usr/share/seabios/bios.bin \
    -vga cirrus \
    -vnc :0 \
    -drive file=/home/win8-32.raw,if=none,id=drive-virtio-disk0,format=raw,cache=none,werror=stop,rerror=stop,aio=threads \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
    -incoming tcp:0:5800

Does this only happen with Windows guests or does it also happen with RHEL7 guests?

If you start off with 5 NICs:
a) Does it fail if you remove 4?
b) Does it fail if you remove 5?

Dave

(In reply to Dr. David Alan Gilbert from comment #4)
> Does this only happen with Windows guests or does it also happen with RHEL7
> guests?

A RHEL7 guest also hits this problem.

> If you start off with 5 NICs:
> a) Does it fail if you remove 4?

This scenario failed.

> b) Does it fail if you remove 5?

This also failed, and the "remaining ram" changed to "1280 kbytes".

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: active
total time: 58352 milliseconds
expected downtime: 30 milliseconds
setup: 1 milliseconds
transferred ram: 718901 kbytes
throughput: 17.63 mbps
remaining ram: 1280 kbytes
total ram: 2113872 kbytes
duplicate: 362411 pages
skipped: 0 pages
normal: 166057 pages
normal bytes: 664228 kbytes

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: active
total time: 59160 milliseconds
expected downtime: 30 milliseconds
setup: 1 milliseconds
transferred ram: 719970 kbytes
throughput: 17.60 mbps
remaining ram: 1280 kbytes
total ram: 2113872 kbytes
duplicate: 362411 pages
skipped: 0 pages
normal: 166057 pages
normal bytes: 664228 kbytes

> Dave

Starting the VM with 6 NICs, removing all 6, and then migrating also failed.

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: active
total time: 106287 milliseconds
expected downtime: 30 milliseconds
setup: 1 milliseconds
transferred ram: 728659 kbytes
throughput: 17.58 mbps
remaining ram: 1536 kbytes <----- remaining ram increased.
total ram: 2113872 kbytes
duplicate: 376342 pages
skipped: 0 pages
normal: 152126 pages
normal bytes: 608504 kbytes

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: active
total time: 106863 milliseconds
expected downtime: 30 milliseconds
setup: 1 milliseconds
transferred ram: 729419 kbytes
throughput: 17.58 mbps
remaining ram: 1536 kbytes
total ram: 2113872 kbytes
duplicate: 376342 pages
skipped: 0 pages
normal: 152126 pages
normal bytes: 608504 kbytes

OK, so it sounds like we're losing 256KB for each unplugged NIC. I'll have a look.

Dave

Created attachment 876802
Script that triggers this bug
The attached script triggers it on RHEL7 with either our qemu or upstream 2.0.0-rc0
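The 256KB-per-NIC estimate can be checked against the figures reported in this bug: 1024 kbytes stuck after 4 NICs (from the summary), 1280 kbytes after 5, and 1536 kbytes after 6. The short sketch below only redoes that arithmetic; the variable names are made up for illustration, and the page size is derived from the "normal" and "normal bytes" values in the output above rather than taken from QEMU itself.

```python
# Unplugged NIC count -> stuck "remaining ram" in kbytes, as reported in this bug.
observed_kb = {4: 1024, 5: 1280, 6: 1536}

# Every observation should imply the same per-NIC amount.
per_nic_values = {kb // nics for nics, kb in observed_kb.items()}
per_nic_kb = per_nic_values.pop() if len(per_nic_values) == 1 else None

# Page size implied by the report: "normal bytes" divided by "normal" pages.
PAGE_KB = 664228 // 166057

# Pages left unaccounted for per unplugged NIC.
pages_per_nic = per_nic_kb // PAGE_KB

print(per_nic_kb, PAGE_KB, pages_per_nic)
```

All three observations give the same per-NIC figure, which is consistent with a fixed-size region being lost for each unplugged device rather than an amount that depends on guest activity.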
Fix posted upstream

providing pm_ack+ based on GSS approval

Fix included in qemu-kvm-1.5.3-63.el7

Verified this bug using the following versions:
qemu-kvm-1.5.3-64.el7.x86_64
kernel-3.10.0-128.el7.x86_64

Steps to Verify:
1. Boot a win8-32 guest on the src host

/usr/libexec/qemu-kvm \
    -M pc \
    -cpu SandyBridge,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff \
    -m 2G \
    -smp 4,sockets=2,cores=2,threads=1,maxcpus=16 \
    -enable-kvm \
    -name win8 \
    -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
    -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
    -k en-us \
    -rtc base=localtime,clock=host,driftfix=slew \
    -nodefaults \
    -monitor stdio \
    -qmp tcp:localhost:6666,server,nowait \
    -boot menu=on \
    -bios /usr/share/seabios/bios.bin \
    -vga cirrus \
    -vnc :0 \
    -drive file=/home/win8-32-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
    -netdev tap,id=hostnet0,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,ifname=guest0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:13:10:10,bus=pci.0,addr=0x4 \
    -netdev tap,id=hostnet1,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,ifname=guest1 \
    -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:13:10:11,bus=pci.0,addr=0x5 \
    -netdev tap,id=hostnet2,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,ifname=guest2 \
    -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:13:10:12,bus=pci.0,addr=0x6 \
    -netdev tap,id=hostnet3,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,ifname=guest3 \
    -device virtio-net-pci,netdev=hostnet3,id=net3,mac=52:54:00:13:10:13,bus=pci.0,addr=0x7

2. Hot-unplug four NICs

# telnet localhost 6666
Trying ::1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 3, "minor": 5, "major": 1}, "package": " (qemu-kvm-1.5.3-60.el7)"}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute": "device_del", "arguments": {"id": "net0"}}
{"return": {}}
{"timestamp": {"seconds": 1403250008, "microseconds": 416150}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net0/virtio-backend"}}
{"timestamp": {"seconds": 1403250008, "microseconds": 416302}, "event": "DEVICE_DELETED", "data": {"device": "net0", "path": "/machine/peripheral/net0"}}
{"execute": "device_del", "arguments": {"id": "net1"}}
{"return": {}}
{"timestamp": {"seconds": 1403250012, "microseconds": 251351}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net1/virtio-backend"}}
{"timestamp": {"seconds": 1403250012, "microseconds": 251507}, "event": "DEVICE_DELETED", "data": {"device": "net1", "path": "/machine/peripheral/net1"}}
{"execute": "device_del", "arguments": {"id": "net2"}}
{"return": {}}
{"timestamp": {"seconds": 1403250017, "microseconds": 624422}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net2/virtio-backend"}}
{"timestamp": {"seconds": 1403250017, "microseconds": 624588}, "event": "DEVICE_DELETED", "data": {"device": "net2", "path": "/machine/peripheral/net2"}}
{"execute": "device_del", "arguments": {"id": "net3"}}
{"return": {}}
{"timestamp": {"seconds": 1403250021, "microseconds": 721016}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net3/virtio-backend"}}
{"timestamp": {"seconds": 1403250021, "microseconds": 721165}, "event": "DEVICE_DELETED", "data": {"device": "net3", "path": "/machine/peripheral/net3"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet1"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet2"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet3"}}
{"return": {}}

3. Do the migration

(qemu) migrate -d tcp:10.66.9.152:5800

4. Check the migration status in the src qemu-kvm

Actual results:
Migration finishes normally; remaining ram is 0 kbytes.

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 57862 milliseconds
downtime: 32 milliseconds
setup: 10 milliseconds
transferred ram: 663613 kbytes
throughput: 98.01 mbps
remaining ram: 0 kbytes
total ram: 2113872 kbytes
duplicate: 452422 pages
skipped: 0 pages
normal: 164586 pages
normal bytes: 658344 kbytes
(qemu) info status
VM status: paused (postmigrate)

Additional info:
I also tested the following scenarios; migration finishes normally and remaining ram is 0 kbytes.
1). Boot a win8-32 guest, hot-unplug 5/6 NICs, then do the migration
2). Boot a rhel7 guest, hot-unplug 4/5/6 NICs, then do the migration

Tested this bug on an AMD host using the following versions:
qemu-kvm-rhev-2.1.0-3.el7ev.preview.x86_64
kernel-3.10.0-142.el7.x86_64

Steps to Test:
1. Boot a win8-32 guest on the src host

/usr/libexec/qemu-kvm \
    -M pc \
    -cpu Opteron_G2,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff \
    -m 2G \
    -smp 4,sockets=2,cores=2,threads=1,maxcpus=16 \
    -enable-kvm \
    -name win8 \
    -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
    -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
    -k en-us \
    -rtc base=localtime,clock=host,driftfix=slew \
    -nodefaults \
    -monitor stdio \
    -qmp tcp:localhost:6666,server,nowait \
    -boot menu=on \
    -bios /usr/share/seabios/bios.bin \
    -vga cirrus \
    -vnc :0 \
    -drive file=/mnt/win8-32-virtio.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writethrough,werror=stop,rerror=stop,aio=threads \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
    -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:13:10:10,bus=pci.0,addr=0x4 \
    -netdev tap,id=hostnet1,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:13:10:11,bus=pci.0,addr=0x5 \
    -netdev tap,id=hostnet2,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:13:10:12,bus=pci.0,addr=0x6 \
    -netdev tap,id=hostnet3,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet3,id=net3,mac=52:54:00:13:10:13,bus=pci.0,addr=0x7

2. Hot-unplug four NICs

# telnet localhost 6666
Trying ::1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 3, "minor": 5, "major": 1}, "package": " (qemu-kvm-1.5.3-60.el7)"}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute": "device_del", "arguments": {"id": "net0"}}
{"return": {}}
{"timestamp": {"seconds": 1403250008, "microseconds": 416150}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net0/virtio-backend"}}
{"timestamp": {"seconds": 1403250008, "microseconds": 416302}, "event": "DEVICE_DELETED", "data": {"device": "net0", "path": "/machine/peripheral/net0"}}
{"execute": "device_del", "arguments": {"id": "net1"}}
{"return": {}}
{"timestamp": {"seconds": 1403250012, "microseconds": 251351}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net1/virtio-backend"}}
{"timestamp": {"seconds": 1403250012, "microseconds": 251507}, "event": "DEVICE_DELETED", "data": {"device": "net1", "path": "/machine/peripheral/net1"}}
{"execute": "device_del", "arguments": {"id": "net2"}}
{"return": {}}
{"timestamp": {"seconds": 1403250017, "microseconds": 624422}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net2/virtio-backend"}}
{"timestamp": {"seconds": 1403250017, "microseconds": 624588}, "event": "DEVICE_DELETED", "data": {"device": "net2", "path": "/machine/peripheral/net2"}}
{"execute": "device_del", "arguments": {"id": "net3"}}
{"return": {}}
{"timestamp": {"seconds": 1403250021, "microseconds": 721016}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net3/virtio-backend"}}
{"timestamp": {"seconds": 1403250021, "microseconds": 721165}, "event": "DEVICE_DELETED", "data": {"device": "net3", "path": "/machine/peripheral/net3"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet1"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet2"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet3"}}
{"return": {}}

3. Do the migration

(qemu) migrate -d tcp:10.66.9.152:5800

4. Check the migration status in the src qemu-kvm

Actual results:
Migration finishes normally; remaining ram is 0 kbytes.

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 20641 milliseconds
downtime: 3 milliseconds
setup: 22 milliseconds
transferred ram: 668477 kbytes
throughput: 268.57 mbps
remaining ram: 0 kbytes
total ram: 2113872 kbytes
duplicate: 415898 pages
skipped: 0 pages
normal: 165881 pages
normal bytes: 663524 kbytes

Additional info:
I also tested the following scenario; migration finishes normally and remaining ram is 0 kbytes.
1). Boot a win8-32 guest, hot-unplug 5/6 NICs, then do the migration

Tested this bug on an Intel host using the following versions:
qemu-kvm-rhev-2.1.0-3.el7ev.preview.x86_64
kernel-3.10.0-140.el7.x86_64

Steps to Test:
1. Boot a rhel7.1 guest on the src host

/usr/libexec/qemu-kvm \
    -M pc \
    -cpu SandyBridge \
    -m 2G \
    -smp 4,sockets=2,cores=2,threads=1,maxcpus=16 \
    -enable-kvm \
    -name win8 \
    -uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
    -smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
    -k en-us \
    -rtc base=localtime,clock=host,driftfix=slew \
    -nodefaults \
    -monitor stdio \
    -qmp tcp:localhost:6666,server,nowait \
    -boot menu=on \
    -bios /usr/share/seabios/bios.bin \
    -vga cirrus \
    -vnc :0 \
    -drive file=/mnt/rhel7_1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writethrough,werror=stop,rerror=stop,aio=threads \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
    -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:13:10:10,bus=pci.0,addr=0x4 \
    -netdev tap,id=hostnet1,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:13:10:11,bus=pci.0,addr=0x5 \
    -netdev tap,id=hostnet2,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:13:10:12,bus=pci.0,addr=0x6 \
    -netdev tap,id=hostnet3,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet3,id=net3,mac=52:54:00:13:10:13,bus=pci.0,addr=0x7

2. Hot-unplug four NICs

# telnet localhost 6666
Trying ::1...
Connected to localhost.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 3, "minor": 5, "major": 1}, "package": " (qemu-kvm-1.5.3-60.el7)"}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute": "device_del", "arguments": {"id": "net0"}}
{"return": {}}
{"timestamp": {"seconds": 1403250008, "microseconds": 416150}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net0/virtio-backend"}}
{"timestamp": {"seconds": 1403250008, "microseconds": 416302}, "event": "DEVICE_DELETED", "data": {"device": "net0", "path": "/machine/peripheral/net0"}}
{"execute": "device_del", "arguments": {"id": "net1"}}
{"return": {}}
{"timestamp": {"seconds": 1403250012, "microseconds": 251351}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net1/virtio-backend"}}
{"timestamp": {"seconds": 1403250012, "microseconds": 251507}, "event": "DEVICE_DELETED", "data": {"device": "net1", "path": "/machine/peripheral/net1"}}
{"execute": "device_del", "arguments": {"id": "net2"}}
{"return": {}}
{"timestamp": {"seconds": 1403250017, "microseconds": 624422}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net2/virtio-backend"}}
{"timestamp": {"seconds": 1403250017, "microseconds": 624588}, "event": "DEVICE_DELETED", "data": {"device": "net2", "path": "/machine/peripheral/net2"}}
{"execute": "device_del", "arguments": {"id": "net3"}}
{"return": {}}
{"timestamp": {"seconds": 1403250021, "microseconds": 721016}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/net3/virtio-backend"}}
{"timestamp": {"seconds": 1403250021, "microseconds": 721165}, "event": "DEVICE_DELETED", "data": {"device": "net3", "path": "/machine/peripheral/net3"}}
{"execute": "netdev_del", "arguments": {"id": "hostnet0"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet1"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet2"}}
{"return": {}}
{"execute": "netdev_del", "arguments": {"id": "hostnet3"}}
{"return": {}}

3. Do the migration

(qemu) migrate -d tcp:10.66.9.152:5800

4. Check the migration status in the src qemu-kvm

Actual results:
Migration finishes normally; remaining ram is 0 kbytes.

(qemu) info migrate
capabilities: xbzrle: off x-rdma-pin-all: off auto-converge: off zero-blocks: off
Migration status: completed
total time: 20641 milliseconds
downtime: 3 milliseconds
setup: 22 milliseconds
transferred ram: 668477 kbytes
throughput: 268.57 mbps
remaining ram: 0 kbytes
total ram: 2113872 kbytes
duplicate: 415898 pages
skipped: 0 pages
normal: 165881 pages
normal bytes: 663524 kbytes

Additional info:
I also tested the following scenario; migration finishes normally and remaining ram is 0 kbytes.
1). Boot a rhel7.1 guest, hot-unplug 5/6 NICs, then do the migration

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0349.html
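The pass/fail check used in the verification steps above (status "completed" with 0 kbytes remaining, versus status "active" with a constant non-zero remainder) could be automated along the following lines. This is a minimal sketch that parses the text printed by the monitor's `info migrate` command as it appears in this report; the function names and the two sample strings are illustrative assumptions, and other QEMU versions may format the output differently.

```python
import re


def parse_info_migrate(text):
    """Extract the status and remaining RAM from 'info migrate' text output."""
    status = re.search(r"Migration status:\s*(\w+)", text)
    remaining = re.search(r"remaining ram:\s*(\d+)\s*kbytes", text)
    return {
        "status": status.group(1) if status else None,
        "remaining_kb": int(remaining.group(1)) if remaining else None,
    }


def migration_ok(text):
    """True only when migration completed and nothing is left to transfer."""
    info = parse_info_migrate(text)
    return info["status"] == "completed" and info["remaining_kb"] == 0


# Abbreviated samples built from values quoted in this report.
STUCK = "Migration status: active\nremaining ram: 1280 kbytes\n"
FIXED = "Migration status: completed\nremaining ram: 0 kbytes\n"

print(parse_info_migrate(STUCK))
print(migration_ok(FIXED))
```

Checking both fields matters here because the failing case never reports an error: the migration simply stays "active" with the same remainder on every poll.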