1425273 – [Q35] migration failed after hotplug e1000e device

Summary: [Q35] migration failed after hotplug e1000e device

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	qemu-kvm-rhev
Sub Component:
Version:	7.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Paolo Bonzini
QA Contact:	jingzhao
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-02-21 03:09 UTC by jingzhao
Modified:	2017-08-02 03:37 UTC
CC List:	12 users
Fixed In Version:	qemu-kvm-rhev-2.9.0-1.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-08-01 23:44:45 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2017:2392	0	normal	SHIPPED_LIVE	Important: qemu-kvm-rhev security, bug fix, and enhancement update	2017-08-01 20:04:36 UTC

Description jingzhao 2017-02-21 03:09:06 UTC

Description of problem:
[Q35] Migration fails after unplugging the e1000e device

Version-Release number of selected component (if applicable):
kernel-3.10.0-563.el7.x86_64
qemu-kvm-rhev-2.8.0-4.el7.x86_64

How reproducible:
3/3

Steps to Reproduce:
1. Boot guest with e1000e device [1]
(qemu) info network 
net1: index=0,type=nic,model=e1000e,macaddr=9a:6a:6b:6c:6d:6a
 \ dev1: index=0,type=tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown

2. Unplug the e1000e device and then migrate
(qemu) device_del net1 
(qemu) netdev_del dev1 
(qemu) info network 
(qemu) migrate -d tcp:10.66.6.246:5800

Actual results:
migration failed
src status:
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off 
Migration status: failed
total time: 0 milliseconds

dest status:
(qemu) qemu-kvm: Unknown ramblock "", cannot accept migration
qemu-kvm: error while loading state for instance 0x0 of device 'ram'
qemu-kvm: load of migration failed: Invalid argument


Expected results:
migrate successfully

Additional info:
Hot-plugging an e1000e device and then migrating gives the same result as above.

Also, the issue did not reproduce on the pc machine type with an e1000 device (the test steps are the same as above).

[1]
src command:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-spice port=5931,disable-ticketing \
-qmp tcp:0:4445,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-device e1000e,netdev=dev1,mac=9a:6a:6b:6c:6d:6a,id=net1,bus=root.0 \
-netdev tap,id=dev1,vhost=on \
-device ioh3420,id=root.1,slot=2 \
-drive file=/home/test/q35-seabios.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \

dest command:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios1.log,id=seabios \
-vga qxl \
-spice port=5932,disable-ticketing \
-qmp tcp:0:4446,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-device ioh3420,id=root.1,slot=2 \
-drive file=/home/test/q35-seabios.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \
-incoming tcp:0:5800 \


Following is the pc command:
/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-spice port=5931,disable-ticketing \
-qmp tcp:0:4445,server,nowait \
-device e1000,netdev=dev1,mac=9a:6a:6b:6c:6d:6a,id=net1 \
-netdev tap,id=dev1,vhost=on \
-drive file=/home/test/q35-seabios.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bootindex=0 \
-monitor stdio \

Comment 1 jingzhao 2017-02-21 03:14:09 UTC

Also checked the q35 machine type with a block device; the issue did not reproduce.

1. Boot guest with block device [1]
2. Unplug the block device through HMP
(qemu) device_del virtio-disk1
3. Do the local migration
4. Migration succeeds

[1]/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-spice port=5931,disable-ticketing \
-qmp tcp:0:4445,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-drive file=/home/test/block1.qcow2,if=none,id=drive1,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk1,drive=drive1,bus=root.0 \
-device e1000e,netdev=dev1,mac=9a:6a:6b:6c:6d:6a,id=net1 \
-netdev tap,id=dev1,vhost=on \
-device ioh3420,id=root.1,slot=2 \
-drive file=/home/test/q35-seabios.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \

Thanks
Jing Zhao

Comment 2 jinchen 2017-02-21 04:31:17 UTC

With qemu-kvm-rhev-2.6.0-28.el7_3.6.x86_64:

Migration failed after unplugging a virtio-net-pci device
Migration failed after plugging a virtio-net-pci device

Comment 3 jinchen 2017-02-21 06:52:54 UTC

With qemu-kvm-rhev-debuginfo-2.8.0-5.el7.x86_64:

Migration failed after unplugging a virtio-net-pci device
Migration failed after plugging a virtio-net-pci device

Comment 5 Marcel Apfelbaum 2017-02-26 14:55:25 UTC

Hi,

Can you please provide more information:
When you removed the e1000e device, did you have another NIC that the migration process could use?
When you hot-plugged the e1000e device, was it the only NIC in the system?
Can you provide the exact steps for the failed migration with the virtio-net-pci device?
Did you check PC, Q35 or both for virtio-net-pci?

Thanks,
Marcel

Comment 6 jinchen 2017-03-02 03:27:39 UTC

(In reply to Marcel Apfelbaum from comment #5)
> Hi,
> 
> Can you please provide more information:
> When you removed the e1000e device, did you have another NIC so the
> migration process can use?

  I used only one NIC, but when I tried with two NICs the migration still failed.
> When you hot-plugged the e1000e device, was it the only NIC in the system?

  Yes, it was the only NIC in the system. If a NIC is already present in the system before the hot-plug, the migration succeeds.
> Can you provide the exact steps the failed migration for virtio-net-pci
> device?

1 Boot guest with virtio-net-pci device [1]
2 Hot plug virtio-net-pci device and then migrate
(qemu) info network 
hub 0
 \ hub0port1: user.0: index=0,type=user,net=10.0.2.0,restrict=off
 \ hub0port0: e1000.0: index=0,type=nic,model=e1000,macaddr=52:54:00:12:34:56
(qemu) netdev_add tap,vhost=on,id=dev1
(qemu) device_add virtio-net-pci,netdev=dev1,id=net1,mac=9a:6a:6b:6c:6d:6a,bus=root.0
(qemu) info network 
hub 0
 \ hub0port1: user.0: index=0,type=user,net=10.0.2.0,restrict=off
 \ hub0port0: e1000.0: index=0,type=nic,model=e1000,macaddr=52:54:00:12:34:56
net1: index=0,type=nic,model=virtio-net-pci,macaddr=9a:6a:6b:6c:6d:6a
 \ dev1: index=0,type=tap,ifname=tap1,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
(qemu) migrate -d tcp:10.66.4.211:5800
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off 
Migration status: failed
total time: 0 milliseconds

dest status:
(qemu) info network 
net1: index=0,type=nic,model=virtio-net-pci,macaddr=9a:6a:6b:6c:6d:6a
 \ dev1: index=0,type=tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
(qemu) qemu-kvm: Unknown ramblock "0000:00:02.0/e1000.rom", cannot accept migration
qemu-kvm: error while loading state for instance 0x0 of device 'ram'
qemu-kvm: load of migration failed: Invalid argument
red_channel_client_disconnect: rcc=0x7f3d2efae000 (channel=0x7f3d2da5c600 type=2 id=0)
red_channel_client_disconnect: rcc=0x7f3d2dcd0000 (channel=0x7f3d2da4e580 type=4 id=0)

[1]
src command:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-spice port=5931,disable-ticketing \
-qmp tcp:0:4445,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-device ioh3420,id=root.1,slot=2 \
-drive file=/home/demo/1.img-seabios,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \

dest command:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios1.log,id=seabios \
-vga qxl \
-spice port=5932,disable-ticketing \
-qmp tcp:0:4446,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-device virtio-net-pci,netdev=dev1,mac=9a:6a:6b:6c:6d:6a,id=net1,bus=root.0 \
-netdev tap,id=dev1,vhost=on \
-device ioh3420,id=root.1,slot=2 \
-drive file=/home/demo/1.img-seabios,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \
-incoming tcp:0:5800 \

> Did you check PC, Q35 or both for virtio-net-pci?

  Yes. For PC: after hot-plugging or hot-unplugging an e1000e or virtio-net-pci device, the migration fails no matter how many NICs are in the system.
      For Q35: after hot-plugging an e1000e or virtio-net-pci device that is not the only NIC in the system, the migration succeeds. After hot-unplugging a virtio-net-pci device with two NICs in the system, the migration also succeeds.
> Thanks,
> Marcel

Thanks,
jinchen

Comment 7 jinchen 2017-03-02 07:47:33 UTC

Hi,Marcel

  Sorry, I made a mistake: for PC I hot-plugged/unplugged an e1000 device rather than an e1000e device. The result stands: migration fails after both hot-plug and hot-unplug.

Thanks,
jinchen

Comment 8 Marcel Apfelbaum 2017-03-07 14:48:57 UTC

(In reply to jinchen from comment #7)
> Hi,Marcel
> 
>   Sorry,due to my mistakes,for the PC,hot plugged/unplugged e1000 device
> rather than e1000e device,but the results is right,hot plugged/unplugged is
> failed.
> 
> Thanks,
> jinchen

Thank you for the reply. I am having trouble understanding the results.
Please try to fill the following table:

machine - operation | virtio-nic | e1000   | e1000e  | virtio block
-------------------------------------------------------------------------
PC - hotplug        |  ok/fail   | ok/fail |  -----  |   ok/fail
-------------------------------------------------------------------------
PC - hot-unplug     |  ok/fail   | ok/fail |  -----  |   ok/fail
-------------------------------------------------------------------------
Q35 - hotplug       |  ok/fail   |  -----  | ok/fail |   ok/fail
-------------------------------------------------------------------------
Q35 - hot-unplug    |  ok/fail   |  -----  | ok/fail |   ok/fail

Thanks,
Marcel

Comment 9 Marcel Apfelbaum 2017-03-07 14:55:33 UTC

Hi David,

Can you please have a look at the migration command line
and see if the migration parameters are correct with respect
to hot-plug/hot-unplug operations? (e.g. what the destination-side command line should be if we hot-plug/hot-unplug a device before migration starts.)

Thanks,
Marcel

Comment 10 Dr. David Alan Gilbert 2017-03-07 15:12:45 UTC

Jinchen:
  When you do hotplug, you must always specify the bus and address for *all* PCI/PCIe devices both on the commandline and when you hot-add them.
  If you don't specify it, then when you run the destination with a different command line with the unplugged device missing, other devices will change their auto-allocated slot numbers and the migration will be confused.

Please confirm if you can reproduce the bug when specifying all addresses and busses.
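
To illustrate why missing addresses confuse migration, here is a minimal Python sketch. It is a toy model and not QEMU's real allocator: it simply assumes that a device without an explicit address gets the next free slot in command-line order. The function `assign_slots`, the device names, and the starting slot are all hypothetical.

```python
def assign_slots(devices, first_free=2):
    """Toy model: give each device its explicit slot, or the next free one.

    `devices` is a list of (name, addr) pairs where addr is an int or None.
    This is NOT QEMU's allocator, only an illustration of the idea.
    """
    used = {addr for _, addr in devices if addr is not None}
    slots = {}
    next_slot = first_free
    for name, addr in devices:
        if addr is None:
            # Auto-allocate: skip over slots that are already taken.
            while next_slot in used:
                next_slot += 1
            addr = next_slot
            used.add(addr)
        slots[name] = addr
    return slots


# Source: the NIC is present at boot (and unplugged later); the disk keeps its slot.
src = assign_slots([("nic", None), ("disk", None)])
# Destination: started without the NIC, so the disk is auto-assigned another slot.
dst = assign_slots([("disk", None)])

# With explicit addresses the disk lands on the same slot on both sides.
src_fixed = assign_slots([("nic", 3), ("disk", 4)])
dst_fixed = assign_slots([("disk", 4)])

print(src["disk"], dst["disk"])              # slots differ
print(src_fixed["disk"], dst_fixed["disk"])  # slots match
```

In this model the disk ends up on different slots on the two sides unless every device carries an explicit address, which is the mismatch described above.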

Comment 11 Dr. David Alan Gilbert 2017-03-07 17:38:09 UTC

Marcel: There is a bug here - it looks like the RAMBlock associated with the e1000e isn't being deleted.

see the error: (qemu) qemu-kvm: Unknown ramblock "", cannot accept migration

that empty ramblock name is odd.
I added some debug to dump the list of RAMBlock names at the start of migrate and then did:

./x86_64-softmmu/qemu-system-x86_64 -nographic  -device e1000e,id=foo -m 1G -M pc,accel=kvm my.img

booted Linux
device_del foo

Now the e1000e has gone from 'info pci', but the RAMBlock is still there when I print out the list of RAMBlocks at the start of the migrate.

Comment 12 jinchen 2017-03-08 05:35:58 UTC

According to comment 10,with address for *all* PCI/PCIe devices

machine - operation | virtio-nic | e1000   | e1000e  | virtio block
-------------------------------------------------------------------------
PC - hotplug        |   fail     | fail    |  -----  |     fail
-------------------------------------------------------------------------
PC - hot-unplug     |   fail     | fail    |  -----  |     fail
-------------------------------------------------------------------------
Q35 - hotplug       |   fail     |  -----  | fail    |     fail
-------------------------------------------------------------------------
Q35 - hot-unplug    |   fail     |  -----  | fail    |     fail

Comment 13 Laurent Vivier 2017-03-08 19:56:20 UTC

(In reply to Dr. David Alan Gilbert from comment #11)
> Marcel: There is a bug here - it looks like the RAMBlock associated with the
> e1000e isn't being deleted.
> 
> see the error: (qemu) qemu-kvm: Unknown ramblock "", cannot accept migration
> 
> that empty ramblock name is odd.

Empty ramblock name is set by qemu_ram_unset_idstr():

pci_qdev_unrealize()
-> pci_del_option_rom()
   -> vmstate_unregister_ram()
      -> qemu_ram_unset_idstr()
         -> memset(block->idstr, 0, sizeof(block->idstr));

pci_qdev_unrealize() is called on "device_del", so according to the code an empty ROM name is what is expected after unplugging a PCI card. I think the migration code should not send a RAMBlock with an empty name.
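
The symptom and the suggestion can be sketched with a small Python toy model. This is only an illustration under assumptions, not QEMU code: the `RAMBlock` class, the helper functions, and the block names used here are hypothetical stand-ins.

```python
class RAMBlock:
    def __init__(self, idstr):
        self.idstr = idstr


def unplug(block):
    # Mirrors the effect described above: the name is cleared,
    # but the block itself stays in the list.
    block.idstr = ""


def blocks_to_send(blocks, skip_unnamed=False):
    """Names the (toy) source would announce to the destination."""
    return [b.idstr for b in blocks if b.idstr or not skip_unnamed]


def accept(names, known):
    """Toy destination: reject any block name it does not know."""
    for name in names:
        if name not in known:
            return 'Unknown ramblock "%s", cannot accept migration' % name
    return "ok"


pc_ram = RAMBlock("pc.ram")                # illustrative name
rom = RAMBlock("0000:01:00.0/e1000e.rom")  # hypothetical option ROM name
blocks = [pc_ram, rom]
unplug(rom)

known_on_dest = {"pc.ram"}
print(accept(blocks_to_send(blocks), known_on_dest))
print(accept(blocks_to_send(blocks, skip_unnamed=True), known_on_dest))
```

The first call fails on the empty name, while the second passes because unnamed blocks are filtered out. This only models the idea of skipping such blocks; it says nothing about whether that is the right place to fix the problem.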

Comment 14 Dr. David Alan Gilbert 2017-03-08 19:59:39 UTC

(In reply to Laurent Vivier from comment #13)
> (In reply to Dr. David Alan Gilbert from comment #11)
> > Marcel: There is a bug here - it looks like the RAMBlock associated with the
> > e1000e isn't being deleted.
> > 
> > see the error: (qemu) qemu-kvm: Unknown ramblock "", cannot accept migration
> > 
> > that empty ramblock name is odd.
> 
> Empty ramblock name is set by qemu_ram_unset_idstr():
> 
> pci_qdev_unrealize()
> -> pci_del_option_rom()
>    -> vmstate_unregister_ram()
>       -> qemu_ram_unset_idstr()
>          -> memset(block->idstr, 0, sizeof(block->idstr));
> 
> pci_dev_unrealize() is called on the "device_del", so according to the code,
> an empty ROM name is what is expected on unpluging a PCI card. I think
> migration code should not send RAMblock with empty name.

The question though is why the RAMBlock isn't deleted rather than just having its name unset.

There's probably quite a few places we'd have to skip a RAMBlock we wanted to avoid.

Dave

Comment 15 Laurent Vivier 2017-03-09 08:52:09 UTC

This appears in:

commit b0e56e0b63f350691b52d3e75e89bb64143fbeff
Author: Hu Tao <hutao.com>
Date:   Wed Apr 2 15:13:27 2014 +0800

    unset RAMBlock idstr when unregister MemoryRegion
    
    Signed-off-by: Hu Tao <hutao.com>
    Signed-off-by: Paolo Bonzini <pbonzini>

diff --git a/savevm.c b/savevm.c
index da8aa24..7b2c410 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1209,7 +1209,7 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
 
 void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
 {
-    /* Nothing do to while the implementation is in RAMBlock */
+    qemu_ram_unset_idstr(memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK);
 }
 
From
https://lists.nongnu.org/archive/html/qemu-devel/2014-04/msg00282.html

"When hotplug an memdev that was previously plugged and unplugged,
RAMBlock idstr is not cleared and triggers an assert error in
qemu_ram_set_idstr(). This series fixes it."

Comment 16 Laurent Vivier 2017-03-09 11:09:44 UTC

Unplugging the card calls pci_qdev_unrealize(), which unregisters the PCI device memory (with do_pci_unregister_device()).

Then qemu_ram_free() is normally called by memory_region_finalize(). But memory_region_finalize() is not called because obj->ref is not 1 (checked in object_unref()).
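
As a rough illustration of that refcount behaviour, here is a toy Python model. It assumes only what is stated above (finalization happens when the last reference is dropped); the class `Obj` and its methods are hypothetical and do not reproduce QEMU's object model.

```python
class Obj:
    """Toy refcounted object; finalize runs only when the last ref is dropped."""

    def __init__(self):
        self.ref = 1
        self.finalized = False

    def take_ref(self):
        self.ref += 1

    def unref(self):
        # Finalize only if this was the last reference.
        if self.ref == 1:
            self.finalized = True
        self.ref -= 1


clean = Obj()
clean.unref()

leaky = Obj()
leaky.take_ref()   # some other holder never releases this reference
leaky.unref()

print(clean.finalized, leaky.finalized)
```

In the second case the object survives the unref, which in this model corresponds to the memory region, and therefore its RAMBlock, never being freed.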

Comment 17 Laurent Vivier 2017-03-09 12:30:34 UTC

Paolo has fixed the problem with:

http://patchwork.ozlabs.org/patch/736979/

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index b0f429b..6e23493 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -306,7 +306,7 @@  e1000e_init_msix(E1000EState *s)
 static void
 e1000e_cleanup_msix(E1000EState *s)
 {
-    if (msix_enabled(PCI_DEVICE(s))) {
+    if (msix_present(PCI_DEVICE(s))) {
         e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
         msix_uninit(PCI_DEVICE(s), &s->msix, &s->msix);
     }
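
One way to read the change is sketched below in Python. This is a toy model under stated assumptions: it supposes that the device has the MSI-X capability set up while the guest-controlled enable bit is off at unplug time, and that skipping the cleanup leaves a resource held. The thread does not state that this is the exact trigger, and `ToyDevice` and `cleanup` are hypothetical names.

```python
class ToyDevice:
    def __init__(self):
        self.msix_present = True         # capability was set up at init time
        self.msix_enabled = False        # guest-controlled enable bit (assumed off)
        self.msix_resources_held = True  # stands in for whatever init acquired


def cleanup(dev, guard):
    """Release the MSI-X resources only if the chosen guard attribute is true."""
    if getattr(dev, guard):
        dev.msix_resources_held = False


before = ToyDevice()
cleanup(before, "msix_enabled")   # guard used before the patch

after = ToyDevice()
cleanup(after, "msix_present")    # guard used after the patch

print(before.msix_resources_held, after.msix_resources_held)
```

With the "enabled" guard the cleanup is skipped whenever the enable bit is off, while the "present" guard mirrors the condition under which the resources were set up in the first place.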

Comment 18 Marcel Apfelbaum 2017-03-13 13:44:31 UTC

 (In reply to jinchen from comment #12)
> According to comment 10,with address for *all* PCI/PCIe devices
> 
> machine - operation | virtio-nic | e1000   | e1000e  |  virtio block
> -------------------------------------------------------------------------
> PC - hotplug        |   fail     | fail    |  -----  |     fail  
> -------------------------------------------------------------------------
> PC - hot-unplug     |   fail     | fail    |  ----   |     fail 
> -------------------------------------------------------------------------
> Q35 - hotplug       |   fail     |  -----  | fail    |     fail  
> -------------------------------------------------------------------------
> Q35 - hot-unplug    |   fail     |  ----   | fail    |     fail

Can you please test again with this brew build:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12746756

Thanks,
Marcel

Comment 19 jingzhao 2017-03-15 02:31:37 UTC

Hi Marcel

 Also failed with qemu-kvm-rhev-2.8.0-6.el7.test.x86_64 that you provided.

Thanks
Jing

Comment 20 Dr. David Alan Gilbert 2017-03-15 09:07:07 UTC

(In reply to jingzhao from comment #19)
> Hi Marcel
> 
>  Also failed with qemu-kvm-rhev-2.8.0-6.el7.test.x86_64 that you provided.
> 
> Thanks
> Jing

With which error?
What exact command line were you using this time?

Comment 21 jingzhao 2017-03-15 09:55:04 UTC

(In reply to Dr. David Alan Gilbert from comment #20)
> (In reply to jingzhao from comment #19)
> > Hi Marcel
> > 
> >  Also failed with qemu-kvm-rhev-2.8.0-6.el7.test.x86_64 that you provided.
> > 
> > Thanks
> > Jing
> 
> With which error?
> What exact command line were you using this time?

The same error as above:
(qemu) qemu-kvm: Unknown ramblock "", cannot accept migration
qemu-kvm: error while loading state for instance 0x0 of device 'ram'
qemu-kvm: load of migration failed: Invalid argument

the qemu command line used:
[1] src host:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-spice port=5931,disable-ticketing \
-qmp tcp:0:4445,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-device e1000e,netdev=dev1,mac=9a:6a:6b:6c:6d:6a,id=net1,bus=root.0 \
-netdev tap,id=dev1,vhost=on \
-device ioh3420,id=root.1,slot=2 \
-drive file=/mnt/test/q35-seabios.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \
-vnc :1 \

[2] delete net in src host 
(qemu) netdev_del dev1  
(qemu) device_del net1 

[3] In src host:
(qemu) migrate -d tcp:10.66.4.211:5800
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off 
Migration status: failed

[4]dest host:
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-vnc :0 \
-spice port=5931,disable-ticketing \
-qmp tcp:0:4445,server,nowait \
-device ioh3420,id=root.0,slot=1 \
-device ioh3420,id=root.1,slot=2 \
-drive file=/mnt/test/q35-seabios.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \
-vnc :1 \
-incoming tcp:0:5800 \


Thanks
Jing

Comment 22 Dr. David Alan Gilbert 2017-03-15 13:44:17 UTC

I can confirm it still fails with Marcel's rpm, but it does seem to work if I take Paolo's patch and apply it to upstream qemu.
Marcel what was in that build?

Comment 25 Marcel Apfelbaum 2017-03-29 13:50:49 UTC

(In reply to Dr. David Alan Gilbert from comment #22)
> I can confirm it still fails with Marcel's rpm, but it does seem to work if
> I take Paolo's patch and apply it to upstream qemu.
> Marcel what was in that build?

Well, it was supposed to be the latest qemu-kvm-rhev with Paolo's patch applied, now I am not sure anymore...

Thanks,
Marcel

Comment 26 Laurent Vivier 2017-04-06 07:40:02 UTC

Set to post as Paolo's patch has landed in v2.9.0-rc0.

Comment 27 jingzhao 2017-04-26 06:23:09 UTC

1. Reproduced the bz on qemu-kvm-rhev-2.8.0-6.el7.x86_64

2. It also failed on qemu-kvm-rhev-2.9.0-1.el7.x86_64 & kernel-3.10.0-657.el7.x86_64

Following is the detailed info:

1) boot guest with qemu command line [1]

2) unplug e1000e device in the src
(qemu) device_del net1  
(qemu) netdev_del dev1 

3) do the migration in the src
(qemu) migrate -d tcp:10.66.6.246:5800
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off 
Migration status: active
total time: 2002 milliseconds
expected downtime: 300 milliseconds
setup: 13 milliseconds
transferred ram: 68846 kbytes
throughput: 268.58 mbps
remaining ram: 3439744 kbytes
total ram: 4325840 kbytes
duplicate: 204795 pages
skipped: 0 pages
normal: 16729 pages
normal bytes: 66916 kbytes
dirty sync count: 1
(qemu) migrate_set_downtime 1  
(qemu) migrate_set_speed 1G
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off 
Migration status: completed
total time: 9218 milliseconds
downtime: 189 milliseconds
setup: 13 milliseconds
transferred ram: 1331923 kbytes
throughput: 1183.84 mbps
remaining ram: 0 kbytes
total ram: 4325840 kbytes
duplicate: 761749 pages
skipped: 0 pages
normal: 330661 pages
normal bytes: 1322644 kbytes
dirty sync count: 3

4) check the status in dest:
(qemu) red_dispatcher_loadvm_commands: 
qemu-kvm: Unknown savevm section or instance '0000:00:02.0/pcie-root-port' 0
qemu-kvm: load of migration failed: Invalid argument

[1]
/usr/libexec/qemu-kvm \
-M q35 \
-cpu SandyBridge \
-m 4G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-vga qxl \
-vnc :0 \
-qmp tcp:0:4445,server,nowait \
-device pcie-root-port,id=root.0,slot=1 \
-device e1000e,netdev=dev1,mac=9a:6a:6b:6c:6d:6a,id=net1,bus=root.0 \
-netdev tap,id=dev1,vhost=on \
-device pcie-root-port,id=root.1,slot=2 \
-drive file=/home/test/rhel/rhel74.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-blk-pci,id=virtio-disk0,drive=drive0,bus=root.1,bootindex=0 \
-monitor stdio \


Based on the above, changing the bz status to ASSIGNED.

Comment 28 Laurent Vivier 2017-04-26 08:33:53 UTC

(In reply to jingzhao from comment #27)
> 1.Reproduce the bz on qemu-kvm-rhev-2.8.0-6.el7.x86_64
> 
> 2.Also failed on qemu-kvm-rhev-2.9.0-1.el7.x86_64 &
> kernel-3.10.0-657.el7.x86_64
> 
> Following is the detailed info:
> 
> 1) boot guest with qemu command line [1]
...
> 4) check the status in dest:
> (qemu) red_dispatcher_loadvm_commands: 
> qemu-kvm: Unknown savevm section or instance '0000:00:02.0/pcie-root-port' 0
> qemu-kvm: load of migration failed: Invalid argument

You are not testing with the same command line; this looks like a different bug, in pcie, not e1000.

Comment 30 Dr. David Alan Gilbert 2017-04-27 09:28:43 UTC

Hi Jing,
  Please test it with the ioh3420 as per the original bug and make sure the original bug is fixed.
  Please test again with the pcie-root-port and file a separate bug if that fails - I can't reproduce it here.

Comment 31 Dr. David Alan Gilbert 2017-04-27 09:37:27 UTC

Actually, I *can* reproduce this - I'll file a separate bz for it

Comment 32 jingzhao 2017-04-27 09:42:34 UTC

(In reply to Dr. David Alan Gilbert from comment #31)
> Actually, I *can* reproduce this - I'll file a separate bz for it

I have filed a new bz to track the new issue (bz 1446080).

Thanks
Jing

Comment 33 jingzhao 2017-04-27 09:54:01 UTC

(In reply to Dr. David Alan Gilbert from comment #30)
> Hi Jing,
>   Please test it with the ioh3420 as per the original bug and make sure the
> original bug is fixed.
>   Please test again with the pcie-root-port and file a separate bug if that
> fails - I can't reproduce it here.

Hi David

It also failed with the ioh3420 device; the failure message is the same as with the pcie-root-port device. The failure message follows:

(qemu) red_dispatcher_loadvm_commands: 
qemu-kvm: Unknown savevm section or instance '0000:00:02.0/ioh-3240-express-root-port' 0
qemu-kvm: load of migration failed: Invalid argument

Since the failure message is different from the original one, can we consider the bz fixed?


Thanks
Jing

Comment 34 Dr. David Alan Gilbert 2017-04-27 10:38:45 UTC

(In reply to jingzhao from comment #33)
> (In reply to Dr. David Alan Gilbert from comment #30)
> > Hi Jing,
> >   Please test it with the ioh3420 as per the original bug and make sure the
> > original bug is fixed.
> >   Please test again with the pcie-root-port and file a separate bug if that
> > fails - I can't reproduce it here.
> 
> Hi David
> 
> Also failed with ioh3420 device, the failed info is same with the
> pcie-root-port device. following is the failed info
> 
> (qemu) red_dispatcher_loadvm_commands: 
> qemu-kvm: Unknown savevm section or instance
> '0000:00:02.0/ioh-3240-express-root-port' 0
> qemu-kvm: load of migration failed: Invalid argument
> 
> The different failed info, can we conside the bz fixed? 
> 
> 
> Thanks
> Jing

Jing
  You must specify 'addr=' on all PCI and PCIe devices to ensure they keep the same address on source/destination, e.g. for:

-device pcie-root-port,id=root.0,slot=1 \

you need

-device pcie-root-port,id=root.0,slot=1,addr=4 

(I picked 4, but you need to find a free number).
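
A quick way to spot devices that lack an address is sketched below in Python. It is a minimal helper under assumptions: `devices_missing_addr` is a hypothetical name, the sample command line is made up for illustration, and the check simply applies the rule as stated here (every `-device` should carry `addr=`) without any knowledge of the actual PCI topology.

```python
import shlex


def devices_missing_addr(cmdline):
    """Return the -device arguments that have no addr= property."""
    args = shlex.split(cmdline)
    missing = []
    # Pair each argument with the one that follows it.
    for flag, value in zip(args, args[1:]):
        if flag == "-device":
            props = value.split(",")[1:]   # drop the device model name
            if not any(p.startswith("addr=") for p in props):
                missing.append(value)
    return missing


cmd = (
    "qemu-kvm -M q35 "
    "-device pcie-root-port,id=root.0,slot=1 "
    "-device pcie-root-port,id=root.1,slot=2,addr=4"
)
print(devices_missing_addr(cmd))
```

Running the same check on both the source and destination command lines before migrating would flag any device still relying on auto-allocation.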

Comment 35 jingzhao 2017-04-28 01:44:51 UTC

Thanks to David for the help.

Following comment 34, migration succeeds when the PCIe devices are given an addr parameter. So, based on comment 27 and comment 34, the bz is verified on qemu-kvm-rhev-2.9.0-1.el7.x86_64.

Thanks
Jing

Comment 37 errata-xmlrpc 2017-08-01 23:44:45 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392
