Bug 1040840

Summary: migration failed after device hot-unplug
Product: Red Hat Enterprise Linux 7
Reporter: mazhang <mazhang>
Component: qemu-kvm
Assignee: Luiz Capitulino <lcapitulino>
Status: CLOSED NOTABUG
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.0
CC: acathrow, amit.shah, dgilbert, flang, hhuang, juzhang, michen, quintela, qzhang, rhod, virt-maint
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1068694 (view as bug list)
Environment:
Last Closed: 2014-03-03 14:45:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description mazhang 2013-12-12 08:38:39 UTC
Description of problem:
Migration fails after hot-unplugging the virtio balloon device.

Version-Release number of selected component (if applicable):

Host:
qemu-kvm-rhev-1.5.3-21.el7.x86_64
kernel-3.10.0-57.el7.x86_64

Guest:
rhel7-64/win8.1-32

How reproducible:
always

Steps to Reproduce:
1. Boot the VM on the src host with a memory balloon device:
/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-m 2G \
-smp 4,sockets=2,cores=2,threads=1,maxcpus=16 \
-enable-kvm \
-name win8-32 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-rtc base=localtime,clock=host,driftfix=slew \
-nodefaults \
-monitor stdio \
-qmp tcp:0:6666,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-drive file=/home/rhel7-64.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device virtio-blk-pci,bus=pci.0,addr=0x7,scsi=off,drive=drive-data-disk,id=data-disk \
-vga std \
-vnc :0 \
-device virtio-balloon-pci,bus=pci.0,id=balloon0 \
-netdev tap,id=hostnet0,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:01:01:ef \

2. Hot-unplug the balloon device:
(qemu) device_del balloon0
(qemu) info balloon 
Device 'balloon' has not been activated

3. Start the qemu-kvm process on the dest host *without* the memory balloon device, waiting for incoming migration:
...
-drive file=/mnt/rhel7-64.qcow2,if=none,id=drive-data-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device virtio-blk-pci,bus=pci.0,addr=0x7,scsi=off,drive=drive-data-disk,id=data-disk \
-vga std \
-vnc :0 \
-incoming tcp:0:5800 \
-netdev tap,id=hostnet0,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:01:01:ef \

4. Start migration on the src host:
(qemu) migrate -d tcp:10.66.106.40:5800

Actual results:
Migration failed, qemu-kvm on destination host quit:

QEMU 1.5.3 monitor - type 'help' for more information
(qemu) info status
VM status: paused (inmigrate)
(qemu) Unknown ramblock "0000:00:04.0/virtio-net-pci.rom", cannot accept migration
qemu: warning: error while loading state for instance 0x0 of device 'ram'

Expected results:
Migration succeeds.

Additional info:

Comment 2 langfang 2014-01-02 06:29:19 UTC
Hit the problem when booting the guest with vhost=on and migrating to a destination without "vhost=on".

Host:
3.10.0-64.el7.x86_64
qemu-kvm-rhev-1.5.3-30.el7.x86_64

Guest:
3.10.0-64.el7.x86_64


steps:
1. Boot the guest with "vhost=on".
2. Start the destination in listen mode without "vhost=on":

...-incoming tcp:0:5999
3. Migrate.


results:
...
server,nowait -incoming tcp:0:5999
QEMU 1.5.3 monitor - type 'help' for more information
(qemu) qemu: warning: error while loading state for instance 0x0 of device 'ram'
load of migration failed

Comment 3 Luiz Capitulino 2014-01-02 14:09:57 UTC
(In reply to langfang from comment #2)
> Hit the problem when booting the guest with vhost=on and migrating to a destination without "vhost=on".

But in this case I think this is wrong, as the target guest has to be the same as the source guest.

Comment 4 Luiz Capitulino 2014-01-02 14:15:04 UTC
I did some testing on this and found two interesting things:

1. This also happens with virtio-rng-pci, so it's not specific to virtio-balloon. Would be nice to try with other virtio devices and non-virtio devices too

2. If I start the source guest w/o the virtio-balloon device, wait for the guest to boot and then hotplug the virtio-balloon device and then unplug it, migration works

Comment 5 Luiz Capitulino 2014-02-21 17:10:19 UTC
This is an important one, but right now we're in the blockers phase for RHEL7.0 and this one seems to have always existed, so it's not a regression (see bug 1068694). Moving to RHEL7.1 then.

Comment 6 Luiz Capitulino 2014-02-26 18:40:05 UTC
I'm not sure if I'm missing something here, but I'm under the impression that migration just doesn't support device hotplug/unplug and that looks more like a design limitation than a bug.

I'll provide all details I have below. Juan, could you please review my thinking and give your input?

The root of the problem is that register_savevm() automatically assigns each device a unique "idstr" that contains the device's path in qdev/QOM. This unique "idstr" created by register_savevm() is composed of "qdev-path/device-name", like "0000:00:05.0/virtio-blk". A device's idstr has to match in the source and destination guest in order for migration to work. However, when you play with device hotplug/unplug in the source, you may change device ordering, which changes idstrs and that breaks migration on the destination guest.
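
To make the matching rule concrete, here is a minimal sketch in Python. It is an illustrative model only, not QEMU code: the helper names idstr() and unknown_sections() are made up for this example, and the slot numbers simply mirror the failure reported in this bug.

```python
# Illustrative model only, not QEMU code. The "qdev-path/device-name"
# format comes from the description above; the helpers are hypothetical.

def idstr(qdev_path, name):
    """Compose a savevm section id as "qdev-path/device-name"."""
    return f"{qdev_path}/{name}"


def unknown_sections(source_ids, dest_ids):
    """Return the source section ids that the destination never registered."""
    dest = set(dest_ids)
    return [s for s in source_ids if s not in dest]


# Source: the balloon took the lower slot at boot, so virtio-blk got 05.0.
# Unplugging the balloon later removes only the balloon's own section.
source = [idstr("0000:00:05.0", "virtio-blk")]

# Destination: started without the balloon, so virtio-blk got 04.0 instead.
dest = [idstr("0000:00:04.0", "virtio-blk")]

print(unknown_sections(source, dest))
```

Any id left in that list corresponds to a section the destination cannot load, which is the kind of mismatch behind the "Unknown savevm section or instance" message.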

Here are two use-cases that don't work, all tested with latest upstream qemu (HEAD aa0d1f44887):

A. Starting the destination guest *without* a device that has been hot-unplugged from the source guest (this BZ). Reproducer:

1. Start the source guest with a virtio device:

# ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
  -enable-kvm -m 1G -monitor stdio -device virtio-balloon,id=bal

2. Remove it after the guest boots:

 (qemu) device_del bal

3. Start the destination host:

# ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
  -enable-kvm -m 1G -monitor stdio -incoming tcp:0:4444

4. Migrate the source guest:

 (qemu) migrate tcp:0:4444

5. Destination guest fails:

Unknown savevm section or instance '0000:00:05.0/virtio-blk' 0
load of migration failed

B. Migrating after a new device has been hot-plugged to the source guest. Note that there are two ways of doing this, one could start the destination guest with or without the new device. Doesn't matter, both ways fail. Reproducer:

1. Start the source guest:

# ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
  -enable-kvm -m 1G -monitor stdio

2. Hot-plug new balloon device:

 (qemu) device_add virtio-balloon,id=bal

3. Start the destination host:

# ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
  -enable-kvm -m 1G -monitor stdio -incoming tcp:0:4444 \
  -device virtio-balloon,id=bal

4. Migrate the source guest:

 (qemu) migrate tcp:0:4444

5. The destination fails as in use-case A

What works:

A. Hot-unplugging all hot-plugged devices from the source guest before migrating

B. Starting the destination guest with a device that has been hot-unplugged from the source

Comment 7 Amit Shah 2014-02-27 05:41:05 UTC
(In reply to Luiz Capitulino from comment #6)
> I'm not sure if I'm missing something here, but I'm under the impression
> that migration just doesn't support device hotplug/unplug and that looks
> more like a design limitation than a bug.

No, this is indeed supposed to work.

The unregister_savevm() call in the device's unrealize function ensures that.

> I'll provide all details I have below. Juan, could you please review my
> thinking and give your input?
> 
> The root of the problem is that register_savevm() automatically assigns each
> device a unique "idstr" that contains the device's path in qdev/QOM. This
> unique "idstr" created by register_savevm() is composed of
> "qdev-path/device-name", like "0000:00:05.0/virtio-blk". A device's idstr
> has to match in the source and destination guest in order for migration to
> work. However, when you play with device hotplug/unplug in the source, you
> may change device ordering, which changes idstrs and that breaks migration
> on the destination guest.

Yes, this is a new addition for RHEL7.  In RHEL6, the qdev path wasn't used for the idstr, so we didn't have this problem.

Note that this won't happen when libvirt initializes the devices, or if libvirt is used to start migration.

That makes this a low-priority bug.

> Here are two use-cases that don't work, all tested with latest upstream
> qemu (HEAD aa0d1f44887):
> 
> A. Starting the destination guest *without* a device that has been
> hot-unplugged from the source guest (this BZ). Reproducer:
> 
> 1. Start the source guest with a virtio device:
> 
> # ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
>   -enable-kvm -m 1G -monitor stdio -device virtio-balloon,id=bal
> 
> 2. Remove it after the guest boots:
> 
>  (qemu) device_del bal
> 
> 3. Start the destination host:
> 
> # ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
>   -enable-kvm -m 1G -monitor stdio -incoming tcp:0:4444
> 
> 4. Migrate the source guest:
> 
>  (qemu) migrate tcp:0:4444
> 
> 5. Destination guest fails:
> 
> Unknown savevm section or instance '0000:00:05.0/virtio-blk' 0
> load of migration failed

Right, so virtio-balloon is placed at 0000:00:04.0/virtio-balloon on the src.  On the dest, since virtio-balloon doesn't exist, virtio-blk is placed at 0000:00:04.0/virtio-blk.

One workaround is to start with the exact same cmdline and replay all the hot-plug/unplug sequences on the dest before starting migration.
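
A toy model of the placement shows why replaying the sequence helps. This is only a sketch: the lowest-free-slot policy, the FIRST_FREE_SLOT value and the Bus class are assumptions for illustration, not QEMU's actual allocator.

```python
# Toy model of automatic PCI slot assignment: each new device takes the
# lowest free slot. The policy and FIRST_FREE_SLOT are assumptions made
# for illustration only.

FIRST_FREE_SLOT = 4  # assumed: lower slots are used by built-in devices


class Bus:
    def __init__(self):
        self.slots = {}  # slot number -> device name

    def plug(self, name):
        """Place a device in the lowest free slot and return that slot."""
        slot = FIRST_FREE_SLOT
        while slot in self.slots:
            slot += 1
        self.slots[slot] = name
        return slot

    def unplug(self, name):
        """Remove a device; the remaining devices keep their slots."""
        for slot, dev in list(self.slots.items()):
            if dev == name:
                del self.slots[slot]

    def idstrs(self):
        return sorted(f"0000:00:{slot:02x}.0/{dev}"
                      for slot, dev in self.slots.items())


def boot(devices):
    """Build a bus from a command-line device list, in order."""
    bus = Bus()
    for dev in devices:
        bus.plug(dev)
    return bus


# Source: boot with the balloon, then hot-unplug it.
src = boot(["virtio-balloon", "virtio-blk"])
src.unplug("virtio-balloon")

# Destination started without the balloon: virtio-blk moves down a slot.
bad_dest = boot(["virtio-blk"])

# Destination started with the same cmdline, unplug replayed afterwards.
good_dest = boot(["virtio-balloon", "virtio-blk"])
good_dest.unplug("virtio-balloon")

print(src.idstrs())
print(bad_dest.idstrs())
print(good_dest.idstrs() == src.idstrs())
```

In this model only the destination that replays the unplug ends up with the same idstrs as the source.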

> B. Migrating after a new device has been hot-plugged to the source guest.
> Note that there are two ways of doing this, one could start the destination
> guest with or without the new device. Doesn't matter, both ways fail.

It would matter: if the device is hot-plugged on both src and dest, all the devices would end up getting the same idstrs.

> What works:
> 
> A. Hot-unplugging all hot-plugged devices from the source guest before
> migrating
> 
> B. Starting the destination guest with a device that has been hot-unplugged
> from the source

Yep.

So:

Use the exact same cmdline for both the src and dest, and replay any hot-plug and unplug sequences on the dest before starting migration

OR

Use explicit device ids like libvirt uses, to ensure devices are placed at deterministic locations.
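
As a rough sketch of the second option, a small script can generate the -device arguments with a pinned address for every device, so the layout no longer depends on which other devices are present. The LAYOUT table, the chosen slots and the device_args() helper are hypothetical; bus=, addr= and id= are the same properties the reporter's command line already uses for virtio-blk-pci. Real devices need further properties (drive=, netdev=, mac= and so on) that are left out here.

```python
# Sketch: pin every device to a fixed PCI address so that src and dest
# agree on placement. LAYOUT and device_args() are hypothetical names;
# the slot choices are arbitrary examples.

LAYOUT = {
    "balloon0": ("virtio-balloon-pci", 0x4),
    "net0": ("virtio-net-pci", 0x5),
    "data-disk": ("virtio-blk-pci", 0x7),
}


def device_args(present):
    """Return -device arguments for the listed device ids."""
    args = []
    for dev_id in present:
        model, slot = LAYOUT[dev_id]
        args += ["-device", f"{model},bus=pci.0,addr={slot:#x},id={dev_id}"]
    return args


# Source boots with the balloon; destination omits it after the unplug.
src_args = device_args(["data-disk", "balloon0", "net0"])
dst_args = device_args(["data-disk", "net0"])

print(" ".join(dst_args))
```

The disk and the network device get the same addr= on both sides whether or not the balloon is listed, which is the property the migration needs.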

I'm inclined to WONTFIX this, will let Juan comment.

Comment 8 Dr. David Alan Gilbert 2014-02-27 09:04:37 UTC
I think I'd agree, using explicit device ids is the safe way to do it.
(Whether we could find a way of warning about it is a different matter; we can't just warn on the lack of explicit device ids because it's so common, maybe a warning when hot-plugging on a setup without explicit IDs would be worth it).

Comment 9 Luiz Capitulino 2014-02-27 14:16:36 UTC
Thinking more as a user than a developer, I'd have expected this just to work. But I do agree that if this is known to work this way and if libvirt works around it, then we should just close this as WONTFIX.

Juan, can you please take the final decision here?

Now, a couple of things that may not be important:

(In reply to Amit Shah from comment #7)
> (In reply to Luiz Capitulino from comment #6)
> > I'm not sure if I'm missing something here, but I'm under the impression
> > that migration just doesn't support device hotplug/unplug and that looks
> > more like a design limitation than a bug.
> 
> No, this is indeed supposed to work.

When I say "work" in this context I actually mean "to work out of the box", which may be too naive. You seem to mean that "it does work if you do the right thing".

> The unregister_savevm() call in the device's unrealize function ensures that.

Yes, it ensures that the device being unplugged gets removed from the savevm list.

> > I'll provide all details I have below. Juan, could you please review my
> > thinking and give your input?
> > 
> > The root of the problem is that register_savevm() automatically assigns each
> > device a unique "idstr" that contains the device's path in qdev/QOM. This
> > unique "idstr" created by register_savevm() is composed of
> > "qdev-path/device-name", like "0000:00:05.0/virtio-blk". A device's idstr
> > has to match in the source and destination guest in order for migration to
> > work. However, when you play with device hotplug/unplug in the source, you
> > may change device ordering, which changes idstrs and that breaks migration
> > on the destination guest.
> 
> Yes, this is a new addition for RHEL7.  In RHEL6, the qdev path wasn't used
> for the idstr, so we didn't have this problem.

You're right that this code doesn't exist in RHEL6, but I was able to get the reporter's original problem on RHEL6 (see bug 1068694). Now, I *guess* I may have triggered the problem that the new idstr code fixes *if* you do the right thing (which is what libvirt seems to do).

> Note that this won't happen when libvirt initializes the devices, or if
> libvirt is used to start migration.
> 
> That makes this a low-priority bug.

Agreed.

> > Here are two use-cases that don't work, all tested with latest upstream
> > qemu (HEAD aa0d1f44887):
> > 
> > A. Starting the destination guest *without* a device that has been
> > hot-unplugged from the source guest (this BZ). Reproducer:
> > 
> > 1. Start the source guest with a virtio device:
> > 
> > # ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
> >   -enable-kvm -m 1G -monitor stdio -device virtio-balloon,id=bal
> > 
> > 2. Remove it after the guest boots:
> > 
> >  (qemu) device_del bal
> > 
> > 3. Start the destination host:
> > 
> > # ./qemu -drive file=disks/test.img,if=virtio,cache=none,aio=native \
> >   -enable-kvm -m 1G -monitor stdio -incoming tcp:0:4444
> > 
> > 4. Migrate the source guest:
> > 
> >  (qemu) migrate tcp:0:4444
> > 
> > 5. Destination guest fails:
> > 
> > Unknown savevm section or instance '0000:00:05.0/virtio-blk' 0
> > load of migration failed
> 
> Right, so virtio-balloon is placed at 0000:00:04.0/virtio-balloon on the
> src.  On the dest, since virtio-balloon doesn't exist, virtio-blk is placed
> at 0000:00:04.0/virtio-blk.
> 
> One workaround is to start with the exact same cmdline and replay all the
> hot-plug/unplug sequences on the dest before starting migration.

Right.

> > B. Migrating after a new device has been hot-plugged to the source guest.
> > Note that there are two ways of doing this, one could start the destination
> > guest with or without the new device. Doesn't matter, both ways fail.
> 
> > It would matter: if the device is hot-plugged on both src and dest, all the
> > devices would end up getting the same idstrs.
> 
> > What works:
> > 
> > A. Hot-unplugging all hot-plugged devices from the source guest before
> > migrating
> > 
> > B. Starting the destination guest with a device that has been hot-unplugged
> > from the source
> 
> Yep.
> 
> So:
> 
> Use the exact same cmdline for both the src and dest, and replay any
> hot-plug and unplug sequences on the dest before starting migration
> 
> OR
> 
> Use explicit device ids like libvirt uses, to ensure devices are placed at
> deterministic locations.
> 
> I'm inclined to WONTFIX this, will let Juan comment.

Agreed.

Comment 10 Luiz Capitulino 2014-02-27 14:20:12 UTC
(In reply to Dr. David Alan Gilbert from comment #8)
> I think I'd agree, using explicit device ids is the safe way to do it.
> (Whether we could find a way of warning about it is a different matter; we
> can't just warn on the lack of explicit device ids because it's so common,
> maybe a warning when hot-plugging on a setup without explicit IDs would be
> worth it).

I think we should document this in docs/migration.txt. We should describe how it works currently, why it works like that and what's the proper way to use device hotplug/unplug with migration.

Comment 11 Juan Quintela 2014-03-03 14:30:33 UTC
This can't work.  You can have:
- don't use hot [un]plug, and everything goes well
- be very careful

What is happening here is that, depending on the command line, one device ends up on PCI slot 4 or PCI slot 5. That is guest visible (run lspci and you see that your network device has switched slots). So it is not something that we can really fix. The only real thing we can do is:
- do like libvirt, and just place the devices explicitly in a PCI slot
- don't use hot [un]plug.

The problem here being that the two command lines don't place the devices on the same position because some devices are missing.

I will declare it as a NOTABUG. That this used to work in the past is just a BUG, and anyway this is guest visible, so we can't really migrate (I would suppose that this requires Windows re-activation, but I am not a complete guru on that).

Comment 12 Luiz Capitulino 2014-03-03 14:45:44 UTC
OK, I'm totally convinced now. I think it's worth it to document this upstream, but that's minor. Closing as NOTABUG.

Comment 13 Ronen Hod 2014-03-10 10:29:03 UTC
(In reply to Juan Quintela from comment #11)
> This can't work.  You can have:
> - don't use hot [un]plug, and everything goes well
> - be very careful
> 
> What is happening here is that, depending on the command line, one device
> ends up on PCI slot 4 or PCI slot 5. That is guest visible (run lspci and you
> see that your network device has switched slots). So it is not something that
> we can really fix. The only real thing we can do is:
> - do like libvirt, and just place the devices explicitly in a PCI slot
> - don't use hot [un]plug.
> 
> The problem here being that the two command lines don't place the devices on
> the same position because some devices are missing.
> 
> I will declare it as a NOTABUG. That this used to work in the past is just
> a BUG, and anyway this is guest visible, so we can't really migrate (I
> would suppose that this requires Windows re-activation, but I am not a
> complete guru on that).

BTW, we have the same issue with S4 (naturally, with Windows guests that have to run tests using S4).
When resumed, the previous PCI slot allocation changes due to automatic placement of hotplugged/unplugged devices.
I agree, at least for now: the PCI slot for every device (incl. hotplug) needs to be explicit (the way libvirt works).