Bug 808680 - Guest can't resume from S4 when non-persistent device has been hot plugged
Status: CLOSED DUPLICATE of bug 827499
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Igor Mammedov
QA Contact: Virtualization Bugs
Depends On:
Blocks: 720669 761491 808463
 
Reported: 2012-03-31 03:47 EDT by FuXiangChun
Modified: 2013-02-18 05:20 EST

Doc Type: Bug Fix
Last Closed: 2012-09-18 11:48:24 EDT


Description FuXiangChun 2012-03-31 03:47:57 EDT
Description of problem:
First, hot-plug a vCPU into a Windows 2008 R2 guest, then do S4 inside the guest.
The guest shows a BSOD when it resumes.


Version-Release number of selected component (if applicable):
# rpm -qa|grep qemu
qemu-kvm-0.12.1.2-2.267.el6.x86_64

# uname -r
2.6.32-259.el6.x86_64

guest:
windows2008R2 (datacenter)

How reproducible:
100%

Steps to Reproduce:
1. Start the guest with this command line:
/usr/libexec/qemu-kvm -M rhel6.3.0 -cpu Opteron_G3,-kvmclock --enable-kvm -m 4G -smp 1,cores=4,sockets=40,threads=1,maxcpus=161 -name rhel6.3 -uuid ddcbfb49-3411-1701-3c36-6bdbc00bedbb -rtc base=utc,clock=host,driftfix=slew -boot c -drive file=/dev/vg-qzhang/lv-2k8r2,if=none,id=ide,format=raw,cache=none,werror=stop,rerror=stop,boot=on -device ide-drive,drive=ide,id=drive-ide0-0-0,bootindex=1 -netdev tap,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=32:12:50:a4:32:74 -spice port=5914,disable-ticketing -vga qxl -device sga -chardev socket,id=serial0,path=/var/test3,server,nowait -device isa-serial,chardev=serial0 -monitor unix:/tmp/monitor4,server,nowait -monitor stdio -qmp tcp:0:6667,server,nowait

2. Hot-plug a CPU into the guest from the monitor:
cpu_set 1 online

3. Do S4 and resume the guest.

 
Actual results:
The guest shows a BSOD.

Expected results:
The guest resumes successfully.

Additional info:
If I don't hot-plug a CPU, the guest resumes successfully.
Comment 1 FuXiangChun 2012-03-31 03:48:47 EDT
This is the BSOD dump file:
ftp://10.66.11.212/pub/win2008r2-s4-MEMORY.DMP
Comment 3 Igor Mammedov 2012-04-02 09:48:49 EDT
(In reply to comment #0)
> 3.do s4 and resume guest

Could you show me how you do it?
Comment 4 Igor Mammedov 2012-04-03 09:28:37 EDT
Dave,

Is libvirt capable of resuming a guest from S4 with hot-plugged CPUs (even if they were hot-plugged with --live)?
Comment 5 Dave Allan 2012-04-04 23:26:56 EDT
(In reply to comment #4)
> Is libvirt capable to resume guest from S4 with hot-plugged cpus (even if they
> were --live hot-plugged)?

Michal, can you answer that one?
Comment 6 FuXiangChun 2012-04-05 22:07:40 EDT
(In reply to comment #3)
> (In reply to comment #0)
> > 3.do s4 and resume guest
> 
> Could you show me how you do it?

1. Press Ctrl-Alt-Delete inside the guest and choose Hibernate from the shut down options.
2. After that, restart the guest with the same command line.
Comment 7 Gleb Natapov 2012-04-06 02:49:14 EDT
(In reply to comment #6)
> (In reply to comment #3)
> > (In reply to comment #0)
> > > 3.do s4 and resume guest
> > 
> > Could you show me how you do it?
> 
> 1.press ctrl-alt-delete inside guest and choice Hibernate from shut down
> options.
> 2.after step2, re-start guest with the same command line
Your step 2 is incorrect. The new command line should reflect the new number of CPUs after the hot-plug/unplug operation.
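
A minimal sketch of that adjustment, assuming the CPU count is the first comma-separated field of -smp (as in the reproducer) and that `cpu_set 1 online` leaves the guest with two CPUs. The helper name `with_cpu_count` is made up for illustration, and the command line is shortened to the relevant options.

```python
import shlex


def with_cpu_count(cmdline: str, cpus: int) -> str:
    """Return cmdline with the leading CPU count of -smp replaced by cpus."""
    args = shlex.split(cmdline)
    i = args.index("-smp")  # raises ValueError if there is no -smp option
    _old_count, sep, rest = args[i + 1].partition(",")
    args[i + 1] = f"{cpus}{sep}{rest}"
    return " ".join(shlex.quote(a) for a in args)


# Shortened form of the reproducer command line (assumption: other options unchanged).
original = "/usr/libexec/qemu-kvm -m 4G -smp 1,cores=4,sockets=40,threads=1,maxcpus=161"

# One boot CPU plus the one brought online with cpu_set.
resume = with_cpu_count(original, 2)
print(resume)
```

Only the count changes; cores, sockets, threads and maxcpus stay as they were at first boot.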
Comment 8 Michal Privoznik 2012-04-06 05:32:05 EDT
Igor, I don't think so. There is a fundamental problem with S4. Basically, when a guest enters S4, qemu emits the very same event(s) as for a guest doing a bare shutdown. So it is not possible to tell whether the guest has entered S4 or has been halted. I don't know if it is possible to tell even for real hosts unless you literally watch the monitor. Okay, for S4 initiated from outside it might be possible, but there is still the one initiated from within the guest. Anyway, we need to teach qemu to report that and libvirt to keep the live XML across SHUTDOWN.
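
To illustrate the ambiguity, here is a rough sketch of what a management layer sees. It assumes, per the comment above, that both a plain power-off and a guest-initiated S4 end with the QMP SHUTDOWN event. The event lines are hand-written minimal examples, not captured logs, and `classify_exit` is a hypothetical helper.

```python
import json


def classify_exit(qmp_lines):
    """Guess why the guest went away from raw QMP event lines."""
    events = [json.loads(line).get("event") for line in qmp_lines if line.strip()]
    if "SHUTDOWN" in events:
        # A power-off and a guest-initiated S4 both end up here, so the
        # listener cannot tell whether transient devices must be remembered.
        return "shutdown-or-s4"
    return "still-running-or-unknown"


# Hand-written stand-ins for the two cases (assumption: same event for both).
power_off = ['{"event": "SHUTDOWN"}']
hibernate = ['{"event": "SHUTDOWN"}']

print(classify_exit(power_off) == classify_exit(hibernate))
```

With identical input there is nothing for the listener to branch on, which is why qemu would have to report S4 separately.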
Comment 9 Dave Allan 2012-04-09 09:58:49 EDT
(In reply to comment #8)
> Igor, I don't think so. There is fundamental problem with S4. Basically, when
> guest enters S4, qemu emits the very same event(s) like for guest doing bare
> shutdown. So it is not possible to tell if guest has entered S4 or has been
> halted. I don't know if it is possible to tell even for real hosts unless you
> literally watch monitor. Okay, for S4 initiated from outside it might be
> possible, but there is still the one initiated within guest. Anyway, we need to
> learn qemu to report that and libvirt to keep live XML over SHUTDOWN.

Given that, I think this BZ needs to stay on qemu as an RFE for the s4 reporting and it needs to be cloned to libvirt to understand the s4 event.  Anybody disagree?
Comment 10 Ademar Reis 2012-04-09 10:39:10 EDT
There's too much tech-preview in the s4+cpuhotplug combo. Postponing to 6.4.
Comment 11 Igor Mammedov 2012-04-23 12:04:25 EDT
Adding Luiz to consider the possibility of using the guest agent to fix the issue.

To use the GA we would probably need to implement a suspend-to-disk command, so that libvirt could issue it and, on successful completion of the command, save the hot-plugged devices.

But what do we do when S4 is initiated from the guest using the guest's native mechanism?
Could the guest agent detect it somehow (on Linux and Windows) and notify libvirt about it?
Comment 12 Luiz Capitulino 2012-04-23 15:02:48 EDT
I don't think the agent can do anything about this. Also, notifications are not 100% reliable, as they're racy or can be missed (eg. libvirt connects to a qemu that started entering S4 one second ago).

What's libvirt's behavior when some devices are added, the VM shuts down and is later started again? Does libvirt forget about the added devices, or is it going to start the VM with the newly added devices by default?

Making the added devices part of the VM default config seems to be the easier way to solve this.
Comment 13 Luiz Capitulino 2012-04-23 15:07:43 EDT
Fix bug summary. Let's not mix "bug summary" with "my proposal for a solution".
Comment 14 Igor Mammedov 2012-04-24 04:46:42 EDT
Fix bug summary. It's not only CPU hot-plug that is affected. Other hot-plugged devices are affected as well if they were hot-plugged without the --persistent option.
Comment 15 Igor Mammedov 2012-04-24 05:01:29 EDT
Luiz,

From what I've seen, libvirt forgets about hot-plugged devices when the guest shuts down. And there might be users of the existing non-persistent hot-plug interface. Ditching it is probably not an option.

So we will need a 'good enough' way to notify libvirt about the S4 event.

Supporting a suspend-to-disk command in libvirt+GA might be the one, because if libvirt issues the suspend-to-disk command and then loses the S4 notification, it can still assume that the command executed correctly and preserve the hot-plugged devices for the next guest start on qemu exit.
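
A rough sketch of that bookkeeping, only to make the idea concrete. `DomainState` and its methods are hypothetical and are not libvirt code; the device names are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class DomainState:
    persistent: set = field(default_factory=set)  # devices in the stored config
    transient: set = field(default_factory=set)   # hot-plugged without --persistent
    s4_requested: bool = False

    def request_suspend_to_disk(self):
        # Record the intent when the command is issued, so a lost or late
        # S4 notification does not matter.
        self.s4_requested = True

    def devices_for_next_start(self):
        """Device set to start qemu with after the current process exits."""
        if self.s4_requested:
            return self.persistent | self.transient
        return set(self.persistent)


dom = DomainState(persistent={"cpu0", "net1"}, transient={"cpu1"})
plain = dom.devices_for_next_start()      # ordinary shutdown: cpu1 is dropped
dom.request_suspend_to_disk()
after_s4 = dom.devices_for_next_start()   # manager-initiated S4: cpu1 is kept
```

This only covers S4 requested through the management layer. S4 started inside the guest never sets the flag, which is the open question from comment 11.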
Comment 16 Igor Mammedov 2012-04-24 05:13:30 EDT
According to Gleb (http://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03090.html), we can't rely on ACPI's S4 either. But we can do the best possible and implement both the ACPI and libvirt+GA approaches.
Comment 17 Michal Privoznik 2012-04-25 04:55:07 EDT
There are two different approaches from libvirt's POV when hot-plugging a new device (be it a PCI/USB device, CPU, whatever). The first one is persistent: the device is assigned to the guest every time the guest starts (i.e. every time a qemu process is started). The second one is the so-called transient: the device stays hot-plugged until guest shutdown. On qemu process termination, transient devices are returned to the host and libvirt forgets they were ever attached to the guest. Therefore the next boot doesn't bring them up. And this is the root cause of this bug, I think. If a guest is entering S4 (not a real shutdown, but qemu process termination), we need to remember all assigned devices, even the transient ones. However, I don't think there exists a way of determining that a guest is entering S4 and not just doing bare S4 as long as some OSes really do power down on S4 instead of using DSDT.

On the other hand, it is perfectly possible to detach a device on a real host during S4, and the host OS should be capable of dealing with it. For instance, I can let my host enter S4, detach all USB devices, and resume, and the guest OS has to deal with it (in some way other than a BSOD).
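
The root cause described above can be written down as a set difference. This is an illustrative model only; the function and the device names are invented for the example.

```python
def missing_on_resume(persistent, transient):
    """Devices the hibernated guest still expects but the next qemu start lacks."""
    # The hibernation image was written while both kinds of device were present.
    expected_by_guest = set(persistent) | set(transient)
    # On qemu exit the transient devices are forgotten; only these come back.
    provided_on_restart = set(persistent)
    return expected_by_guest - provided_on_restart


# Reproducer-like case: one transiently hot-plugged vCPU.
gone = missing_on_resume({"cpu0", "ide0", "net1"}, {"cpu1"})
print(gone)
```

Whenever the result is non-empty, the guest resumes into hardware that differs from what it saved, and it is then up to the guest OS whether it copes.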
Comment 18 Amit Shah 2012-04-25 05:54:33 EDT
(In reply to comment #17)
> However, I don't think there exists a way of determining that guest is
> entering S4 and not just doing bare S4 as long as some OSes really do power
> down on S4 instead of using DSDT.

That is right.  No way for the host to know.  If a guest agent exists, it could send a message just before freezing, but this will need guest kernel support (I don't think userspace processes are allowed to do one last thing before freeze).  So this just seems unlikely.

Maybe we'll just have to tell people that transient hot-plugs don't go well with S4 (or, at least, guest-initiated S4).

> On the other hand, it is perfectly possible to detach a device on real host
> during S4 and host OS should be capable to deal with it. For instance, I can
> let my host enter S4, detach all USB devices, resume and guest OS has to deal
> with it (other way than BSOD).

Different OSes will behave differently in such situations, and there's a risk of data loss associated with block devices getting unplugged w/o OS knowledge.
Comment 19 Daniel Berrange 2012-04-25 11:50:39 EDT
> Maybe we'll just have to tell people that transient hot-plugs don't go well
> with S4 (or, at least, guest-initiated S4).

I think this is probably the most practical way forward. 

Using S4 hibernate in the first place with VMs is somewhat of a solution in search of a problem. Hypervisor-initiated save-to-disk avoids this kind of problem, is resilient against poor guest OS ACPI implementations, and doesn't require the guest to have set up a huge swap partition for the saved image (no guest I ever provision has a large enough swap to use S4).

Combine that with the fact that transient hotplug with persistent guests is not a critical feature: most hypervisors don't even offer this as an option, and enterprise apps like oVirt use entirely transient guests, so they don't hit the issue.

So I say just document it as a limitation.
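
For reference, a minimal sketch of the hypervisor-initiated save-to-disk mentioned above, using the libvirt Python bindings' managedSave(), create() and isActive() calls on a domain object. The wrapper `save_and_resume` is a hypothetical helper, and error handling is left out.

```python
def save_and_resume(conn, name):
    """Save guest state from the host side, then start the guest again."""
    dom = conn.lookupByName(name)
    dom.managedSave(0)  # state goes to a libvirt-managed file, qemu exits
    dom.create()        # starting the domain restores from that saved image
    return bool(dom.isActive())
```

With a real connection this would be called as save_and_resume(libvirt.open("qemu:///system"), "rhel6.3"), where the domain name is an assumption taken from the -name option in the reproducer. The guest OS takes no part in the save, so no guest ACPI support or swap space is needed.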
Comment 20 Amit Shah 2012-04-25 12:38:44 EDT
(In reply to comment #19)
> Using S4 hibernate in the first place with VMs is somewhat of a solution in
> search of a problem.

The only advantage worth mentioning so far is that time keeping is stable across long pauses.  With S4, the guest actually knows it's going to pause for a long while.  With save-to-disk, the guest isn't aware, and it could take a while for the guest to recover (and that too depends on the guest type and how it's configured).
Comment 21 Daniel Berrange 2012-04-25 12:45:35 EDT
> The only advantage worth mentioning so far is that time keeping is stable
> across long pauses.

Didn't Xen have a paravirtualized hook to reset the time keeping upon restore? Could that be something we could apply to KVM in a similar way?
Comment 22 Igor Mammedov 2012-04-26 03:22:10 EDT
Maybe we just shouldn't offer S4 at all, and so avoid customers opening bugs in the future?
Comment 23 Luiz Capitulino 2012-04-26 10:13:00 EDT
(In reply to comment #17)

> On the other hand, it is perfectly possible to detach a device on real host
> during S4 and host OS should be capable to deal with it. For instance, I can
> let my host enter S4, detach all USB devices, resume and guest OS has to deal
> with it (other way than BSOD).

I think this is a good point, and makes me wonder if we have tested this scenario well for the devices we support. Maybe the OS can deal with this well enough, but there might be edge cases like vCPU hotplug.

PS: CPU hotplug seems to be an exotic feature even in the non-virtual world, I wouldn't be surprised if OSes really don't support this...
Comment 24 Igor Mammedov 2012-04-27 08:18:51 EDT
(In reply to comment #23)
> (In reply to comment #17)
> 
> > On the other hand, it is perfectly possible to detach a device on real host
> > during S4 and host OS should be capable to deal with it. For instance, I can
> > let my host enter S4, detach all USB devices, resume and guest OS has to deal
> > with it (other way than BSOD).
> 
> I think this is a good point, and makes me wonder if we have tested this
> scenario well for the devices we support. Maybe the OS can deal with this well
> enough, but there might be edge cases like vCPU hotplug.
Sudden disappearance of a hot-plugged PCI NIC/disk doesn't make Linux happy (that's why I renamed the bug from cpu to a general hotplug+s4 issue).
Probably PCI devices/CPUs are expected not to disappear without prior notice.
Comment 25 Gleb Natapov 2012-05-03 04:17:02 EDT
(In reply to comment #24)
> (In reply to comment #23)
> > (In reply to comment #17)
> > 
> > > On the other hand, it is perfectly possible to detach a device on real host
> > > during S4 and host OS should be capable to deal with it. For instance, I can
> > > let my host enter S4, detach all USB devices, resume and guest OS has to deal
> > > with it (other way than BSOD).
> > 
> > I think this is a good point, and makes me wonder if we have tested this
> > scenario well for the devices we support. Maybe the OS can deal with this well
> > enough, but there might be edge cases like vCPU hotplug.
> Sudden disappearance of hot-plugged pci nic/disk doesn't make linux happy
> (that's why I renamed bug from cpu to general hotplug+s4 issue). 
> Probably pci devices/cpus are expected not to disappear without prior notice.

A device that can be unplugged can disappear during S4. This happens all the time when a laptop is suspended in a docking station and resumes outside of it. But Windows does not support CPU hot-unplug at all, so it can't handle it during S4 either.
Comment 27 RHEL Product and Program Management 2012-07-10 04:36:23 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 28 RHEL Product and Program Management 2012-07-10 22:05:11 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 29 Igor Mammedov 2012-07-31 05:18:50 EDT
Upstream qemu fix that allows enabling/disabling S3/S4 at startup:

commit 459ae5ea5ad682c2b3220beb244d4102c1a4e332
Author: Gleb Natapov <gleb@redhat.com>
Date:   Mon Jun 4 14:31:55 2012 +0300
   Add PIIX4 properties to control PM system states.
Comment 30 Igor Mammedov 2012-07-31 05:21:47 EDT
libvirt counterpart: bug 808463
Comment 32 Igor Mammedov 2012-09-18 11:48:24 EDT
Additional related libvirt bug 839661.

*** This bug has been marked as a duplicate of bug 827499 ***
