Bug 741375 - seabios should not expose S3 to guests
Summary: seabios should not expose S3 to guests
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: seabios
Version: 16
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Justin M. Forbes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker RejectedNTH
Depends On:
Blocks: 744077
 
Reported: 2011-09-26 17:36 UTC by Eric Blake
Modified: 2011-10-16 00:56 UTC (History)

Fixed In Version: seabios-0.6.2-3.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 744077 (view as bug list)
Environment:
Last Closed: 2011-10-16 00:56:08 UTC
Type: ---
Embargoed:



Description Eric Blake 2011-09-26 17:36:20 UTC
Description of problem:
When using libvirt to manage a guest, the preferred method for requesting guest shutdown from the host is the virDomainShutdown API (exposed as the 'shutdown' option in virt-manager, as 'virsh shutdown domain' from the shell, etc.).  However, this command works by triggering an ACPI interrupt in the guest.  The default behavior of F16 on an ACPI interrupt is to enter S3 suspend mode, but this is pointless because qemu-kvm immediately re-wakes the system.  The default in F14 was to pop up the interactive restart/shutdown/cancel box with a 60-second timeout that defaulted to shutdown.  As a result, virDomainShutdown in the host causes an F14 guest to shut down cleanly, but has no effect on an F16 guest.

Version-Release number of selected component (if applicable):
gnome-power-manager-3.1.92-1.fc16.x86_64
libvirt-0.9.6-1.fc16.x86_64
qemu-kvm-0.15.0-4.fc16.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Install F16 on bare metal, including libvirt and qemu-kvm
2. Create a default-install F16 guest
3. Try to shut down the guest from the host, such as by using 'virsh shutdown domain' in the host
  
Actual results:
In the guest, the eth0 connection bounces, which is evidence that the ACPI signal was received and that the OS tried to go into S3 but immediately resumed operation.

Expected results:
In the guest, the interactive shutdown box should appear.

Additional info:
Note that the guest itself can trigger shutdown: since KVM does not yet support 3D graphics, the default GNOME session in the guest runs in fallback mode, where clicking on the user name in the top right, then 'Shutdown...', pops up the interactive box, and that interactive box can indeed trigger a shutdown.  However, this is guest-initiated, not host-initiated.

Based on these beta blocker requirements from https://fedoraproject.org/wiki/Fedora_16_Beta_Release_Criteria:

14. The release must boot successfully as a virtual guest in a situation where the virtual host is running the same release (using Fedora's current preferred virtualization technology) 
21. All release-blocking desktops' offered mechanisms (if any) for shutting down, logging out and rebooting must work 

I argue that this is a beta blocker bug, since virDomainShutdown is the preferred and only offered mechanism for cleanly shutting down a guest from the host, and that this is a case of F16 as a self-hosted guest not obeying all the release requirements.

Comment 1 Richard Hughes 2011-09-26 18:01:02 UTC
(In reply to comment #0)
> The default with F14 was to pop up the interactive
> restart/shutdown/cancel interactive box with a 60-second timeout that defaulted
> to shutdown, and thus virDomainShutdown in the host will cause an F14 guest to
> cleanly shutdown, but have no effect on an F16 guest.

So you have to wait 60 seconds for the guest to shut down? That sounds like it's using a hack that used to work in F14 (by coincidence) that no longer works in F16, as the defaults have changed to something specified by UX designers.

> I argue that this is a beta blocker bug, since virDomainShutdown is the
> preferred and only offered mechanism for cleanly shutting down a guest from the
> host...

Then virDomainShutdown is buggy. You can't just expect to "inject" a power button press and hope that the guest shuts down in an ordered way.

If the user changes the behavior of the shutdown button in F14 or F15 to anything other than the default, then it's going to break there too. 

Really the power manager should be taught that it's running as a VM guest and do something sane (where sane is discussed by the UX people). If you provide some sample code in C and the g-s-d maintainers agree then this is probably the best course of action.

Richard.

Comment 2 Eric Blake 2011-09-26 18:18:10 UTC
(In reply to comment #1)
> So you have to wait 60 seconds for the guest to shutdown?

Yes, for out-of-the-box defaults in F14.

> That sounds like it's
> using a hack that used to work in F14 (by co-incidence) than no longer works in
> F16 as the defaults have changed to something specified by UX designers.
> 
> > I argue that this is a beta blocker bug, since virDomainShutdown is the
> > preferred and only offered mechanism for cleanly shutting down a guest from the
> > host...
> 
> Then virDomainShutdown is buggy. You can't just expect to "inject" a power
> button press and hope that the guest shuts down in an ordered way.

virDomainShutdown has _always_ been specified as relying on guest cooperation.  The same is true for Windows guests: if the guest is not configured to react to ACPI, then shutdown won't work.  There have been numerous complaints about how Windows defaults to treating ACPI as a shutdown request if someone is logged in, but ignores it on the initial login screen; all of them have been marked as not a libvirt bug (for example, bug 738553).  In other words, the problem in this bug report is that the default ACPI reaction has changed from something that worked for a clean shutdown to something that now doesn't work, not that ACPI is unreliable as a shutdown mechanism in the first place.

Libvirt _does_ have virDomainDestroy to forcefully shut down a guest, but this is not clean from the guest's perspective (it is the same as yanking the power cord).

There is also talk of adding a guest agent, where the shutdown request can be sent via the agent rather than by overloading ACPI events, but the guest agent is apparently not mature enough yet for default inclusion in F16.

I have no problem with a change that would make default ACPI behavior depend on whether F16 is running as host (use the UX designer's new behavior) or guest (recognize that this is a guest, and therefore S3 is useless, and therefore shutdown is the only thing that makes sense, whether the shutdown is interactive after 60 seconds or instantaneous like the current S3 is instantaneous).

In fact, it may even be worth cloning this against qemu-kvm, to state that qemu should NOT be exposing S3 capabilities to the guest, so that the guest will no longer try to treat ACPI as an S3 request.  But _something_ needs to be done to make the default out-of-the-box behavior nicer.

> 
> If the user changes the behavior of the shutdown button in F14 or F15 to
> anything other than the default, then it's going to break there too. 

Yes, but then that's no longer the default.  The beta-blocker requirement is about sane out-of-the-box defaults, not what happens after the user configures things.  And since libvirt already documents that virDomainShutdown is best-effort and requires guest cooperation (whether via ACPI or via a guest agent command), the host must be prepared for guests that have reconfigured ACPI behavior.  But that doesn't change the question of starting from sane defaults.
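Since virDomainShutdown is best-effort, a host-side script has to confirm for itself that the guest really stopped.  A minimal Python sketch of that idea, assuming the 'virsh' client is on the host's PATH; the names request_shutdown and run, and the timeout values, are hypothetical and exist only for illustration:

```python
import subprocess
import time


def virsh(*args):
    """Run a virsh subcommand and return its stdout as stripped text."""
    result = subprocess.run(
        ["virsh", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def request_shutdown(domain, run=virsh, timeout=120.0, poll=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Ask the guest to shut down, then wait to see whether it cooperates.

    Returns True if the domain reached the 'shut off' state before the
    timeout, False if the guest ignored (or mishandled) the request.
    """
    run("shutdown", domain)  # best-effort request; relies on the guest
    deadline = clock() + timeout
    while True:
        if run("domstate", domain) == "shut off":
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```

A False result only means the guest did not cooperate in time; whether to escalate to 'virsh destroy' (the power-cord equivalent) is left to the caller.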

> 
> Really the power manager should be taught that it's running as a VM guest and
> do something sane (where sane is discussed by the UX people). If you provide
> some sample code in C and the g-s-d maintainers agree then this is probably the
> best course of action.

The 'virt-what' package gives a rough approximation of whether F16 is running as a VM guest.  Also, as I mentioned, it might be possible to teach qemu-kvm to quit advertising S3 to guests, at least until a future Fedora release incorporates a guest agent into guests by default.

Comment 3 Adam Williamson 2011-09-26 18:20:06 UTC
Voting -1 on blocker, this doesn't really hit any of our criteria. The relevant criteria are "The release must boot successfully as a virtual guest in a situation where the virtual host is running the same release (using Fedora's current preferred virtualization technology)" and (arguably) "All release-blocking desktops' offered mechanisms (if any) for shutting down, logging out and rebooting must work" but, hey, it does boot and run, and you can indeed shut down sanely from within the guest. 'Injecting' a shutdown action into the guest from the host is kind of extra credit stuff, to me; it clearly doesn't meet our current criteria and I'm comfortable with not adding a criterion for this.

Comment 4 Adam Williamson 2011-09-26 18:23:29 UTC
"Also, as I mentioned, it might be possible to teach qemu-kvm to quit
advertising S3 to guests"

This seems a correct solution, BTW. GNOME recognizes the system's advertised suspend capabilities: if the system doesn't advertise suspend capability, GNOME offers a Shut Down... option in the menu rather than Suspend, and I expect it powers down on a power button press (rather than suspending). If qemu-kvm is not capable of suspending safely, it should not advertise suspend capabilities. GNOME doesn't appear to be doing anything wrong here.

Comment 5 Tim Flink 2011-09-26 18:29:43 UTC
I'm also -1 beta blocker on this.

It doesn't directly hit our criteria (for the reasons listed in comment 3) and it could be fixed by updates later on.

Comment 6 Adam Williamson 2011-09-26 22:58:16 UTC
Since we have two -1s on this, I felt okay with requesting the RC3 compose with it still open, but I'll wait for another vote before declaring it rejected.

Comment 7 Kamil Páral 2011-09-27 10:39:44 UTC
This is related to my bug 704467.

(In reply to comment #4)
> "Also, as I mentioned, it might be possible to teach qemu-kvm to quit
> advertising S3 to guests"
> 
> this seems a correct solution, BTW. GNOME recognizes the system's advertised
> suspend capabilities: if it doesn't advertise suspend capability it offers a
> Shut Down... option in the menu rather than Suspend and I expect powers down on
> a power button press (rather than suspending). If qemu-kvm is not capable of
> suspending safely it should not advertise suspend capabilities. GNOME doesn't
> appear to be doing anything wrong here.

This sounds like a good way to solve it. -1 blocker from me, +1 NTH (beta/final).

Comment 8 Tim Flink 2011-09-27 16:27:01 UTC
Okay, we've got -3 blocker so I'm moving it to rejected and proposing it as a final NTH.

Comment 9 Richard Hughes 2011-09-27 16:32:42 UTC
(In reply to comment #2)
> The 'virt-what' package is a rough approximation of whether F16 is running as a
> VM guest.

Is there anything in C, and runnable by an unprivileged user that will tell us we're in a VM?

Comment 10 Eric Blake 2011-09-27 17:09:53 UTC
(In reply to comment #9)
> (In reply to comment #2)
> > The 'virt-what' package is a rough approximation of whether F16 is running as a
> > VM guest.
> 
> Is there anything in C, and runnable by an unprivileged user that will tell us
> we're in a VM?

I know that there has been work on virt-what to make it work non-privileged, but I'm not sure whether it meets the needs just yet.

Comment 11 Richard Hughes 2011-09-27 17:20:19 UTC
(In reply to comment #10)
> I know that there has been work on virt-what to make it work non-privileged,
> but not sure if it meets the needs just yet.

There's no timer device or anything specific to kvm I can use? I'm not particularly worried about anything that's not the fedora default personally.

Richard

Comment 12 Richard W.M. Jones 2011-09-27 17:27:46 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Is there anything in C, and runnable by an unprivileged user that will tell us
> > we're in a VM?
> 
> I know that there has been work on virt-what to make it work non-privileged,
> but not sure if it meets the needs just yet.

There is a BZ to make virt-what work as non-root:

https://bugzilla.redhat.com/show_bug.cgi?id=719611

(In reply to comment #11)
> There's no timer device or anything specific to kvm I can use? I'm not
> particularly worried about anything that's not the fedora default personally.

What you might want to do in the meantime is take a look at
the virt-what sources.  It's just a shell script!  Maybe
there will be some ideas in it you can use:

http://git.annexia.org/?p=virt-what.git;a=blob;f=virt-what.in;hb=HEAD
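One check that needs no privileges is the 'hypervisor' CPU flag, which Linux on x86 lists in /proc/cpuinfo when the CPUID hypervisor-present bit is set.  A rough Python sketch of that single heuristic follows; it is not how virt-what itself is structured, the function names are hypothetical, and a C version would follow the same logic:

```python
def has_hypervisor_flag(cpuinfo_text):
    """Return True if any 'flags' line in cpuinfo text lists 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


def running_in_vm(path="/proc/cpuinfo"):
    """Best-effort guess; False may simply mean 'could not tell'."""
    try:
        with open(path, encoding="ascii", errors="replace") as f:
            return has_hypervisor_flag(f.read())
    except OSError:
        return False
```

The flag is x86-specific, says nothing about which hypervisor is in use, and can be hidden by the hypervisor, so it is at best one input to a decision about power-button defaults.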

Comment 13 Adam Williamson 2011-09-30 20:11:14 UTC
Discussed at 2011-09-30 NTH review meeting. We can't see anything in particular that makes this issue NTH; it could be fixed quite well with a post-release update. We suppose if it wasn't fixed at release but was fixed post-release that would mean it would hit F16 live images in VMs, but still, it just seems too trivial to break a freeze for. Note there is a non-frozen period for a few weeks between Beta and Final freeze where this fix could be committed.

Comment 14 Eric Blake 2011-10-01 18:10:19 UTC
I finally figured out how to change this behavior from the default, but it is quite hidden.  Even a change to the GUIs (whether control-center or gnome-tweak-tool) to expose this would be a welcome help.

Install dconf-editor, then find org.gnome.settings-daemon.plugins.power, and change button-power from 'suspend' to either 'interactive' or 'shutdown'.
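The same key can also be changed without a GUI through the 'gsettings' command-line tool.  A small Python sketch that builds and runs that command, assuming 'gsettings' is installed in the guest and restricting itself to the three values named above; power_button_command and set_power_button are hypothetical helper names:

```python
import subprocess

SCHEMA = "org.gnome.settings-daemon.plugins.power"
KEY = "button-power"
# Only the values mentioned in this report; the schema may accept others.
KNOWN_VALUES = ("suspend", "interactive", "shutdown")


def power_button_command(value):
    """Return the gsettings argv that sets the power-button action."""
    if value not in KNOWN_VALUES:
        raise ValueError(f"unexpected button-power value: {value!r}")
    return ["gsettings", "set", SCHEMA, KEY, value]


def set_power_button(value, run=subprocess.check_call):
    """Apply the setting for the user running this script."""
    return run(power_button_command(value))
```

Calling set_power_button("interactive") as the desktop user should have the same effect as the dconf-editor steps, since both write the same key.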

Comment 15 Eric Blake 2011-10-03 15:03:15 UTC
See also bug 736522 for teaching qemu/seabios how to quit advertising S3 to guests.

Comment 16 Fedora Update System 2011-10-06 02:35:00 UTC
seabios-0.6.2-3.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/seabios-0.6.2-3.fc16

Comment 17 Kamil Páral 2011-10-06 06:57:32 UTC
Can somebody please describe in detail what changed in seabios (what is the new expected default behavior) so that we can test it properly?

Comment 18 Eric Blake 2011-10-06 20:09:04 UTC
The seabios change makes it so that guests no longer see S3 advertised.  In your guest, run 'pm-is-supported --suspend' before and after the seabios upgrade; exit status will be 0 (supported) pre-upgrade, and 1 (absent) post-upgrade.
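For anyone scripting the before/after comparison, the exit-status check can be wrapped as follows.  This is only a sketch: it assumes pm-utils (which provides 'pm-is-supported') is installed in the guest, and suspend_advertised and describe are hypothetical names:

```python
import subprocess


def suspend_advertised(run=subprocess.call):
    """Return True if pm-is-supported reports S3 suspend as available.

    Exit status 0 means supported; a non-zero status means it is not.
    """
    return run(["pm-is-supported", "--suspend"]) == 0


def describe(advertised):
    """Map the check result to the state expected before/after the update."""
    if advertised:
        return "S3 advertised (pre-upgrade behavior)"
    return "S3 not advertised (post-upgrade behavior)"
```

Running print(describe(suspend_advertised())) in the guest before and after updating seabios on the host should show the two different messages if the update behaves as described.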

However, while the seabios change is good and working, the overall bug is still present.  It now looks like an F16 guest that has no S3 support, but still has org.gnome.settings-daemon.plugins.power.button-power set to 'suspend', completely ignores ACPI.  This test was also run without 'acpid' installed.  Maybe that means that we ALSO need to install acpid by default, and automatically enable it if we detect that we are in a VM?  Or is something still needed in gnome-power that recognizes ACPI power without S3 as a reason to shut down instead?

Comment 19 Fedora Update System 2011-10-06 21:22:42 UTC
Package seabios-0.6.2-3.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing seabios-0.6.2-3.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-13894
then log in and leave karma (feedback).

Comment 20 Eric Blake 2011-10-06 22:10:45 UTC
I split the remaining issues with F16 defaults still preventing out-of-the-box self-hosted host-initiated guest shutdown into bug 744077, so that the seabios fix is not held up by the solution to the remaining problems.

Comment 21 Fedora Update System 2011-10-16 00:56:08 UTC
seabios-0.6.2-3.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

