Bug 236595 - Guest Reboot Fails, 30 Second Shutdown Timeout
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.0
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: Daniel Berrange
Duplicates: 248942
Reported: 2007-04-16 13:41 EDT by Devan Goodwin
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RHEA-2007-0635
Doc Type: Bug Fix
Last Closed: 2007-11-07 12:10:18 EST

Attachments
Don't destroy guests on reboot timeout (1.24 KB, patch)
2007-07-19 15:41 EDT, Daniel Berrange

Description Devan Goodwin 2007-04-16 13:41:19 EDT
When rebooting a guest on sufficiently slow hardware, the guest shuts down but
does not come back up.

Ticket filed with Xen: http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=967

It was suggested that we create this ticket as a blocker for an RHN ticket to
make sure we don't lose track of the issue.

Details from the Xen ticket:

Encountered a problem where attempting to reboot guests with xm or virsh
resulted in the guest being destroyed and not coming back up. Found the
following xend.log entries:

[2007-04-16 11:07:15 xend.XendDomainInfo 2129] DEBUG (XendDomainInfo:940) XendDomainInfo.handleShutdownWatch
[2007-04-16 11:07:15 xend.XendDomainInfo 2129] DEBUG (XendDomainInfo:940) XendDomainInfo.handleShutdownWatch
[2007-04-16 11:07:45 xend.XendDomainInfo 2129] INFO (XendDomainInfo:930) Domain shutdown timeout expired: name=sanjose id=5
[2007-04-16 11:07:45 xend.XendDomainInfo 2129] DEBUG (XendDomainInfo:1463) XendDomainInfo.destroy: domid=5
[2007-04-16 11:07:45 xend.XendDomainInfo 2129] DEBUG (XendDomainInfo:1471) XendDomainInfo.destroyDomain(5)

The shutdown timeout expires exactly 30 seconds after the first call to
handleShutdownWatch. Watching the guest console, it appears the guest needs
just slightly more than 30 seconds to shut down on the hardware in question.

I suspect a hard-coded 30-second timeout, which is likely too short.
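
The 30-second gap can be checked directly from the timestamps in the xend.log
excerpt above. This is a minimal sketch, assuming each entry starts with a
bracketed 'YYYY-MM-DD HH:MM:SS' stamp as in the quoted lines; the helper name
entry_time is made up for illustration and is not part of xend.

```python
from datetime import datetime

# Two entries copied from the xend.log excerpt, with the wrapped lines rejoined.
LOG_LINES = [
    "[2007-04-16 11:07:15 xend.XendDomainInfo 2129] DEBUG (XendDomainInfo:940) "
    "XendDomainInfo.handleShutdownWatch",
    "[2007-04-16 11:07:45 xend.XendDomainInfo 2129] INFO (XendDomainInfo:930) "
    "Domain shutdown timeout expired: name=sanjose id=5",
]


def entry_time(line):
    """Parse the leading timestamp of a log entry (hypothetical helper)."""
    # Characters 1..19 hold 'YYYY-MM-DD HH:MM:SS', just after the opening '['.
    return datetime.strptime(line[1:20], "%Y-%m-%d %H:%M:%S")


elapsed = (entry_time(LOG_LINES[1]) - entry_time(LOG_LINES[0])).total_seconds()
print(elapsed)  # 30.0
```

The log only has one-second resolution, so this shows the destroy happened about
30 seconds after the shutdown watch fired, not the exact sub-second interval.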


How reproducible:

Depends on hardware; the system in question was rlx-0-04.rhndev.redhat.com.
Comment 1 Daniel Berrange 2007-04-17 09:02:14 EDT
I searched for the 'shutdown timeout expired' message and found it in

./python/xen/xend/XendDomainInfo.py

It checks to see if the domain has been shutting down for > SHUTDOWN_TIMEOUT,
and if so kills it.

                if self.shutdownStartTime:
                    timeout = (SHUTDOWN_TIMEOUT - time.time() +
                               self.shutdownStartTime)
                    if timeout < 0:
                        log.info(
                            "Domain shutdown timeout expired: name=%s id=%s",
                            self.info['name'], self.domid)
                        self.destroy()


SHUTDOWN_TIMEOUT is set to '30' at the top of the file. I reckon we need to bump
this up to 60 seconds at least.
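
To illustrate the effect of the constant, here is a standalone sketch of the same
arithmetic as the excerpt above (limit - now + start, destroy when negative). It
is not the xend code itself: the function names, the explicit 'now' argument and
the 'limit' parameter are assumptions added so the check can be run in isolation.

```python
import time

SHUTDOWN_TIMEOUT = 30  # seconds; the value reported at the top of the file


def remaining_shutdown_time(shutdown_start_time, now=None, limit=SHUTDOWN_TIMEOUT):
    """Seconds left before a shutdown counts as timed out (hypothetical helper)."""
    if now is None:
        now = time.time()
    # Same expression as the excerpt: SHUTDOWN_TIMEOUT - time.time() + start
    return limit - now + shutdown_start_time


def would_destroy(shutdown_start_time, now, limit=SHUTDOWN_TIMEOUT):
    """True when the excerpt's 'timeout < 0' branch would be taken."""
    return remaining_shutdown_time(shutdown_start_time, now, limit) < 0


start = 1000.0
print(would_destroy(start, start + 31))            # True: 31 s exceeds the 30 s limit
print(would_destroy(start, start + 31, limit=60))  # False: still within a 60 s limit
```

A guest that needs 31 seconds to shut down is destroyed under the 30-second
limit but would survive a 60-second one, which matches the behaviour reported
in the description.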
Comment 2 Clifford Perry 2007-04-18 08:39:32 EDT
Flagging the bug as proposed for RHEL 5.1. Seems like an easy modification.
Comment 3 RHEL Product and Program Management 2007-04-18 08:45:13 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 4 Daniel Berrange 2007-07-19 14:52:38 EDT
*** Bug 248942 has been marked as a duplicate of this bug. ***
Comment 5 Daniel Berrange 2007-07-19 15:37:42 EDT
Upstream Xen has removed the shutdown timer completely, allowing the admin to
deal with non-responsive guests as they see fit. They can run a 'destroy'
manually if desirable, or take other action.

changeset:   15179:152dc0d812b2
user:        kfraser@localhost.localdomain
date:        Wed May 30 10:06:23 2007 +0100
summary:     xend: Don't destroy domains on shutdown timeout.
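
With the timer gone, any timeout policy moves to the admin side. The following
is a rough sketch of what such a policy could look like, not anything shipped
with Xen: wait_for_shutdown and the is_running callable are hypothetical, and
the caller has to supply its own way of checking whether the guest still exists.

```python
import time


def wait_for_shutdown(is_running, grace_seconds, poll_seconds=1.0,
                      clock=time.time, sleep=time.sleep):
    """Return True if the guest stops within grace_seconds, else False.

    is_running is a caller-supplied callable (hypothetical). Nothing is
    destroyed here; the decision is left to the admin.
    """
    deadline = clock() + grace_seconds
    while is_running():
        if clock() >= deadline:
            return False  # still running after the grace period
        sleep(poll_seconds)
    return True


# Demonstration with a fake clock, so the example runs instantly.
fake_now = [0.0]


def fake_clock():
    return fake_now[0]


def fake_sleep(seconds):
    fake_now[0] += seconds


# A simulated guest that takes 35 seconds to shut down, given a 120 s grace period.
stopped = wait_for_shutdown(lambda: fake_now[0] < 35, 120,
                            clock=fake_clock, sleep=fake_sleep)
print(stopped)  # True
```

If the function returns False, the admin can decide whether to run a manual
'destroy' or keep waiting, which is the choice the upstream change leaves open.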
Comment 6 Daniel Berrange 2007-07-19 15:41:14 EDT
Created attachment 159606
Don't destroy guests on reboot timeout

This patch is a copy of the upstream code, ported to the RHEL-5 tree.
Comment 8 Daniel Berrange 2007-08-27 18:52:57 EDT
Patch applied in:

* Mon Aug 27 2007 Daniel P. Berrange <berrange@redhat.com> - 3.0.3-37.el5
- Don't destroy guest after shutdown timeout (rhbz #236595)
Comment 11 errata-xmlrpc 2007-11-07 12:10:18 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2007-0635.html
