Bug 521799 - rename-restart behavior is broken
Summary: rename-restart behavior is broken
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-09-08 11:34 UTC by Jiri Denemark
Modified: 2010-04-08 16:24 UTC (History)

Fixed In Version: xen-3.0.3-103.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 08:59:04 UTC
Target Upstream Version:


Attachments
Trivial fix to make rename-restart work again (2.26 KB, patch)
2009-09-08 11:34 UTC, Jiri Denemark
Fix a race in rename-restart (6.22 KB, patch)
2010-01-06 14:08 UTC, Jiri Denemark


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2010:0294 normal SHIPPED_LIVE xen bug fix and enhancement update 2010-03-29 14:20:32 UTC

Description Jiri Denemark 2009-09-08 11:34:46 UTC
Created attachment 360058
Trivial fix to make rename-restart work again

Description of problem:

First, "rename-restart" behavior is completely broken because the
preserveForRestart() method of the XendDomainInfo class calls
self._removeVm() instead of self.removeVm(), which results in
AttributeError: XendDomainInfo instance has no attribute '_removeVm'
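
The failure mode described above can be illustrated with a minimal sketch. DomainInfoSketch, preserve_broken, preserve_fixed and try_preserve are hypothetical names made up for this illustration; this is not the actual xend code, only a stand-in showing why calling a method name that is not defined raises AttributeError.

```python
class DomainInfoSketch:
    """Hypothetical stand-in for XendDomainInfo; not the real xend class."""

    def __init__(self):
        self.vm_removed = False

    def removeVm(self):
        # The only removal method defined in this sketch.
        self.vm_removed = True

    def preserve_broken(self):
        # Mirrors the faulty call: no method named _removeVm is defined,
        # so attribute lookup fails before anything is removed.
        self._removeVm()

    def preserve_fixed(self):
        # Mirrors the trivial fix: call the method that actually exists.
        self.removeVm()


def try_preserve(method):
    """Call method and report whether it raised AttributeError."""
    try:
        method()
        return "ok"
    except AttributeError as exc:
        return "AttributeError: %s" % exc
```

Calling try_preserve(d.preserve_broken) on an instance d returns a string starting with "AttributeError", and the instance is left untouched, which matches the observation that the domain is never restarted.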

When a domain is configured for "rename-restart" behavior instead of
"restart", the old instance of the domain is renamed and preserved during
restarts. There is a bug in our xend code which breaks restart_count for
those domains. The counter is incremented for the old instance instead
of the new one. That is, the running instance would seem like it was
never restarted and older instances would have restart_count set to 1.
It works only by accident for domains with "restart" behavior,
because both the old instance and the just created domain share the same
path in xenstore.
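
The counter problem can be sketched with a toy model, assuming xenstore is represented as a plain dict that maps a vm path to its restart_count. The function bump_restart_count, its buggy flag and the example paths are hypothetical and only illustrate the reasoning above; they do not reproduce the xend implementation.

```python
def bump_restart_count(store, old_path, new_path, buggy):
    """Increment restart_count in a dict standing in for xenstore.

    buggy=True models the behavior described in this report (the old
    instance's path is updated); buggy=False models the intended
    behavior (the new instance's path is updated).
    """
    target = old_path if buggy else new_path
    store[target] = store.get(target, 0) + 1
    return store


# "rename-restart": old and new instances live under different paths,
# so the buggy variant leaves the running (new) instance at 0.
renamed = bump_restart_count(
    {"/vm/old-uuid": 0, "/vm/new-uuid": 0},
    "/vm/old-uuid", "/vm/new-uuid", buggy=True)

# "restart": both instances share one path, so the bug is invisible.
shared = bump_restart_count(
    {"/vm/uuid": 0}, "/vm/uuid", "/vm/uuid", buggy=True)
```

In this model the shared-path case ends up with the same value whichever instance is updated, which is why plain "restart" appears to work.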

The bug was found during code inspection when fixing another bug.

Version-Release number of selected component (if applicable):

xen-3.0.3-94 (most likely every version since xen-3.0.3-76)

How reproducible:

100%

Steps to Reproduce:
1. create a domain with on_reboot = "rename-restart"
2. restart the domain
3. check /vm/UUID/xend/restart_count of the restarted domain
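
Step 3 could be scripted roughly as follows. This is a sketch under assumptions: the helper names restart_count_path and read_restart_count are invented here, the domain UUID has to be supplied by the caller, and the xenstore-read command is assumed to be available on the host (it only works where xenstored is running).

```python
import subprocess


def restart_count_path(uuid):
    """Build the xenstore path named in step 3 for a given domain UUID."""
    return "/vm/%s/xend/restart_count" % uuid


def read_restart_count(uuid):
    """Read restart_count through the xenstore-read command line tool.

    Assumes xenstore-read is installed and xenstored is running; raises
    subprocess.CalledProcessError if the key does not exist.
    """
    out = subprocess.check_output(["xenstore-read", restart_count_path(uuid)])
    return int(out.decode().strip())
```

Only restart_count_path can be exercised off a Xen host; read_restart_count needs a running xenstore.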
  
Actual results:

The domain remains shut down.

Expected results:

The domain is successfully restarted and restart_count is incremented on each restart.

Additional info:

Both parts of this bug were introduced by an inaccurate backport of upstream cs 16977 in xen-3.0.3-76. Trivial fix attached.

Comment 6 Jiri Denemark 2010-01-04 09:04:36 UTC
Yes, that's correct, xenpv-1-1 is the renamed guest and it should be in state s. However, you should also see a new running xenpv-1 guest. Value of restart_count should only be increased in the new guest.

As you see the increased value of restart_count, I guess you just issued the second xm li too soon after xm reboot finished. You should wait a bit to give the new domain a chance to appear.

Comment 9 Jiri Denemark 2010-01-04 15:02:21 UTC
Hmm, funny that virsh can see it but xm doesn't. Is it missing in xm output even when you run xm list after you see the guest in virsh list?

Comment 10 Rita Wu 2010-01-05 02:42:18 UTC
(In reply to comment #9)
> Hmm, funny that virsh can see it but xm doesn't. Is it missing in xm output
> even when you run xm list after you see the guest in virsh list?  

Yes, it is only outputted in "virsh list".

Comment 11 Jiri Denemark 2010-01-05 07:45:17 UTC
(In reply to comment #10)
> Yes, it is only outputted in "virsh list".

That's weird. Anyway if the guest is running after rebooting and you can see its console and ssh to it, it's most likely some unrelated issue.

To be sure, could you attach output of xenstore-ls and /var/log/xen/xend.log after you tried listing running guests with virsh list and xm list?

Thanks.

Comment 13 Jiri Denemark 2010-01-05 09:57:32 UTC
xenstore database doesn't seem to be in the best shape. Are you sure you rebooted the host after trying to reproduce this bug on older package version? To be 100% sure there are no leftovers in xenstore, please reboot the host and try it again.

Thanks.

Comment 14 Rita Wu 2010-01-05 10:25:57 UTC
(In reply to comment #13)
> xenstore database doesn't seem to be in the best shape. Are you sure you
> rebooted the host after trying to reproduce this bug on older package version?
> To be 100% sure there are no leftovers in xenstore, please reboot the host and
> try it again.
> 
> Thanks.  

Yes, I'm sure I rebooted xend. And I tried to reboot the host just now, but the result is the same: only virsh list can show the 2 (old and new) domains.

Comment 15 Jiri Denemark 2010-01-05 10:47:26 UTC
Restarting xend is not enough in this case as it doesn't clean xenstore. Rebooting does, that's why I wanted you to reboot the host. Thanks for doing that.

Can you attach xenstore-ls and the output of xm list and virsh list now that you have rebooted the host?

Comment 17 Jiri Denemark 2010-01-06 11:12:20 UTC
There seems to be a race in rename-restart path similar to one we fixed in bug 358693 for normal restart. I'll respin the patch for this bug.

Comment 18 Jiri Denemark 2010-01-06 11:18:16 UTC
Ah, that was the attachment ID. I wanted to mention bug 513604 instead...

Comment 20 Jiri Denemark 2010-01-06 14:08:24 UTC
Created attachment 381996
Fix a race in rename-restart

Comment 27 errata-xmlrpc 2010-03-30 08:59:04 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0294.html

Comment 28 Paolo Bonzini 2010-04-08 15:41:37 UTC
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).

