Bug 557446

Summary: Migration not working on some machines after xen-3.0.3-95.el5 disable paused domain save patch
Product: Red Hat Enterprise Linux 5 Reporter: Michal Novotny <minovotn>
Component: xen Assignee: Miroslav Rezanina <mrezanin>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 5.5 CC: areis, jdenemar, llim, mrezanin, mshao, pbonzini, xen-maint
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: xen-3.0.3-105.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-30 08:59:25 UTC Type: ---
Attachments:
  Description                                              Flags
  Fix domain migration on some machines                    none
  Refresh xend domain cache before migration/save patch    none

Description Michal Novotny 2010-01-21 14:29:22 UTC
Description of problem:
On several machines migration has not been possible since xen-3.0.3-95.el5: it fails with an error saying that a paused guest can't be saved or shut down. It is reproducible only on some machines, not all of them.

Version-Release number of selected component (if applicable):
xen-3.0.3-95.el5
kernel-xen-2.6.18-183.el5

How reproducible:
100% on some machines only, possibly depending on configuration,
i.e. reproducible on the xen-live1.usersys.redhat.com and xen-live2.usersys.redhat.com machines

Steps to Reproduce:
1. start both host A (e.g. xen-live1) and host B (e.g. xen-live2) using shared storage
2. boot a guest on host A
3. try to migrate the guest from host A to host B
  
Actual results:
An error "Can't shutdown/save the domain since the domain is paused;unpause it first if you want to shutdown/save" is thrown.

Expected results:
Domain is migrated successfully.

Additional info:
For migration/save we need to pause the guest first, but after migration the domain stays in the paused state for a while. That's why the error is returned instead of a successful migration.

Comment 2 Michal Novotny 2010-01-21 14:51:45 UTC
Created attachment 385934
Fix domain migration on some machines

I created a patch for this one. As written in comment #0, for migration/save we need to pause the guest first, so a better version of the save-paused-domain patch had to be created. The attached patch still disables save and shutdown of a paused domain but allows migration (i.e. it allows the shutdown that follows the domain suspend issued before migration), on the problematic systems as well.
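
The rule described in this comment can be sketched roughly as follows. This is only an illustration, not the attached patch: the function name, the exception class and the `suspend_issued_for_migration` flag are hypothetical, and only the error text is taken from this report.

```python
# Error text as quoted in this bug report.
PAUSED_MESSAGE = (
    "Can't shutdown/save the domain since the domain is paused;"
    "unpause it first if you want to shutdown/save"
)


class DomainPausedError(Exception):
    """Hypothetical error raised when save/shutdown of a paused domain is refused."""


def check_shutdown_allowed(paused, suspend_issued_for_migration):
    """Refuse save/shutdown of a paused domain, except during migration.

    Both arguments are hypothetical booleans: whether the domain is
    paused, and whether the shutdown follows a suspend that was issued
    as part of a migration.
    """
    if paused and not suspend_issued_for_migration:
        raise DomainPausedError(PAUSED_MESSAGE)
    return True


if __name__ == "__main__":
    print(check_shutdown_allowed(paused=True, suspend_issued_for_migration=True))
    try:
        check_shutdown_allowed(paused=True, suspend_issued_for_migration=False)
    except DomainPausedError as exc:
        print(exc)
```

With this shape a plain save or shutdown of a paused guest is still rejected, while the shutdown that belongs to a migration goes through.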

Michal

Comment 10 Miroslav Rezanina 2010-02-04 12:19:34 UTC
The paused state is probably caused by an out-of-date xend cache.

Comment 11 Miroslav Rezanina 2010-02-08 10:10:45 UTC
Created attachment 389487
Refresh xend domain cache before migration/save patch

As the false alarm is caused by an out-of-date xend cache, we refresh the cache before migration/save.
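
A toy model of the idea, for illustration only. None of the class or method names below come from xend; they are made up for this sketch, and the way the cache goes stale here (a snapshot taken while the guest was still paused during boot) is an assumption.

```python
class FakeHypervisor:
    """Stands in for the real source of truth about domain state."""

    def __init__(self):
        self.paused = {}  # domain name -> bool


class CachedDomainList:
    """Hypothetical cache of domain state that is refreshed only on demand."""

    def __init__(self, hypervisor):
        self.hypervisor = hypervisor
        self.cached_paused = {}

    def refresh(self):
        # Copy the current state from the hypervisor into the cache.
        self.cached_paused = dict(self.hypervisor.paused)

    def save(self, name, refresh_first):
        # The fix described above corresponds to refresh_first=True.
        if refresh_first:
            self.refresh()
        if self.cached_paused.get(name, False):
            raise RuntimeError(
                "Can't shutdown/save the domain since the domain is paused"
            )
        return "saved"


if __name__ == "__main__":
    hv = FakeHypervisor()
    hv.paused["guest"] = True      # guest is paused while it is being built
    cache = CachedDomainList(hv)
    cache.refresh()                # cache snapshot taken at that moment
    hv.paused["guest"] = False     # guest is now running, cache not updated

    try:
        cache.save("guest", refresh_first=False)
    except RuntimeError as exc:
        print("stale cache:", exc)
    print("refreshed cache:", cache.save("guest", refresh_first=True))
```

Without the refresh the check sees the stale "paused" entry and raises the false alarm; with the refresh it sees the running guest and the save proceeds.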

Comment 17 Yewei Shao 2010-02-22 08:11:38 UTC
I can reproduce this bug 100% of the time on my own two systems with the xen-3.0.3-104.el5 package: I tried 10 times, and each time the error message "Can't shutdown/save the domain since the domain is paused;unpause it first if you want to shutdown/save" popped up. With the newest package, xen-3.0.3-105.el5, I also tried 10 times and the error message no longer appeared.

Here are my steps:
(1) Start both host A and host B using shared storage
(2) Boot the guest on host A and wait until the guest is up. Before migrating the guest, do not run any other command, i.e. do not access xend in any other way
(3) Migrate the guest from host A to host B
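
For reference, steps (2) and (3) might be scripted along these lines. This is a sketch under assumptions: the guest config path, guest name and target host are placeholders, and the report does not say exactly which `xm` invocations were used. The helper only builds the command lines; it does not run them.

```python
def reproduction_commands(guest_config, guest_name, target_host):
    """Build the xm command lines for steps (2) and (3).

    All three arguments are placeholders chosen by the caller. No other
    xm command is placed between the two, because the steps above ask
    for no other access to xend before the migration.
    """
    return [
        ["xm", "create", guest_config],              # step (2): boot the guest on host A
        ["xm", "migrate", guest_name, target_host],  # step (3): migrate it to host B
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for argv in reproduction_commands("/etc/xen/guest1", "guest1", "host-b"):
        print(" ".join(argv))
```

To actually run them on host A, each list could be passed to `subprocess.check_call`, waiting for the guest to come up between the two commands without querying xend.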

I can reproduce this problem, and with the newest package the problem no longer exists. I talked with Miroslav Rezanina (mrezanin) on IRC: although I did not test this bug on xen-live1.usersys.redhat.com and xen-live2.usersys.redhat.com (xen-live1.usersys.redhat.com is currently down), he thinks I can set this bug to verified based on my test results, with no further need to test on the xen-live systems. So this bug is verified in xen-3.0.3-105.el5.

Comment 19 errata-xmlrpc 2010-03-30 08:59:25 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0294.html

Comment 20 Paolo Bonzini 2010-04-08 15:45:08 UTC
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).