Bug 618569

Summary: RHEL6 PV guest hangs there after migration failed
Product: Red Hat Enterprise Linux 6
Reporter: YangGuang <gyang>
Component: kernel
Assignee: Andrew Jones <drjones>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: medium
Version: 6.0
CC: drjones, leiwang, llim, mrezanin, mshao, pbonzini, qguan, qwan, rwu, stbechto, syeghiay, xen-maint, xinsun, yuzhang, yuzhou
Target Milestone: beta
Keywords: TestOnly
Target Release: ---
Hardware: All
OS: Linux
Whiteboard: xen
Doc Type: Bug Fix
Last Closed: 2012-04-18 07:55:40 UTC
Bug Depends On: 497080    
Bug Blocks: 523117, 767187    
Attachments:
Description              Flags
xend.log                 none
configure file of pv     none
xend.log                 none

Description YangGuang 2010-07-27 09:57:13 UTC
Created attachment 434642
xend.log

Description of problem:
On the RHEL 5.5 i386 platform, we cannot use a VNC tool to view the RHEL6 beta2 snapshot8 i386 PV guest after a migration fails.

Version-Release number of selected component (if applicable):
kernel-xen-devel-2.6.18-208.el5
kernel-xen-2.6.18-208.el5
xen-3.0.3-114.el5
xen-devel-3.0.3-114.el5
xen-libs-3.0.3-114.el5

How reproducible:
always

Steps to Reproduce:
1. Change the xend configuration to enable migration, and set up NFS storage for migration.
2. Copy the PV domain images to the shared NFS storage server.
3. Mount the NFS image directory on host-A.
4. Create the VM guest on the source host-A:
  [host]# xm create $vm.cfg
5. On host-A, confirm that you can connect to the guest via vncviewer:
  [host]# vncviewer 127.0.0.1:$port_number
6. Do not mount the NFS image directory on host-B.
7. Migrate the guest from host-A to host-B:
  [host]# xm migrate $domid $ip_host-B
8. On host-A, connect to the guest via vncviewer again.
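
The steps above can be sketched as a small shell helper. This is only an illustration, not the reporter's actual test script: the function names, the DRY_RUN switch, and all argument values are hypothetical. Steps 1 to 3 and 6 (xend configuration and NFS setup) are not automated here.

```shell
#!/bin/sh
# Sketch of reproduction steps 4-5 and 7-8. All names below (run,
# start_guest, migrate_and_check, DRY_RUN) are hypothetical helpers.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 4-5: create the guest on host-A and confirm that VNC works.
# $1: guest config file, $2: VNC port number (as used in step 5)
start_guest() {
    run xm create "$1"
    run vncviewer "127.0.0.1:$2"
}

# Steps 7-8: attempt the migration to host-B, which is expected to fail
# because the NFS image directory is not mounted there, then list the
# domains and retry VNC on host-A.
# $1: domain ID, $2: IP address of host-B, $3: VNC port number
migrate_and_check() {
    run xm migrate "$1" "$2" || echo "migration failed (expected)"
    run xm list
    run vncviewer "127.0.0.1:$3"
}
```

Setting DRY_RUN=1 before calling `migrate_and_check` prints the commands instead of running them, which is a convenient way to review the sequence on a machine without Xen installed.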
  
Actual results:
1. After step 8 (that is, after the migration has failed), the guest does not keep its previous state and we cannot connect to it via VNC.

Expected results:
1. After step 8, the guest should keep its previous state despite the failed migration, and we should still be able to connect to it via VNC.


Additional info:
Please see attachment: 
xend.log
configure file of pv

Comment 1 YangGuang 2010-07-27 09:58:03 UTC
Created attachment 434643
configure file of pv

Comment 2 YangGuang 2010-07-28 02:43:45 UTC
In addition, we tried to reproduce this bug with a RHEL-5.4 PV guest, and the bug did not occur. So far, it applies only to the RHEL6-beta2-snapshot8 PV guest.

Comment 3 YangGuang 2010-07-28 02:53:03 UTC
What's more, we removed "vfb" from the configuration file of the RHEL6-beta2-snapshot8 PV guest (# vfb = ['type=vnc,vncunused=1,keymap=en-us,vnclisten=0.0.0.0' ]) and reproduced the bug again. In this situation, the PV guest ends up in the s (shutdown) state.
With "vfb" in the configuration file, the PV guest hangs in the r (running) state.

Comment 6 RHEL Program Management 2010-07-28 08:17:40 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 9 YangGuang 2010-09-26 04:47:03 UTC
On an x86-64 host, after the migration of an x86-64 guest fails, a zombie guest appears and the guest restarts.

[host]# xm li
 Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     6854     4 r-----    587.6
Zombie-xen-pv-guest                        5     1024     1 --p-cd     24.2
xen-pv-guest                               6     1024     2 -b----     32.8
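
A listing like the one above can be checked for leftover domains with a short filter. This is a minimal sketch: it assumes the column layout shown (name in column 1, state flags in column 5) and the "Zombie-" name prefix seen in this listing. The function names are hypothetical.

```shell
# Read `xm list` output on stdin and print the names of zombie domains.
# Assumes leftover domains carry the "Zombie-" prefix, as in the listing.
list_zombies() {
    awk 'NR > 1 && $1 ~ /^Zombie-/ { print $1 }'
}

# Read `xm list` output on stdin and print the state flags (column 5)
# of the domain whose name is given as $1.
domain_state() {
    awk -v name="$1" 'NR > 1 && $1 == name { print $5 }'
}
```

For example, `xm list | list_zombies` prints one zombie domain name per line, and `xm list | domain_state xen-pv-guest` prints that guest's state flags.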

Comment 10 YangGuang 2010-09-26 04:57:18 UTC
Created attachment 449684
xend.log

Comment 11 YangGuang 2010-09-26 05:12:25 UTC
Version-Release number of selected component:
kernel-xen-devel-2.6.18-223.el5
kernel-xen-2.6.18-223.el5
xen-3.0.3-116.el5
xen-devel-3.0.3-116.el5
xen-libs-3.0.3-116.el5

Comment 12 RHEL Program Management 2011-01-07 04:03:02 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 13 Suzanne Logcher 2011-01-07 16:09:09 UTC
This request was erroneously denied for the current release of Red Hat
Enterprise Linux.  The error has been fixed and this request has been
re-proposed for the current release.

Comment 14 RHEL Program Management 2011-02-01 05:37:40 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 15 RHEL Program Management 2011-02-01 18:30:36 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 16 Paolo Bonzini 2011-04-20 12:36:19 UTC
*** Bug 615187 has been marked as a duplicate of this bug. ***

Comment 17 Paolo Bonzini 2011-04-20 13:19:20 UTC
Even with suspend cancellation, the following patches may be needed:

c7853ae (xen: xenbus PM events support)
b3e96c0 (xen: use freeze/restore/thaw PM events for suspend/resume/chkpt)

Comment 22 Andrew Jones 2011-12-08 16:28:46 UTC
I'll go ahead and dev-ack this, but I think it'll come down to just doing some tests and seeing if there's any surprises, i.e. if something breaks that wouldn't have been solved with suspend-cancellation. Assuming we don't find anything, then this bug will get closed as CANTFIX.

Comment 23 RHEL Program Management 2011-12-13 04:40:37 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 24 Andrew Jones 2012-02-28 16:07:01 UTC
Since this is TestOnly I'm switching it to MODIFIED in order to get it onto QA.

Comment 26 Andrew Jones 2012-04-18 07:55:40 UTC
In the end, I think we'll just WONTFIX this bug. We don't have suspend-cancellation anyway, and disk space is now checked in the tools, so I think we're pretty safe to just ignore any other possible cases at this stage.