Bug 512300

Summary: xen pv domain suspended when doing local migration
Product: Red Hat Enterprise Linux 5 Reporter: Edward Wang <edwang>
Component: xen Assignee: Michal Novotny <minovotn>
Status: CLOSED WORKSFORME QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.4 CC: areis, clalance, dmair, llim, minovotn, xen-maint, yshao
Target Milestone: rc   
Target Release: 5.6   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-06-09 09:26:26 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 514500    
Attachments:
Description Flags
localmig-xm.sh (shell script to migrate pv domain in stress) none
rhel5u4pv (config file for domain creation) none
xend.log (xend log file of physical host) none

Description Edward Wang 2009-07-17 07:26:13 UTC
Description of problem:
During verification of bug 428691, I created a para-virtualized domain and stress-tested local migration with a shell script that repeatedly issues the command "xm migrate <domU> localhost". The domain goes into suspended status during the first migration. During the next migration (the second one), the migration process hangs without any response, and virt-manager shows the domain's status switching between "running" and "paused" again and again.

Version-Release number of selected component (if applicable):
kernel: 2.6.18-157.el5xen
xen: 3.0.3-90-el5
libvirt: 0.6.3-15.el5

How reproducible:
every time

Steps to Reproduce:
1. Create one para-virtualized domain with "xm create rhel5u4pv maxmem=1024
memory=1024" (rhel5u4pv is attached).
2. After domain rhel5u4pv has booted, start the migration process by issuing the
command "sh localmig-xm rhel5u4pv 3", where "localmig-xm" is a shell script
that stress-tests local migration of the specified domain; you can change
"3" to any number you prefer. (localmig-xm is attached)
  
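The attached localmig-xm script is not reproduced in this report. As a rough sketch only, a loop of the kind described in step 2 might look like the following. The function name `local_migrate_loop` and the `XM` override variable are assumptions made for illustration and are not taken from the attachment; only `xm migrate <domU> localhost` comes from the report itself.

```shell
#!/bin/sh
# Hypothetical stand-in for the attached localmig-xm script (not the original).
# XM names the toolstack command; it is overridable so the loop can be dry-run.
XM="${XM:-xm}"

# local_migrate_loop <domain> <count>
# Migrates <domain> to localhost <count> times, stopping at the first failure.
local_migrate_loop() {
    domain="$1"
    count="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "migration $i of $count: $domain -> localhost"
        if ! "$XM" migrate "$domain" localhost; then
            echo "migration $i failed" >&2
            return 1
        fi
        # Show the domain's state after each pass (State column of xm list).
        "$XM" list "$domain"
        i=$((i + 1))
    done
}
```

With this sketch, `local_migrate_loop rhel5u4pv 3` would correspond to the "sh localmig-xm rhel5u4pv 3" invocation above. It does not add a timeout, so a hung migration would still block the loop.
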
Actual results:
1, the pv domain turns into "Paused" status after the 1st local migration
2, during the 2nd migration, since the domain is in paused status, the migration hangs without any response and the domain's state switches between "running" and "paused" again and again (this info comes from virt-manager)

Expected results:
1, pv domain migration results in domain turns into paused status
2, when a domain is in paused status, domain migration should not be allowed.
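
Expected result 2 could be approximated on the caller's side, without any change to xend, by checking the domain state before migrating. This is only a sketch under assumptions: it assumes the State column is the fifth field of the `xm list <domain>` output and that a paused domain shows a `p` flag there; the function names and the `XM` override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical client-side guard; not a fix inside xend.
XM="${XM:-xm}"

# Print the State flags of a domain (assumed: 5th column, 2nd line of xm list).
domain_state() {
    "$XM" list "$1" | awk 'NR == 2 { print $5 }'
}

# migrate_unless_paused <domain> <target-host>
# Returns 1 without migrating if the domain is paused, 2 if its state is unknown.
migrate_unless_paused() {
    domain="$1"
    target="$2"
    state=$(domain_state "$domain")
    case "$state" in
        "")
            echo "cannot read state of $domain" >&2
            return 2
            ;;
        *p*)
            echo "refusing to migrate $domain: state is $state (paused)" >&2
            return 1
            ;;
    esac
    "$XM" migrate "$domain" "$target"
}
```

The check and the migration are not atomic, so the domain could still become paused between the two calls; a proper fix would have to live in the toolstack itself.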

Additional info:
1, I KNOW, local migration may seem to make no sense at all, but it really exists in our product. Since it exists, we should make it a workable feature.
2, If the local migration issue is not going to be fixed, at least WE SHOULD HANDLE THE MIGRATION OPERATION PROPERLY WHEN THE DOMAIN IS IN PAUSED STATUS. :)
3, xend.log is also attached (xend.log)

Thanks
Edward

Comment 1 Edward Wang 2009-07-17 07:27:49 UTC
Created attachment 354106
localmig-xm.sh (shell script to migrate pv domain in stress)

Comment 2 Edward Wang 2009-07-17 07:29:25 UTC
Created attachment 354108
rhel5u4pv (config file for domain creation)

Comment 3 Edward Wang 2009-07-17 07:30:09 UTC
Created attachment 354109
xend.log (xend log file of physical host)

Comment 9 Michal Novotny 2010-05-07 09:37:37 UTC
(In reply to comment #3)
> Created an attachment (id=354109)
> xend.log (xend log file of physical host)    

I've been looking at the xend.log file and I have to ask: is it the full log? The log shouldn't end with the "[xc_restore]: /usr/lib64/xen/bin/xc_restore 25 26 1 2 0 0 0" line, and I don't see any exception or anything so far. Was the log taken after the domain got suspended? Could you provide the `ps aux | grep qemu` output please? Also, could you try with the -107.el5 RPM please?

Thanks,
Michal

Comment 10 Michal Novotny 2010-06-04 10:29:35 UTC
(In reply to comment #9)
> (In reply to comment #3)
> > Created an attachment (id=354109)
> > xend.log (xend log file of physical host)    
> 
> I've been looking at the xend.log file and I have to ask: is it the full log?
> The log shouldn't end with the "[xc_restore]: /usr/lib64/xen/bin/xc_restore 25 26 1
> 2 0 0 0" line, and I don't see any exception or anything so far. Was the log
> taken after the domain got suspended? Could you provide the `ps aux | grep qemu`
> output please? Also, could you try with the -107.el5 RPM please?
> 
> Thanks,
> Michal    

I tried it again, now using the xen-3.0.3-110.el5virttest28 version and also the RHEL 5.5 version of the xen package, i.e. the -105.el5_5.2 packages. Migration worked fine and I was unable to reproduce the problem. Could you please try to reproduce it using the latest RHEL 5.5 version of the xen package? If it's still an issue in your environment, could you attach a log with information from this incident?

Thanks a lot,
Michal

Comment 11 Michal Novotny 2010-06-09 09:26:26 UTC
Since I was unable to reproduce it and nobody replied, even when I wrote an e-mail directly, I'm closing this as WORKSFORME. If you run into this issue, feel free to reopen. Also, please put the exact steps to reproduce it in the comment.

Michal