User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6

Attempting to save a Xen Fedora 12 guest (kernel versions 2.6.31.12-174.2.19.fc12.i686.PAE and 2.6.31.12-174.2.22.fc12.i686.PAE) ends up with the save process stalling and the guest seemingly hanging - no response to pings, ssh sessions time out, and attaching to the console (xm console <domain>, or xm create <domain> -c and following the entire boot process up to the login prompt) and typing does not result in anything appearing - i.e. it's frozen.

xm list shows that the guest name has "migrating-" in front of it. (If the guest name is fedora, then xm list will show migrating-fedora.)

However, other (CentOS 5.4) PV guests can be saved/resumed as per normal, and shutting the Fedora guest down normally (with shutdown -h now or init 0) works fine.

Reproducible: Always

Steps to Reproduce:
1. Create a Fedora 12 PV guest. (My config is 20GB disk - LV, 512MB RAM, bridged networking)
2. Run xm save <domain> <savefilepath>
3. Watch it freeze up.

Actual Results:
Output in xend.log:

[2010-02-20 21:12:39 xend 2826] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib/xen/bin/xc_save 22 7 0 0 0
[2010-02-20 21:12:39 xend 2826] DEBUG (XendCheckpoint:324) suspend
[2010-02-20 21:12:39 xend 2826] DEBUG (XendCheckpoint:92) In saveInputHandler suspend
[2010-02-20 21:12:39 xend 2826] DEBUG (XendCheckpoint:94) Suspending 7 ...
[2010-02-20 21:12:39 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:1249) XendDomainInfo.handleShutdownWatch
[2010-02-20 21:12:39 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:1249) XendDomainInfo.handleShutdownWatch

Expected Results:
Output of a working CentOS5.4 guest:

[2010-02-20 21:27:30 xend 2826] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib/xen/bin/xc_save 26 2 0 0 0
[2010-02-20 21:27:30 xend 2826] DEBUG (XendCheckpoint:324) suspend
[2010-02-20 21:27:30 xend 2826] DEBUG (XendCheckpoint:92) In saveInputHandler suspend
[2010-02-20 21:27:30 xend 2826] DEBUG (XendCheckpoint:94) Suspending 2 ...
[2010-02-20 21:27:30 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:1249) XendDomainInfo.handleShutdownWatch
[2010-02-20 21:27:30 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:1249) XendDomainInfo.handleShutdownWatch
[2010-02-20 21:27:30 xend.XendDomainInfo 2826] INFO (XendDomainInfo:1206) Domain has shutdown: name=migrating-centos5 id=2 reason=suspend.
[2010-02-20 21:27:30 xend 2826] INFO (XendCheckpoint:99) Domain 2 suspended.
[2010-02-20 21:27:30 xend 2826] DEBUG (XendCheckpoint:108) Written done
[2010-02-20 21:27:30 xend 2826] INFO (XendCheckpoint:353) Had 0 unexplained entries in p2m table 1: sent 131021, skipped 0, delta 9553ms, dom0 57%, target 0%, sent 449Mb/s, dirtied 0Mb/s 0 pages
[2010-02-20 21:27:40 xend 2826] INFO (XendCheckpoint:353) Total pages sent= 131021 (0.98x)
[2010-02-20 21:27:40 xend 2826] INFO (XendCheckpoint:353) (of which 0 were fixups)
[2010-02-20 21:27:40 xend 2826] INFO (XendCheckpoint:353) All memory is saved
[2010-02-20 21:27:41 xend 2826] INFO (XendCheckpoint:353) Save exit rc=0
[2010-02-20 21:27:41 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:2130) XendDomainInfo.destroy: domid=2
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] INFO (XendDomainInfo:2291) Dev 51712 still active, looping...
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:2055) UUID Created: True
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:2056) Devices to release: [], domid = 2
[2010-02-20 21:27:42 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:2068) Releasing PVFB backend devices ...

yum

[root@dom0 ~]# xm info
host                   : dom0
release                : 2.6.18-164.11.1.el5xen
version                : #1 SMP Wed Jan 20 08:53:10 EST 2010
machine                : i686
nr_cpus                : 4
nr_nodes               : 1
sockets_per_node       : 2
cores_per_socket       : 1
threads_per_core       : 2
cpu_mhz                : 2392
hw_caps                : bfebfbff:00000000:00000000:00000080:00004400
total_memory           : 3071
free_memory            : 990
node_to_cpu            : node0:0-3
xen_major              : 3
xen_minor              : 1
xen_extra              : .2-164.11.1.el5
xen_caps               : xen-3.0-x86_32p
xen_pagesize           : 4096
platform_params        : virt_start=0xf5800000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
cc_compile_by          : mockbuild
cc_compile_domain      : centos.org
cc_compile_date        : Wed Jan 20 07:31:16 EST 2010
xend_config_format     : 2

Destroying the guest (xm destroy <domain>) makes this appear in xend.log:

[2010-02-20 21:44:20 xend 2826] INFO (XendCheckpoint:99) Domain 7 suspended.
[2010-02-20 21:44:20 xend 2826] DEBUG (XendCheckpoint:108) Written done
[2010-02-20 21:44:20 xend 2826] INFO (XendCheckpoint:353) ERROR Internal error: domain is dying
[2010-02-20 21:44:20 xend 2826] INFO (XendCheckpoint:353) ERROR Internal error: Domain appears not to have suspended
[2010-02-20 21:44:20 xend 2826] INFO (XendCheckpoint:353) Save exit rc=1
[2010-02-20 21:44:20 xend 2826] ERROR (XendCheckpoint:133) Save failed on domain fedora (7).
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 110, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 341, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 22 7 0 0 0 failed
[2010-02-20 21:44:20 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:2165) XendDomainInfo.resumeDomain(7)
[2010-02-20 21:44:20 xend.XendDomainInfo 2826] DEBUG (XendDomainInfo:2178) XendDomainInfo.resumeDomain: devices released
[2010-02-20 21:44:20 xend.XendDomainInfo 2826] ERROR (XendDomainInfo:2220) Exception in evtcnh_reset(7)
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2218, in _resetChannels
    return xc.evtchn_reset(dom = self.domid)
Error: (1, 'Internal error', 'do_evtchn_op: HYPERVISOR_event_channel_op failed: -1')
[2010-02-20 21:44:20 xend.XendDomainInfo 2826] ERROR (XendDomainInfo:2195) XendDomainInfo.resume: xc.domain_resume failed on domain 7.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2180, in resumeDomain
    self._resetChannels()
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2218, in _resetChannels
    return xc.evtchn_reset(dom = self.domid)
Error: (1, 'Internal error', 'do_evtchn_op: HYPERVISOR_event_channel_op failed: -1')
[2010-02-20 21:44:20 xend 2826] DEBUG (XendCheckpoint:136) XendCheckpoint.save: resumeDomain

Note: This bug (https://bugzilla.redhat.com/show_bug.cgi?id=523971) seems to be very similar, except that it's against a later kernel and rawhide.
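For anyone who wants to reproduce the stall without leaving a terminal blocked, here is a minimal sketch that runs a command with a time limit and reports whether it finished. The helper name run_with_timeout and the 120-second limit are my own choices, not part of Xen, and it assumes a Python 3 interpreter is available (the dom0 above ships Python 2.4, so this would have to run from a newer install).

```python
import subprocess

def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns "ok" on exit status 0, "failed" on any other exit status,
    and "stalled" if the command is still running after timeout_s seconds
    (the child process is killed in that case).
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "stalled"
    return "ok" if result.returncode == 0 else "failed"

# Hypothetical usage for this bug (domain name and path are placeholders):
#   run_with_timeout(["xm", "save", "fedora", "/var/tmp/fedora.save"], 120)
```

Note that killing the xm client on timeout only unblocks the caller. It is not expected to un-freeze the guest, which in the report above stayed in the migrating- state until it was destroyed.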
I started to add a note about yum, went to check that yum had in fact updated everything, and forgot to finish the thought. For the record: yum check-update shows no new updates after updating the kernel to the .22 release. Also, the output under "Actual Results" really does just end there. There is no "Domain has shutdown" message like there is for the CentOS 5.4 guest.
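The difference between the two logs can be checked mechanically. The following is a rough sketch, not an official xend tool: it scans xend.log text for "Suspending N ..." entries and reports the domain IDs that never get a matching "Domain has shutdown: ... reason=suspend" entry. The function name stalled_suspends is made up for illustration, and the two message patterns are taken only from the log excerpts in this report, so other xend versions may word them differently.

```python
import re

# Matches either the start of a suspend or the guest's suspend acknowledgement,
# using the message wording seen in the xend.log excerpts in this report.
EVENT = re.compile(
    r"Suspending (?P<start>\d+) \.\.\."
    r"|Domain has shutdown: name=\S+ id=(?P<done>\d+) reason=suspend"
)

def stalled_suspends(log_text):
    """Return domain IDs (as strings) whose suspend was requested but never acknowledged."""
    pending = []
    for match in EVENT.finditer(log_text):
        started = match.group("start")
        if started is not None:
            if started not in pending:
                pending.append(started)
        elif match.group("done") in pending:
            pending.remove(match.group("done"))
    return pending
```

Fed the contents of /var/log/xen/xend.log (the usual location on this kind of dom0, though that path is an assumption here), it would list domain 7 for the Fedora guest and nothing for the CentOS guest.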
Later kernels (at least starting with 2.6.32.7, and maybe earlier) can save and restore properly. There was some interest in upstream xen development to backport whatever the necessary patch set is to the 2.6.31 stable tree, but I'm not sure anybody is currently working on it.
(In reply to comment #2)
> Later kernels (at least starting with 2.6.32.7, and maybe earlier) can save and
> restore properly. There was some interest in upstream xen development to
> backport whatever the necessary patch set is to the 2.6.31 stable tree, but I'm
> not sure anybody is currently working on it.

I presume the later kernels are available in rawhide? Now's as good a time as any to try it, I guess.

*is off to set up a PV rawhide guest*
(In reply to comment #2)
> Later kernels (at least starting with 2.6.32.7, and maybe earlier) can save and
> restore properly. There was some interest in upstream xen development to
> backport whatever the necessary patch set is to the 2.6.31 stable tree, but I'm
> not sure anybody is currently working on it.

Uh... strange. A guest running kernel version 2.6.33-0.48.rc8.git1.fc14.i686.PAE can't be saved either.

The next thing I'm trying is an install of F13 rather than rawhide. If that doesn't work, I'll try F11 to see whether it's a regression or not.
I'm duping this bz to another bz opened for the same issue. They were both opened about the same time, but the other one has some more testing details. *** This bug has been marked as a duplicate of bug 566930 ***
(In reply to comment #5)
> I'm duping this bz to another bz opened for the same issue. They were both
> opened about the same time, but the other one has some more testing details.
>
> *** This bug has been marked as a duplicate of bug 566930 ***

Yeah, there is more testing info there, so that makes sense. I'll keep an eye on it, because rawhide seems to have the same problem. That doesn't make much sense to me: since rawhide is on 2.6.33, I would have thought the bug would be fixed there too.