Description of problem:

xc_save reports success although errors are printed to xend.log and the resulting image cannot be used for resuming. The save/restore feature is known to be broken, which is tracked by https://bugzilla.redhat.com/show_bug.cgi?id=437348, and xc_save shouldn't report success in that case. However, this won't be fixed in 5.4, and Chris Lalancette suggested we should make xc_save fail for IA64...

This is shown in xend.log when saving an IA64 guest:

[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 28 gpfn 28: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 29 gpfn 29: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2a gpfn 2a: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2b gpfn 2b: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2c gpfn 2c: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2d gpfn 2d: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2e gpfn 2e: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2f gpfn 2f: Invalid argument
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) All memory is saved
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) ip=a00000020088daa0, b0=a000000200898150
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) Save exit rc=0

Version-Release number of selected component (if applicable):

xen-3.0.3-86.el5

How reproducible:

always

Steps to Reproduce:
1. create IA64 HVM guest
2. xm save guest

Actual results:

reports success

Expected results:

failure, since the save does not work

Additional info:

First discussed in https://bugzilla.redhat.com/show_bug.cgi?id=451675 comments #31 to #35.
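The mismatch described above (mapping errors followed by "Save exit rc=0") can be spotted mechanically in xend.log. The following is only a sketch in modern Python, not part of xend: the function name find_suspicious_saves and both regular expressions are assumptions based solely on the log lines quoted in this report.

```python
import re

# Patterns derived from the xend.log excerpt in this report (assumed stable).
MAP_ERROR = re.compile(r"cannot map mfn page [0-9a-f]+ gpfn [0-9a-f]+")
SAVE_EXIT = re.compile(r"Save exit rc=(\d+)")


def find_suspicious_saves(lines):
    """Return the map-error count of every save that still exited with rc=0."""
    suspicious = []
    map_errors = 0
    for line in lines:
        if MAP_ERROR.search(line):
            map_errors += 1
            continue
        match = SAVE_EXIT.search(line)
        if match:
            if match.group(1) == "0" and map_errors > 0:
                suspicious.append(map_errors)
            map_errors = 0  # start counting afresh for the next save
    return suspicious


sample = [
    "[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 28 gpfn 28: Invalid argument",
    "[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 29 gpfn 29: Invalid argument",
    "[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) All memory is saved",
    "[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) Save exit rc=0",
]
print(find_suspicious_saves(sample))
```

A non-empty result means at least one save claimed success even though pages could not be mapped, which is the situation reported here.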
We dropped the ball on this one. Still something I think we should do, but we'll defer it to 5.5. Chris Lalancette
(In reply to comment #0)
> xc_save reports success although there are errors printed to xend.log and
> resulting image cannot be used for resuming.

What does it do when you try to restore the created image?

Michal
(In reply to comment #3)
> What does it do when you try to restore the image created ?

Oh, sorry, I was reading the discussion on bug 451675, and we should definitely fix this since the restore is failing. I'd recommend a simple fix: jump to the out label instead of continuing, i.e.:

diff --git a/tools/libxc/ia64/xc_ia64_linux_save.c b/tools/libxc/ia64/xc_ia64_linux_save.c
index 00e71fe..5204657 100644
--- a/tools/libxc/ia64/xc_ia64_linux_save.c
+++ b/tools/libxc/ia64/xc_ia64_linux_save.c
@@ -356,7 +356,7 @@ xc_domain_save(int xc_handle, int io_fd, uint32_t dom, uint32_t max_iters,
                        FIXME: to be tracked.  */
                     fprintf(stderr, "cannot map mfn page %lx gpfn %lx: %s\n",
                             page_array[N], N, safe_strerror(errno));
-                    continue;
+                    goto out;
                 }
 
                 if (!write_exact(io_fd, &N, sizeof(N))) {

It has not been tested yet, but I'm reserving a machine to test it on. Any objections to the patch?

Michal
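The effect of swapping continue for goto out can be modelled in a few lines of Python. This is a toy model and not the libxc code: save_pages, map_page, fake_map_page and abort_on_error are hypothetical names, and returning 1 on abort is an assumption chosen to match the "Save exit rc=1" line seen in the later xend.log in this report.

```python
def save_pages(gpfns, map_page, abort_on_error):
    """Toy model of the page loop; returns (rc, list of saved gpfns)."""
    saved = []
    for gpfn in gpfns:
        page = map_page(gpfn)
        if page is None:
            if abort_on_error:
                # Plays the role of 'goto out': stop and report failure.
                return 1, saved
            # Old behaviour: skip the page and carry on as if nothing happened.
            continue
        saved.append(gpfn)
    return 0, saved


def fake_map_page(gpfn):
    """Hypothetical mapper: pages 0x28..0x2f cannot be mapped, as in the log."""
    return None if 0x28 <= gpfn <= 0x2F else b"\0" * 16


gpfns = range(0x26, 0x32)
old_rc, old_saved = save_pages(gpfns, fake_map_page, abort_on_error=False)
new_rc, new_saved = save_pages(gpfns, fake_map_page, abort_on_error=True)
print(old_rc, len(old_saved))
print(new_rc, len(new_saved))
```

With skipping, the model returns 0 although eight pages are missing from the image; with the abort, the first unmappable page yields a non-zero result that the caller can act on.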
Created attachment 424864
Fix error reporting for ia64 save

Well, this is the patch that makes xc_save report failure when the save fails. It has been tested and works fine. I've built it into RPMs available at http://people.redhat.com/minovotn/xen/bz504278

Sakai, could you please test using those RPMs? The very same version of this patch has been sent upstream for review.

Thanks,
Michal
Masayoshi, I tried to add Sakai as the needinfo contact, but I noticed it was not added at all, so I'm changing the needinfo contact to you. Could you please provide the results of the testing or talk to Sakai about the issue?

Thanks,
Michal
Created attachment 442343
entire xend.log

Tested on IA64 machine (kernel-xen-2.6.18-211.el5, xen-3.0.3-115.el5) with IA64 hvm guest.

"xm save" failed, but the guest also hung at the same time.

Part of xend.log:

[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:144) [xc_save]: /usr/lib/xen/bin/xc_save 24 36 0 0 4
[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:403) suspend
[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:147) In saveInputHandler suspend
[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:149) Suspending 36 ...
[2010-09-01 02:38:12 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:1257) XendDomainInfo.handleShutdownWatch
[2010-09-01 02:38:12 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:1257) XendDomainInfo.handleShutdownWatch
[2010-09-01 02:38:12 xend.XendDomainInfo 8131] INFO (XendDomainInfo:1214) Domain has shutdown: name=migrating-rhel5u5-ia64-hvm id=36 reason=suspend.
[2010-09-01 02:38:12 xend 8131] INFO (XendCheckpoint:154) Domain 36 suspended.
[2010-09-01 02:38:12 xend 8131] INFO (XendCheckpoint:159) _releaseDevices for hvm domain
[2010-09-01 02:38:12 xend 8131] INFO (image:482) use sigusr1 to signal qemu 8167
[2010-09-01 02:38:14 xend 8131] DEBUG (XendCheckpoint:163) Written done
[2010-09-01 02:38:14 xend 8131] INFO (XendCheckpoint:452) cannot map mfn page 28 gpfn 28: Invalid argument
[2010-09-01 02:38:14 xend 8131] INFO (XendCheckpoint:452) Save exit rc=1
[2010-09-01 02:38:14 xend 8131] DEBUG (XendCheckpoint:428) Domain 36 renamed back to rhel5u5-ia64-hvm
[2010-09-01 02:38:14 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:1257) XendDomainInfo.handleShutdownWatch
[2010-09-01 02:38:14 xend 8131] INFO (image:434) spawning device models: /usr/lib/xen/bin/qemu-dm ['/usr/lib/xen/bin/qemu-dm', '-d', '36', '-m', '1024', '-boot', 'd', '-vcpus', '2', '-acpi', '-usbdevice', 'tablet', '-k', 'en-us', '-domain-name', 'rhel5u5-ia64-hvm', '-vnc', '0.0.0.0:36', '-vncunused', '-loadvm', '/var/lib/xen/qemu-save-36.img']
[2010-09-01 02:38:14 xend 8131] INFO (image:437) device model pid: 8314
[2010-09-01 02:38:14 xend 8131] ERROR (XendCheckpoint:190) Save failed on domain rhel5u5-ia64-hvm (36).
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 167, in save
    forkHelper(cmd, fd, saveInputHandler, False, dominfo)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 440, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 24 36 0 0 4 failed
[2010-09-01 02:38:14 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:2275) XendDomainInfo.resumeDomain(36)
[2010-09-01 02:38:14 xend.XendDomainInfo 8131] ERROR (XendDomainInfo:2303) XendDomainInfo.resume: xc.domain_resume failed on domain 36.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2300, in resumeDomain
    xc.domain_resume(self.domid, fast)
Error: (38, 'Function not implemented')
[2010-09-01 02:38:14 xend 8131] DEBUG (XendCheckpoint:194) XendCheckpoint.save: resumeDomain
(In reply to comment #17)
> Tested on IA64 machine (kernel-xen-2.6.18-211.el5, xen-3.0.3-115.el5) with IA64
> hvm guest.
>
> "xm save" failed, but the guest also hung at the same time.

Well, this is exactly what I saw there. Of course, the behaviour is not correct, since the guest shouldn't hang; nevertheless, it appears to be a kernel-xen bug in the PV domU, since the guest is a PV guest.

Michal
Well, I was talking about the original bug; however, the testing in comment #17 was done with an HVM guest, and this bug is about the save functionality. As far as I know, save is not implemented for HVM guests on the ia64 platform, so I guess we should create a patch to disable save for IA64 HVM guests. The guest gets stuck because xend tries to call the resume functionality after the failed save, and making resume work would require many backports from upstream, so I think disallowing the operation in the user-space stack is a good idea. A patch that disallows save for IA64 HVM guests before anything is done to the guest will be coming soon.

Michal
This is the test version of my patch to disable save/migrate of HVM guests on the ia64 platform:

diff --git a/tools/python/xen/xend/XendDomain.py b/tools/python/xen/xend/XendDomain.py
index 2dcfe3c..447b9cc 100644
--- a/tools/python/xen/xend/XendDomain.py
+++ b/tools/python/xen/xend/XendDomain.py
@@ -34,6 +34,7 @@ import XendDomainInfo
 
 from xen.xend import XendRoot
 from xen.xend import XendCheckpoint
+from xen.xend import arch
 from xen.xend.XendError import XendError, XendInvalidDomain
 from xen.xend.XendLogging import log
 from xen.xend.xenstore.xstransact import xstransact
@@ -489,6 +490,9 @@ class XendDomain:
             raise XendError("Can't migrate the domain since the domain is paused; "
                             "unpause it first if you want to migrate it")
 
+        if arch.type == "ia64" and dominfo.is_hvm():
+            raise XendError("Migrating HVM guest on ia64 platform is not implemented")
+
         """ The following call may raise a XendError exception """
         dominfo.testMigrateDevices(True, dst)
 
@@ -536,6 +540,9 @@ class XendDomain:
             raise XendError("Can't save the domain since the domain is paused; "
                             "unpause it first if you want to save it")
 
+        if arch.type == "ia64" and dominfo.is_hvm():
+            raise XendError("Saving HVM guest on ia64 platform is not implemented")
+
         fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
         oshelp.close_on_exec(fd)
         try:

Michal
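The idea behind the patch, refusing the operation before the guest is touched, can be sketched in isolation. The code below is a minimal stand-alone model in modern Python, not xend itself: XendError here is a local stand-in for xen.xend.XendError.XendError, and FakeDomain, domain_save, the suspended flag and the arch_type parameter are hypothetical names used only for illustration.

```python
class XendError(Exception):
    """Local stand-in for xen.xend.XendError.XendError."""


class FakeDomain:
    """Hypothetical minimal domain object exposing only is_hvm()."""

    def __init__(self, name, hvm):
        self.name = name
        self.hvm = hvm
        self.suspended = False

    def is_hvm(self):
        return self.hvm


def domain_save(dominfo, arch_type):
    """Reject unsupported guests before any side effect happens."""
    if arch_type == "ia64" and dominfo.is_hvm():
        raise XendError("Saving HVM guest on ia64 platform is not implemented")
    # Stands in for the real suspend-and-write work done by xend.
    dominfo.suspended = True
    return "saved %s" % dominfo.name


guest = FakeDomain("rhel5u5-ia64-hvm", hvm=True)
try:
    domain_save(guest, "ia64")
except XendError as err:
    print("Error:", err)
print("guest suspended:", guest.suspended)
```

Because the check runs first, the guest is never suspended on the rejected path, so nothing has to be resumed afterwards.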
See bug 622413 comment #9 to #15.
Created attachment 447697
Patch to disable ia64 HVM save/migrate call since it's not implemented

Hi,
this is the patch that disables calling save/migrate on ia64 HVM guests. These operations are not implemented in the RHEL-5 version of Xen, so the patch makes xend bail out with an error before anything is done to the guest.

Testing:
The patch has been tested on an x86 host with an x86 HVM guest, with the arch.type comparison changed to 'x86' instead of 'ia64', since I was unable to get access to an ia64 machine. Without the patch, xend tried to save or migrate the guest. With the patch applied, it bails out with an error saying that migrating or saving HVM guests on ia64 is not implemented, and the guest keeps running (i.e. it does not hang).

Michal
Fix built into xen-3.0.3-117.el5
Verified on RHEL5.5-Server-20100322.0-ia64 host, RHEL-Server-5.5-hvm-ia64 guest.

"xm save" returns the expected error message, and the guest still works well.

-----------------------
# xm li
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3996     2 r-----  16414.4
vm3                                       27     1047     1 -b----   1943.7
# xm save vm3 vm3.save
Error: Saving HVM guest on ia64 is not implemented
Usage: xm save <Domain> <CheckpointFile>

Save a domain state to restore later.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0031.html