Bug 504278 - xc_save reports success on IA64 although errors are printed to xend.log
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.4
Hardware: ia64 Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: 5.6
Assigned To: Michal Novotny
QA Contact: Virtualization Bugs
Blocks: 514500
Reported: 2009-06-05 08:15 EDT by Jiri Denemark
Modified: 2014-02-02 17:37 EST (History)
Fixed In Version: xen-3.0.3-117.el5
Doc Type: Bug Fix
Last Closed: 2011-01-13 17:17:24 EST

Attachments
Fix error reporting for ia64 save (1.27 KB, patch)
2010-06-17 11:53 EDT, Michal Novotny
Patch to disable ia64 HVM save/migrate call since it's not implemented (2.49 KB, patch)
2010-09-16 05:38 EDT, Michal Novotny

Description Jiri Denemark 2009-06-05 08:15:49 EDT
Description of problem:

xc_save reports success although errors are printed to xend.log and the resulting image cannot be used for resuming. The save/restore feature is known to be broken (tracked by https://bugzilla.redhat.com/show_bug.cgi?id=437348), and xc_save shouldn't report success in that case. However, this won't be fixed in 5.4, and Chris Lalancette suggested we should make xc_save fail for IA64...

This is shown in xend.log when saving an IA64 guest:

[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 28 gpfn 28: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 29 gpfn 29: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2a gpfn 2a: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2b gpfn 2b: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2c gpfn 2c: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2d gpfn 2d: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2e gpfn 2e: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 2f gpfn 2f: Invalid argument
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) All memory is saved
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) ip=a00000020088daa0, b0=a000000200898150
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) Save exit rc=0
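
The mismatch in this log (mapping errors followed by "Save exit rc=0") can be detected mechanically. The following is only a minimal sketch for triaging a xend.log, not part of xend: the function name and return shape are hypothetical, and the two patterns are simply the message texts quoted above.

```python
import re

# Message texts copied from the xend.log excerpt; everything else here
# (function name, return shape) is a hypothetical illustration.
MAP_ERROR = re.compile(r"cannot map mfn page")
SAVE_EXIT = re.compile(r"Save exit rc=(\d+)")


def find_silent_save_failures(log_text):
    """Return, per save that exited with rc=0, the number of map errors before it.

    Saves that logged no map errors, or that exited non-zero, are not listed.
    """
    silent = []
    errors = 0
    for line in log_text.splitlines():
        if MAP_ERROR.search(line):
            errors += 1
            continue
        match = SAVE_EXIT.search(line)
        if match:
            if int(match.group(1)) == 0 and errors:
                silent.append(errors)
            errors = 0  # start counting afresh for the next save
    return silent


sample = """\
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 28 gpfn 28: Invalid argument
[2009-06-04 18:59:25 xend 6240] INFO (XendCheckpoint:353) cannot map mfn page 29 gpfn 29: Invalid argument
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) All memory is saved
[2009-06-04 18:59:38 xend 6240] INFO (XendCheckpoint:353) Save exit rc=0
"""
print(find_silent_save_failures(sample))  # prints [2]
```

A non-empty result means at least one save claimed success even though pages could not be mapped, which is the situation this bug describes.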


Version-Release number of selected component (if applicable):

xen-3.0.3-86.el5

How reproducible:

always

Steps to Reproduce:
1. create IA64 HVM guest
2. xm save guest

  
Actual results:

reports success


Expected results:

failure as it doesn't work

Additional info:

First discussed in https://bugzilla.redhat.com/show_bug.cgi?id=451675 comments #31 to #35.
Comment 1 Chris Lalancette 2009-07-09 08:15:00 EDT
We dropped the ball on this one.  Still something I think we should do, but we'll defer it to 5.5.

Chris Lalancette
Comment 3 Michal Novotny 2010-06-04 06:42:14 EDT
(In reply to comment #0)

What happens when you try to restore the created image?

Michal
Comment 4 Michal Novotny 2010-06-04 06:48:54 EDT
(In reply to comment #3)
> What happens when you try to restore the created image?
> 
> Michal    

Oh, sorry, I was reading the discussion on bug 451675, and we should definitely fix it since the restore is failing. I'd recommend a simple fix: jump to the error exit ("goto out") instead of using "continue", i.e.

diff --git a/tools/libxc/ia64/xc_ia64_linux_save.c b/tools/libxc/ia64/xc_ia64_linux_save.c
index 00e71fe..5204657 100644
--- a/tools/libxc/ia64/xc_ia64_linux_save.c
+++ b/tools/libxc/ia64/xc_ia64_linux_save.c
@@ -356,7 +356,7 @@ xc_domain_save(int xc_handle, int io_fd, uint32_t dom, uint32_t max_iters,
                    FIXME: to be tracked.  */
                 fprintf(stderr, "cannot map mfn page %lx gpfn %lx: %s\n",
                         page_array[N], N, safe_strerror(errno));
-                continue;
+                goto out;
             }
 
             if (!write_exact(io_fd, &N, sizeof(N))) {

It has not been tested yet, but I'm reserving a machine to test it on. Any objections to the patch?

Michal
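
The effect of replacing "continue" with "goto out" can be illustrated outside libxc. The following is a Python analogy, not the C code from xc_ia64_linux_save.c: save_pages, fake_map_page and UNMAPPABLE are hypothetical names, and it assumes the exit code starts out as failure and is only set to 0 once the page loop completes.

```python
# gpfns 28..2f are the pages that failed to map in the log from the description.
UNMAPPABLE = set(range(0x28, 0x30))


def fake_map_page(gpfn):
    """Hypothetical mapper: returns page bytes, or None when mapping fails."""
    if gpfn in UNMAPPABLE:
        return None
    return b"\x00" * 4


def save_pages(pages, map_page, stop_on_error):
    """Return (rc, saved) where rc is 0 on success and 1 on failure."""
    rc = 1  # assume failure until the whole loop has completed
    saved = []
    for gpfn in pages:
        data = map_page(gpfn)
        if data is None:
            print("cannot map mfn page %x gpfn %x" % (gpfn, gpfn))
            if stop_on_error:
                return rc, saved  # analogue of "goto out": leave with rc=1
            continue  # old behaviour: skip the page and carry on
        saved.append((gpfn, data))
    rc = 0  # only reached when the loop was not aborted
    return rc, saved


old_rc, old_saved = save_pages(range(0x26, 0x32), fake_map_page, stop_on_error=False)
new_rc, new_saved = save_pages(range(0x26, 0x32), fake_map_page, stop_on_error=True)
print(old_rc, len(old_saved))  # prints 0 4
print(new_rc, len(new_saved))  # prints 1 2
```

With skipping, the loop finishes and reports success even though eight pages are missing from the image; with the early exit, the first unmappable page makes the whole save fail.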
Comment 6 Michal Novotny 2010-06-17 11:53:09 EDT
Created attachment 424864 [details]
Fix error reporting for ia64 save

Well, this is the patch to make it report failure if xc_save fails. It has been tested and works fine. I've built it into RPMs available at http://people.redhat.com/minovotn/xen/bz504278

Sakai, could you please test using those RPMs?

The very same version of this patch has been sent upstream for review.

Thanks,
Michal
Comment 7 Michal Novotny 2010-06-17 12:00:45 EDT
Masayoshi, I tried to add Sakai as the needinfo contact, but I noticed the contact was not added at all, so I'm changing the needinfo contact to you. Could you please provide the information from the testing or talk to Sakai about the issue?

Thanks,
Michal
Comment 17 Linqing Lu 2010-09-01 02:45:25 EDT
Created attachment 442343 [details]
entire xend.log

Tested on IA64 machine (kernel-xen-2.6.18-211.el5, xen-3.0.3-115.el5) with IA64 hvm guest.

"xm save" failed, but the guest also hung at the same time.

Part of xend.log:

[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:144) [xc_save]: /usr/lib/xen/bin/xc_save 24 36 0 0 4
[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:403) suspend
[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:147) In saveInputHandler suspend
[2010-09-01 02:38:12 xend 8131] DEBUG (XendCheckpoint:149) Suspending 36 ...
[2010-09-01 02:38:12 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:1257) XendDomainInfo.handleShutdownWatch
[2010-09-01 02:38:12 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:1257) XendDomainInfo.handleShutdownWatch
[2010-09-01 02:38:12 xend.XendDomainInfo 8131] INFO (XendDomainInfo:1214) Domain has shutdown: name=migrating-rhel5u5-ia64-hvm id=36 reason=suspend.
[2010-09-01 02:38:12 xend 8131] INFO (XendCheckpoint:154) Domain 36 suspended.
[2010-09-01 02:38:12 xend 8131] INFO (XendCheckpoint:159) _releaseDevices for hvm domain
[2010-09-01 02:38:12 xend 8131] INFO (image:482) use sigusr1 to signal qemu 8167
[2010-09-01 02:38:14 xend 8131] DEBUG (XendCheckpoint:163) Written done
[2010-09-01 02:38:14 xend 8131] INFO (XendCheckpoint:452) cannot map mfn page 28 gpfn 28: Invalid argument
[2010-09-01 02:38:14 xend 8131] INFO (XendCheckpoint:452) Save exit rc=1
[2010-09-01 02:38:14 xend 8131] DEBUG (XendCheckpoint:428) Domain 36 renamed back to rhel5u5-ia64-hvm
[2010-09-01 02:38:14 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:1257) XendDomainInfo.handleShutdownWatch
[2010-09-01 02:38:14 xend 8131] INFO (image:434) spawning device models: /usr/lib/xen/bin/qemu-dm ['/usr/lib/xen/bin/qemu-dm', '-d', '36', '-m', '1024', '-boot', 'd', '-vcpus', '2', '-acpi', '-usbdevice', 'tablet', '-k', 'en-us', '-domain-name', 'rhel5u5-ia64-hvm', '-vnc', '0.0.0.0:36', '-vncunused', '-loadvm', '/var/lib/xen/qemu-save-36.img']
[2010-09-01 02:38:14 xend 8131] INFO (image:437) device model pid: 8314
[2010-09-01 02:38:14 xend 8131] ERROR (XendCheckpoint:190) Save failed on domain rhel5u5-ia64-hvm (36).
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 167, in save
    forkHelper(cmd, fd, saveInputHandler, False, dominfo)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 440, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 24 36 0 0 4 failed
[2010-09-01 02:38:14 xend.XendDomainInfo 8131] DEBUG (XendDomainInfo:2275) XendDomainInfo.resumeDomain(36)
[2010-09-01 02:38:14 xend.XendDomainInfo 8131] ERROR (XendDomainInfo:2303) XendDomainInfo.resume: xc.domain_resume failed on domain 36.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2300, in resumeDomain
    xc.domain_resume(self.domid, fast)
Error: (38, 'Function not implemented')
[2010-09-01 02:38:14 xend 8131] DEBUG (XendCheckpoint:194) XendCheckpoint.save: resumeDomain
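
The traceback above shows that xend raises XendError when the xc_save helper fails, which is why the helper's exit status matters so much. The following is a minimal sketch of that pattern only, assuming the decision is based on the exit status; it is not the real forkHelper (which also handles the helper's input and output), and HelperError and run_helper are hypothetical names.

```python
import subprocess
import sys


class HelperError(Exception):
    """Hypothetical stand-in for xend's XendError."""


def run_helper(cmd):
    """Run a helper program and raise if it exits with a non-zero status."""
    rc = subprocess.call(cmd)
    if rc != 0:
        # Same message shape as in the traceback: "<command line> failed"
        raise HelperError("%s failed" % " ".join(cmd))
    return rc


# Stand-ins for a succeeding and a failing helper, using the current interpreter.
ok_cmd = [sys.executable, "-c", "import sys; sys.exit(0)"]
bad_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]

print(run_helper(ok_cmd))  # prints 0
try:
    run_helper(bad_cmd)
except HelperError as exc:
    print("save failed:", exc)
```

Under that assumption, a helper that prints errors but still exits with 0, as in the original report, is indistinguishable from a successful save.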
Comment 18 Michal Novotny 2010-09-07 07:09:43 EDT
(In reply to comment #17)
> Created attachment 442343 [details]
> entire xend.log
> 
> Tested on IA64 machine (kernel-xen-2.6.18-211.el5, xen-3.0.3-115.el5) with IA64
> hvm guest.
> 
> "xm save" failed, but the guest also hung at the same time.
> 

Well, this is exactly what I saw there. Of course, the behaviour is not correct, since the guest shouldn't hang; nevertheless, it appears to be a kernel-xen bug in the PV domU, since the guest is a PV guest.

Michal
Comment 19 Michal Novotny 2010-09-15 08:46:21 EDT
Well, I was talking about the original bug; however, the testing in comment #17 is about an HVM guest, and this bug is about save functionality. Save functionality is not implemented for HVM guests on the ia64 platform, AFAIK, so I guess we should create a patch to disable save for ia64 HVM guests. The guest gets stuck because xend tries to call the resume functionality, which would require many backports from upstream, so I guess disallowing the operation in the user-space stack is a good idea. A patch that disallows save on ia64 HVM guests before anything is done to the guest will be coming soon.

Michal
Comment 20 Michal Novotny 2010-09-15 09:00:26 EDT
This is the test version of my patch to disable save/migrate of HVM guest on ia64 platform:

diff --git a/tools/python/xen/xend/XendDomain.py b/tools/python/xen/xend/XendDomain.py
index 2dcfe3c..447b9cc 100644
--- a/tools/python/xen/xend/XendDomain.py
+++ b/tools/python/xen/xend/XendDomain.py
@@ -34,6 +34,7 @@ import XendDomainInfo
 
 from xen.xend import XendRoot
 from xen.xend import XendCheckpoint
+from xen.xend import arch
 from xen.xend.XendError import XendError, XendInvalidDomain
 from xen.xend.XendLogging import log
 from xen.xend.xenstore.xstransact import xstransact
@@ -489,6 +490,9 @@ class XendDomain:
             raise XendError("Can't migrate the domain since the domain is paused; "
                                 "unpause it first if you want to migrate it")
 
+        if arch.type == "ia64" and dominfo.is_hvm():
+            raise XendError("Migrating HVM guest on ia64 platform is not implemented")
+
         """ The following call may raise a XendError exception """
         dominfo.testMigrateDevices(True, dst)
 
@@ -536,6 +540,9 @@ class XendDomain:
                 raise XendError("Can't save the domain since the domain is paused; "
                                     "unpause it first if you want to save it")
 
+            if arch.type == "ia64" and dominfo.is_hvm():
+                raise XendError("Saving HVM guest on ia64 platform is not implemented")
+
             fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
             oshelp.close_on_exec(fd)
             try:

Michal
Comment 21 Paolo Bonzini 2010-09-15 11:31:12 EDT
See bug 622413 comment #9 to #15.
Comment 23 Michal Novotny 2010-09-16 05:38:25 EDT
Created attachment 447697 [details]
Patch to disable ia64 HVM save/migrate call since it's not implemented

Hi,
this is the patch to disable calling save/migrate on ia64 HVM guests, since it's not implemented in the RHEL-5 version of Xen. The patch makes the operation bail out with an error before anything is done to the guest.

Testing: The patch has been tested on an x86 host with an x86 HVM guest, with the arch.type comparison changed to 'x86' instead of 'ia64', since I was unable to get access to an ia64 machine. Before my patch was applied, xend tried to save or migrate the guest; with the patch applied, it bails out with an error saying that migrating or saving HVM guests on ia64 is not implemented, and the guest is still running (i.e. it does not hang).

Michal
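
The guard from the patch in comment 20 can be sketched in isolation to show why the guest is left untouched: the check runs before any side effect. This is a hypothetical illustration, not xend code. SaveError, FakeDomain and save_domain are made-up names, and passing the architecture in as a parameter is an assumption of this sketch (the actual patch reads arch.type); it merely allows the ia64 path to be exercised on any host without editing the comparison.

```python
class SaveError(Exception):
    """Hypothetical stand-in for xend's XendError."""


class FakeDomain(object):
    """Hypothetical stand-in for a domain object with an is_hvm() method."""

    def __init__(self, hvm):
        self._hvm = hvm
        self.suspended = False

    def is_hvm(self):
        return self._hvm

    def suspend(self):
        # Represents the first side effect a real save has on the guest.
        self.suspended = True


def save_domain(dominfo, arch_type):
    """Refuse ia64 HVM saves up front; otherwise suspend and 'save'."""
    if arch_type == "ia64" and dominfo.is_hvm():
        raise SaveError("Saving HVM guest on ia64 platform is not implemented")
    dominfo.suspend()
    return "saved"


guest = FakeDomain(hvm=True)
try:
    save_domain(guest, "ia64")
except SaveError as exc:
    print("Error:", exc)
print(guest.suspended)  # prints False
```

Because the exception is raised before suspend() is called, the guest keeps running, which matches the behaviour described in the testing note.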
Comment 25 Miroslav Rezanina 2010-10-01 03:09:44 EDT
Fix built into xen-3.0.3-117.el5
Comment 26 Linqing Lu 2010-10-27 01:13:35 EDT
Verified on RHEL5.5-Server-20100322.0-ia64 host, RHEL-Server-5.5-hvm-ia64 guest.
"xm save" returns expected error message, and the guest still works well.

-----------------------

# xm li
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     3996     2 r-----  16414.4
vm3                                       27     1047     1 -b----   1943.7

# xm save vm3 vm3.save
Error: Saving HVM guest on ia64 is not implemented
Usage: xm save <Domain> <CheckpointFile>

Save a domain state to restore later.
Comment 29 errata-xmlrpc 2011-01-13 17:17:24 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0031.html
