Bug 480118 - [RHEL5 Xen]: xm dump-core with 32-on-64 PV guest fails
Summary: [RHEL5 Xen]: xm dump-core with 32-on-64 PV guest fails
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.3
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Keywords:
Depends On: 425411
Blocks:
Reported: 2009-01-15 08:14 UTC by Chris Lalancette
Modified: 2018-10-20 01:21 UTC
Clone Of:
Last Closed: 2009-09-02 10:09:06 UTC


Attachments
Patch to fix this bug (12.16 KB, patch)
2009-02-02 13:27 UTC, Jiri Denemark


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2009:1328
Priority: normal
Status: SHIPPED_LIVE
Summary: xen bug fix and enhancement update
Last Updated: 2009-09-01 10:32:30 UTC

Description Chris Lalancette 2009-01-15 08:14:28 UTC
Description of problem:
If you start a 32-bit paravirtualized guest on a 64-bit hypervisor and then attempt to run:

# xm dump-core -C <32bitguest>

The dump-core will fail with:

[root@amd1 ~]# xm dump-core -C rhel5pv_i386
Dumping core of domain: rhel5pv_i386 ...
Error: Failed to dump core: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14) (14 = Bad address)")
Usage: xm dump-core [-L|--live] [-C|--crash] <Domain> [Filename]

Dump core for a specific domain.
[root@amd1 ~]#

In /var/log/xen/xend.log, you'll also see the following:

[2009-01-15 03:11:32 xend 4030] INFO (XendDomain:415) Domain core dump requested for domain rhel5pv_i386 (1) live=0 crash=1.
[2009-01-15 03:11:32 xend.XendDomainInfo 4030] ERROR (XendDomainInfo:1102) XendDomainInfo.dumpCore failed: id = 1 name = rhel5pv_i386
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1097, in dumpCore
    xc.domain_dumpcore(self.domid, corefile)
Error: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14) (14 = Bad address)")

So, at least to start with, this is *probably* a user-side issue.  It may end up being a kernel-side issue as well, but we have to start by analyzing what is going on in user space first.

FYI, there was also a patch posted today to xen-devel that may help with this issue:

http://lists.xensource.com/archives/html/xen-devel/2009-01/msg00412.html

Comment 1 Daniel Berrange 2009-01-15 13:38:40 UTC
This is committed upstream now too

http://xenbits.xensource.com/staging/xen-unstable.hg?rev/ecf603780f56

Comment 2 Jiri Denemark 2009-02-02 13:27:14 UTC
Created attachment 330623
Patch to fix this bug

This issue is quite similar to the 32-on-64 save/restore bug as the core-dumping feature is implemented in a very smart way: several pieces of the code are just verbatim copies from the code for saving guests.

Upstream fixed it in c/s 19046 (http://xenbits.xensource.com/xen-unstable.hg?rev/ecf603780f56) and c/s 19052 (http://xenbits.xensource.com/xen-unstable.hg?rev/0ab57e6e440a); the second changeset actually fixes the first one.

The attached patch is a backported version of the two upstream changesets
with the following changes:
    - irrelevant part omitted
    - macros and data types introduced by 32-on-64 save/restore bugfix were
      reused by including xg_save_restore.h instead of duplicating them
    - compat unions *_either_t & co. were renamed *_any_t in upstream in the
      meantime; this patch uses the old names
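
As a rough illustration of the compat-union idea mentioned above (this is not code from the attached patch), the sketch below shows how a tool built for a 64-bit host can read a field of a guest structure through whichever layout matches the guest's word size. All names here (demo_info_x86_32, demo_info_x86_64, demo_info_either_t, demo_frame_list_list) and the two-field layouts are hypothetical simplifications, not the definitions from xg_save_restore.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily simplified guest structures. The real Xen
 * structures contain many more fields; only the width difference
 * matters for this sketch. */
struct demo_info_x86_32 {
    uint32_t max_pfn;
    uint32_t pfn_to_mfn_frame_list_list;
};

struct demo_info_x86_64 {
    uint64_t max_pfn;
    uint64_t pfn_to_mfn_frame_list_list;
};

/* "Either" union: the same memory viewed through the 32-bit or the
 * 64-bit layout. */
typedef union {
    struct demo_info_x86_32 x32;
    struct demo_info_x86_64 x64;
} demo_info_either_t;

/* Return the frame-list-list value using the layout that matches the
 * guest's word size in bytes (4 for a 32-bit guest, 8 for 64-bit). */
static uint64_t demo_frame_list_list(const demo_info_either_t *info,
                                     unsigned int guest_width)
{
    assert(guest_width == 4 || guest_width == 8);
    return guest_width == 8 ? info->x64.pfn_to_mfn_frame_list_list
                            : info->x32.pfn_to_mfn_frame_list_list;
}
```

With a structure filled in by a 32-bit guest, calling demo_frame_list_list() with a width of 4 returns the value the guest stored, while a width of 8 reads from a different offset and returns something unrelated. Mapping such a bogus frame number would be expected to fail, which is consistent with the "Bad address" error in the description, although the sketch does not claim to reproduce the exact failure path.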

Comment 10 Jiri Denemark 2009-02-23 11:10:07 UTC
A test package which fixes this issue (and several others as well) has been
made available at:

http://people.redhat.com/jdenemar/xen/

Could the reporter try it out and report if it fixes the problem or not?

Thank you for your cooperation.

Comment 11 Jiri Denemark 2009-03-02 10:22:39 UTC
Fix built into xen-3.0.3-81.el5

Comment 13 Chris Ward 2009-07-03 18:20:42 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update the Verified field with the appropriate value.

Questions can be posted to this bug or directed to your customer or partner representative.

Comment 14 Yewei Shao 2009-07-28 08:04:34 UTC
Verified on xen-3.0.3-91.el5

Comment 16 errata-xmlrpc 2009-09-02 10:09:06 UTC
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1328.html

