Bug 480118

Summary: [RHEL5 Xen]: xm dump-core with 32-on-64 PV guest fails
Product: Red Hat Enterprise Linux 5
Reporter: Chris Lalancette <clalance>
Component: xen
Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: low
Version: 5.3
CC: berrange, cward, jplans, mshao, pep, samuel.kielek, tao, xen-maint
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-09-02 10:09:06 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 425411
Bug Blocks:
Attachments:
Patch to fix this bug (Flags: none)

Description Chris Lalancette 2009-01-15 08:14:28 UTC
Description of problem:
If you start a 32-bit paravirtualized guest on a 64-bit hypervisor, and then attempt to run:

# xm dump-core -C <32bitguest>

The dump-core will fail with:

[root@amd1 ~]# xm dump-core -C rhel5pv_i386
Dumping core of domain: rhel5pv_i386 ...
Error: Failed to dump core: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14) (14 = Bad address)")
Usage: xm dump-core [-L|--live] [-C|--crash] <Domain> [Filename]

Dump core for a specific domain.
[root@amd1 ~]#

In /var/log/xen/xend.log, you'll also see the following:

[2009-01-15 03:11:32 xend 4030] INFO (XendDomain:415) Domain core dump requested for domain rhel5pv_i386 (1) live=0 crash=1.
[2009-01-15 03:11:32 xend.XendDomainInfo 4030] ERROR (XendDomainInfo:1102) XendDomainInfo.dumpCore failed: id = 1 name = rhel5pv_i386
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1097, in dumpCore
    xc.domain_dumpcore(self.domid, corefile)
Error: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14) (14 = Bad address)")

So, at least to start with, this is *probably* a user-side issue. It may end up being a kernel-side issue as well, but we have to start by analyzing what is going on on the user side first.

FYI, there was also a patch posted today to xen-devel that may help with this issue:

http://lists.xensource.com/archives/html/xen-devel/2009-01/msg00412.html

Comment 1 Daniel Berrangé 2009-01-15 13:38:40 UTC
This is committed upstream now too

http://xenbits.xensource.com/staging/xen-unstable.hg?rev/ecf603780f56

Comment 2 Jiri Denemark 2009-02-02 13:27:14 UTC
Created attachment 330623
Patch to fix this bug

This issue is quite similar to the 32-on-64 save/restore bug as the core-dumping feature is implemented in a very smart way: several pieces of the code are just verbatim copies from the code for saving guests.

Upstream got it fixed by c/s 19046 (http://xenbits.xensource.com/xen-unstable.hg?rev/ecf603780f56) and c/s 19052 (http://xenbits.xensource.com/xen-unstable.hg?rev/0ab57e6e440a), which actually fixes the first one.

The attached patch is a backported version of the two upstream changesets
with the following changes:
    - irrelevant part omitted
    - macros and data types introduced by 32-on-64 save/restore bugfix were
      reused by including xg_save_restore.h instead of duplicating them
    - compat unions *_either_t & co. were renamed *_any_t in upstream in the
      meantime; this patch uses the old names
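
To illustrate the kind of problem the compat unions address, here is a minimal sketch in C. It is not code from the attached patch or from libxc: every demo_* name and the two-field structure layout are hypothetical, and the field name p2m_frame_list_list is only borrowed from the error message above. The sketch assumes that a 32-bit guest stores frame numbers as 32-bit values, so a 64-bit tool has to pick the right structure layout based on the guest width and widen any frame list it reads before treating the entries as native words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts: the same guest structure as a 32-bit and a 64-bit
 * guest would write it. Only the field widths differ. */
struct demo_arch_info_32 {
    uint32_t max_pfn;
    uint32_t p2m_frame_list_list;
};

struct demo_arch_info_64 {
    uint64_t max_pfn;
    uint64_t p2m_frame_list_list;
};

/* "Either" union in the spirit of the *_either_t types mentioned above:
 * one buffer, two possible interpretations. */
typedef union {
    struct demo_arch_info_32 x32;
    struct demo_arch_info_64 x64;
} demo_arch_info_either_t;

/* Read the field through the view that matches the guest word size
 * (guest_width is in bytes: 4 or 8). */
static uint64_t demo_get_fll(const demo_arch_info_either_t *info,
                             unsigned int guest_width)
{
    assert(guest_width == 4 || guest_width == 8);
    return guest_width == 8 ? info->x64.p2m_frame_list_list
                            : info->x32.p2m_frame_list_list;
}

/* Widen a list of 32-bit frame numbers to 64-bit entries in place.
 * buf must have room for entries * 8 bytes. Walking from the last entry
 * to the first ensures no unread 32-bit value is overwritten. */
static void demo_widen_frame_list(void *buf, size_t entries)
{
    unsigned char *bytes = buf;

    for (size_t i = entries; i-- > 0; ) {
        uint32_t narrow;
        uint64_t wide;

        memcpy(&narrow, bytes + i * sizeof(narrow), sizeof(narrow));
        wide = narrow;
        memcpy(bytes + i * sizeof(wide), &wide, sizeof(wide));
    }
}
```

Under these assumptions, a caller would pass 4 as guest_width for a 32-on-64 PV guest and widen each mapped frame list before using its entries as 64-bit frame numbers. Reading the same buffer through the 64-bit view instead yields an unrelated value, which is one plausible way to end up trying to map a bogus frame.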

Comment 10 Jiri Denemark 2009-02-23 11:10:07 UTC
A test package which fixes this issue (and several others as well) has been
made available at:

http://people.redhat.com/jdenemar/xen/

Could the reporter try it out and report if it fixes the problem or not?

Thank you for your cooperation.

Comment 11 Jiri Denemark 2009-03-02 10:22:39 UTC
Fix built into xen-3.0.3-81.el5

Comment 13 Chris Ward 2009-07-03 18:20:42 UTC
~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.

Questions can be posted to this bug or directed to your customer or partner representative.

Comment 14 Yewei Shao 2009-07-28 08:04:34 UTC
Verified on xen-3.0.3-91.el5

Comment 16 errata-xmlrpc 2009-09-02 10:09:06 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1328.html