Description of problem:
If you start a 32-bit paravirtualized guest on a 64-bit hypervisor, and then attempt to run:
# xm dump-core -C <32bitguest>
The dump-core will fail with:
[root@amd1 ~]# xm dump-core -C rhel5pv_i386
Dumping core of domain: rhel5pv_i386 ...
Error: Failed to dump core: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14) (14 = Bad address)")
Usage: xm dump-core [-L|--live] [-C|--crash] <Domain> [Filename]
Dump core for a specific domain.
In /var/log/xen/xend.log, you'll also see the following:
[2009-01-15 03:11:32 xend 4030] INFO (XendDomain:415) Domain core dump requested for domain rhel5pv_i386 (1) live=0 crash=1.
[2009-01-15 03:11:32 xend.XendDomainInfo 4030] ERROR (XendDomainInfo:1102) XendDomainInfo.dumpCore failed: id = 1 name = rhel5pv_i386
Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1097, in dumpCore
Error: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14) (14 = Bad address)")
So, at least to start with, this is *probably* a user-side issue. It may end up being a kernel-side issue as well, but we have to start by analyzing what is going on on the user side first.
FYI, there was also a patch posted today to xen-devel that may help with this issue:
This is committed upstream now too.
Created attachment 330623
Patch to fix this bug
This issue is quite similar to the 32-on-64 save/restore bug as the core-dumping feature is implemented in a very smart way: several pieces of the code are just verbatim copies from the code for saving guests.
Upstream got it fixed by c/s 19046 (http://xenbits.xensource.com/xen-unstable.hg?rev/ecf603780f56) and c/s 19052 (http://xenbits.xensource.com/xen-unstable.hg?rev/0ab57e6e440a), which actually fixes the first one.
The attached patch is a backported version of the two upstream changesets
with the following changes:
- irrelevant part omitted
- macros and data types introduced by the 32-on-64 save/restore bugfix were
  reused by including xg_save_restore.h instead of duplicating them
- compat unions *_either_t & co. were renamed *_any_t in upstream in the
  meantime; this patch uses the old names
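
For anyone following along, here is a rough illustration of why the guest width matters. It is only a sketch, not code from the patch: the helper name read_pfn, the buffer layout, and the values are all made up for the example. The assumption is that a 64-bit tool reading a frame number out of a 32-bit guest's structure with its own native field width picks up the neighbouring field as well and ends up with a bogus frame number, which is consistent with the "Bad address" error above.

```python
import struct

def read_pfn(buf, offset, guest_width):
    """Read a little-endian frame number using the guest's word size.

    guest_width is 4 for a 32-bit guest and 8 for a 64-bit guest.
    Hypothetical helper for illustration only.
    """
    fmt = "<I" if guest_width == 4 else "<Q"
    return struct.unpack_from(fmt, buf, offset)[0]

# Two adjacent 32-bit fields, as a 32-bit guest would lay them out.
# The values are invented for the example.
guest_struct = struct.pack("<II", 0x1234, 0x5678)

# Reading with the guest's own width gives the intended frame number.
correct = read_pfn(guest_struct, 0, 4)

# Reading with the 64-bit tool's native width merges both fields.
wrong = read_pfn(guest_struct, 0, 8)

print(hex(correct))
print(hex(wrong))
```

The compat unions mentioned above serve the same purpose on the C side: one type covers both layouts and the code selects the member that matches the guest's width.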
A test package which fixes this issue (and several others as well) has been
made available at:
Could the reporter try it out and report if it fixes the problem or not?
Thank you for your cooperation.
Fix built into xen-3.0.3-81.el5
~~ Attention - RHEL 5.4 Beta Released! ~~
RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!
If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.
Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value.
Questions can be posted to this bug or directed to your customer or partner representative.
Verified on xen-3.0.3-91.el5
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.