Bug 221273
| Summary: | crash on live RHEL4 x86_64 domU fails to start | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Jeff Layton <jlayton> | ||||
| Component: | kernel | Assignee: | Chris Lalancette <clalance> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 4.5 | CC: | anderson, ddutile, steved, xen-maint | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | RHBA-2007-0304 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2007-05-08 04:35:55 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
Description
Jeff Layton
2007-01-03 14:36:16 UTC
Three questions first:
1. Did this work OK on earlier kernels?
2. Does the same thing happen when you xendump it?
3. What's the output of "crash -d7"?

Here's the crash -d7 output. The other 2 questions I'm not sure of:
1) I haven't tested earlier kernels. Do you have a suggested one I should test?
2) by xendump do you mean an "xm save" or something else?

----------[snip]-----------
# crash -d7

crash 4.0-3.16
Copyright (C) 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Enter "help copying" to see the conditions. This program has absolutely no warranty. Enter "help warranty" for details.

find_booted_kernel: search for [Linux version 2.6.9-42.36.EL.TEST.bz184549.3xenU (root.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 SMP Wed Jan 3 08:23:41 EST 2007 ]
mount_points[0]: / (947210)
mount_points[1]: /proc (947230)
mount_points[2]: /dev (947250)
mount_points[3]: / (947270)
mount_points[4]: /dev (947290)
mount_points[5]: /selinux (9472b0)
mount_points[6]: /proc (9472d0)
mount_points[7]: /sys (9472f0)
mount_points[8]: /dev/pts (947310)
mount_points[9]: /dev/shm (947330)
mount_points[10]: /proc/sys/fs/binfmt_misc (947350)
mount_points[11]: /var/lib/nfs/rpc_pipefs (947390)
searchdirs[8]: /usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/
searchdirs[0]: /usr/src/linux/
searchdirs[1]: /boot/
searchdirs[2]: /boot/efi/redhat
searchdirs[3]: /boot/efi/EFI/redhat
searchdirs[4]: /
searchdirs[5]: /usr/src/debug/
searchdirs[6]: /usr/src/redhat/BUILD/kernel-2.6.9/linux/
searchdirs[7]: /usr/src/redhat/BUILD/kernel-2.6.9/linux-2.6.9/
find_booted_kernel: check: /usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux
find_booted_kernel: found: /usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux
get_live_memory_source: /dev/crash
/sbin/modprobe crash
"crash" module loaded: [crash][7745][0]
/proc/misc: 62 crash => 10/62
/proc/version: Linux version 2.6.9-42.36.EL.TEST.bz184549.3xenU (root.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 SMP Wed Jan 3 08:23:41 EST 2007
/usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux: Linux version 2.6.9-42.36.EL.TEST.bz184549.3xenU (root.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 SMP Wed Jan 3 08:23:41 EST 2007

gdb /usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

<readmem: ffffffff8038db08, KVADDR, "phys_to_machine_mapping", 8, (FOE), 8f12d8>
crash: read error: kernel virtual address: ffffffff8038db08 type: "phys_to_machine_mapping"
"crash" module loaded: [crash][7745][0]
/sbin/rmmod crash

> 1) I haven't tested earlier kernels. Do you have a suggested one I should test?

Apparently crash was never tested on live x86_64 domU kernels, only on xendump and "xm save" dumpfiles. So I'm presuming that it doesn't work with any "stock" RHEL4 x86_64 domU kernel?

> 2) by xendump do you mean an "xm save" or something else?

Alt-sysrq-c from the domU or "xm dump-core" from the dom0.

If I do an "xm dump-core" and then run crash against the corefile with the same debuginfo it works fine. The patches in my kernel are all NFS-related, so I doubt they are affecting this, but I'll test with a stock -42.36 kernel and report back.

I get the same results with a stock -42.36 kernel as well:

crash: read error: kernel virtual address: ffffffff8038db08 type: "phys_to_machine_mapping"

Ok thanks. That's the very first read attempted, so we're not getting too far are we? It's not entirely clear at this point whether this is a crash utility or a /dev/crash driver (kernel) issue. If you do a "dmesg" just after the crash utility attempt fails, are there any messages from the /dev/crash driver? Chris, do you have a set-up that I can work/squat on?

Yep, there are some messages that go to the ring buffer when it fails:

crash memory driver: version 1.0
crash memory driver: !page_is_ram(pfn: 38d)
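For readers following along, the pfn in that driver message lines up with the address in the failing read. A minimal sketch of the arithmetic, assuming the usual x86_64 kernel text mapping base of 0xffffffff80000000, 4 KiB pages, and a kernel image at physical offset 0 within that mapping (the helper name `kvaddr_to_pfn` is made up for illustration and is not part of crash or the kernel):

```c
#include <stdint.h>

/* Assumed x86_64 constants: base of the kernel text mapping, 4 KiB pages. */
#define START_KERNEL_MAP 0xffffffff80000000ULL
#define PAGE_SHIFT       12

/*
 * Hypothetical helper: convert a kernel-text virtual address to a page
 * frame number, assuming the kernel image sits at offset 0 of the mapping.
 */
static uint64_t kvaddr_to_pfn(uint64_t kvaddr)
{
    return (kvaddr - START_KERNEL_MAP) >> PAGE_SHIFT;
}
```

Under those assumptions, `kvaddr_to_pfn(0xffffffff8038db08ULL)` gives 0x38d, which is the pfn the /dev/crash driver rejected, so the rejected page looks like the one holding phys_to_machine_mapping.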
Ok good -- this is a RHEL4 x86_64 xen kernel issue with page_is_ram().
In RHEL5, it's defined in arch/x86_64/mm/init-xen.c as:
static inline int page_is_ram (unsigned long pagenr)
{
        return 1;
}
EXPORT_SYMBOL_GPL(page_is_ram);
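To make the failure mode concrete, here is a small user-space model of a read path gated on a page_is_ram()-style predicate. It is only a sketch: the names `model_crash_read`, `page_is_ram_reject` and `page_is_ram_xen` are hypothetical, the -EFAULT return value is an assumption, and none of this is the /dev/crash driver's actual source. The only thing taken from this report is that the driver logs "!page_is_ram(pfn: ...)" when it refuses a page.

```c
#include <errno.h>

/* Hypothetical predicate type standing in for the kernel's page_is_ram(). */
typedef int (*page_is_ram_fn)(unsigned long pfn);

/* Model of a predicate that does not recognise the pfn (illustrative only). */
static int page_is_ram_reject(unsigned long pfn)
{
    (void)pfn;
    return 0;
}

/* Model of the RHEL5 Xen definition quoted above: every page counts as RAM. */
static int page_is_ram_xen(unsigned long pfn)
{
    (void)pfn;
    return 1;
}

/*
 * Hypothetical read gate: refuse any page the predicate does not accept.
 * Returning -EFAULT here is an assumption made for the model.
 */
static int model_crash_read(page_is_ram_fn is_ram, unsigned long pfn)
{
    if (!is_ram(pfn))
        return -EFAULT;
    return 0;
}
```

In this model, `model_crash_read(page_is_ram_reject, 0x38d)` fails while `model_crash_read(page_is_ram_xen, 0x38d)` succeeds, which is the behavioural difference between the failing RHEL4 case and the RHEL5 definition.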
Chris is going to address this in his kernel one way or another.
Jeff, you did absolutely confirm that the i686 RHEL4 xenU kernel
runs crash OK on a live system, right?
Yes, it definitely works on i686. Though there, I'm using an older rev of crash:

crash 4.0-3.4

...I'm presuming that doesn't matter though.

It shouldn't -- but the version of crash that I'm currently pushing through the RHEL4-U5 errata process is 4.0-3.9, which you can grab from:

porkchop:/mnt/redhat/brewroot/packages/crash/4.0/3.9

i686 still works with 4.0-3.9

Created attachment 144723
Fix for page_is_ram for x86_64 live crash
committed in stream U5 build 42.40. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

QE ack for RHEL4.5.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html