Bug 237383 - Crash Command Fails with "crash: read error: kernel virtual address: ffffffff8062a180 type: "xtime"" On Latest Fedora Kernel...
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: crash
Version: 6
Hardware: x86_64 Linux
Priority: medium  Severity: medium
Assigned To: Dave Anderson
Reported: 2007-04-21 13:08 EDT by Ken Robson
Modified: 2007-11-30 17:12 EST (History)
Last Closed: 2007-04-26 16:03:09 EDT

Attachments
Sysreport ... (9.30 MB, application/octet-stream)
2007-04-21 13:08 EDT, Ken Robson

Description Ken Robson 2007-04-21 13:08:53 EDT
Description of problem:
Crash Command Fails with "crash: read error: kernel virtual address:
ffffffff8062a180  type: "xtime"" On Latest Fedora Kernel...

Version-Release number of selected component (if applicable):
crash-4.0-3.22 (Updated from stock Fedora to latest available version but still
get same issue)

How reproducible:
Try to start the crash command

Steps to Reproduce:
1. run crash
  
Actual results:
[root@lonfedphymas1 ~]# crash

crash 4.0-3.22
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

crash: read error: kernel virtual address: ffffffff8062a180  type: "xtime"


Expected results:
A running crash session

Additional info:
Have added a sysreport
Comment 1 Ken Robson 2007-04-21 13:09:09 EDT
Created attachment 153243
Sysreport ...
Comment 2 Dave Anderson 2007-04-23 08:44:09 EDT
A couple things...

Please indicate the exact kernel version(s) that you are seeing the
failure on.

And secondly, please attach the output of: "crash -d7"
Comment 3 Dave Anderson 2007-04-23 16:38:14 EDT
Never mind the crash -d7 and version info request -- I have a machine, and see
the same thing.
Comment 4 Dave Anderson 2007-04-23 17:31:00 EDT
I'm not sure what the deal is with Fedora kernels, but the crash.ko memory
driver (/dev/crash) is failing because every call that the driver makes to
page_is_ram() is failing.

The crash.ko memory driver was introduced into RHEL kernels because /dev/mem
is severely restricted to the first 256 pages of physical memory.  The Fedora
kernel just inherits the crash.ko driver from RHEL.  It seems to have worked
OK in the original FC6 (2.6.18-based) kernel, but something must have changed
somewhere between then and the 2.6.20-based kernels that FC6 uses now.

Anyway, the newer Fedora kernel doesn't seem to have the /dev/mem restriction,
so /dev/mem can be used (instead of defaulting to /dev/crash).  But it has to
be explicitly put on the crash command line:

# uname -r
2.6.20-1.2944.fc6

# crash /dev/mem

crash 4.0-3.22
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: /vmlinux-2.6.20-1.2944.fc6
    DUMPFILE: /dev/mem
        CPUS: 8
        DATE: Mon Apr 23 17:19:37 2007
      UPTIME: 01:11:30
LOAD AVERAGE: 0.31, 0.35, 0.29
       TASKS: 148
    NODENAME: hp-dl380g5-01.rhts.boston.redhat.com
     RELEASE: 2.6.20-1.2944.fc6
     VERSION: #1 SMP Tue Apr 10 17:46:00 EDT 2007
     MACHINE: x86_64  (1866 Mhz)
      MEMORY: 2 GB
         PID: 5238
     COMMAND: "crash"
        TASK: ffff81007d7dc7c0  [THREAD_INFO: ffff81003ce8e000]
         CPU: 5
       STATE: TASK_RUNNING (ACTIVE)

crash>

I don't tinker with Fedora kernels, but if I get the time, I'll try to
figure out why page_is_ram() fails every time.
Comment 5 Ken Robson 2007-04-24 00:35:33 EDT
Thanks for your prompt assistance, Dave - this workaround solves my initial issue.
Comment 6 Dave Anderson 2007-04-24 09:42:12 EDT
The problem looks to be this change to the kernel's declaration
of the x86_64 e820_map:

   struct e820map e820 __initdata;

By making it "__initdata", the structure's memory gets reallocated
after kernel initialization, and therefore page_is_ram() can no longer
function correctly.
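
As a rough illustration of the effect, here is a minimal userspace sketch,
not the kernel code itself. The struct layout mirrors the kernel's e820map
only loosely, page_is_ram() here is a simplified stand-in for the
RHEL/Fedora function, and e820_boot_setup(), e820_simulate_init_free() and
e820_demo() are hypothetical helpers invented for this example. Zeroing the
map is only one possible outcome; reused init memory can hold arbitrary data.

```c
#include <assert.h>
#include <string.h>

#define E820_RAM   1      /* e820 type code for usable RAM */
#define E820MAX    8      /* small table, enough for this model */
#define PAGE_SHIFT 12     /* 4 KB pages */

struct e820entry {
    unsigned long long addr;   /* start of the region */
    unsigned long long size;   /* length of the region in bytes */
    unsigned int type;         /* E820_RAM or some other type */
};

struct e820map {
    int nr_map;                        /* number of valid entries */
    struct e820entry map[E820MAX];
};

/* In the affected kernels this object was tagged __initdata. */
struct e820map e820;

/* Simplified stand-in: a page is RAM if an E820_RAM entry covers it. */
int page_is_ram(unsigned long pagenr)
{
    unsigned long long addr = (unsigned long long)pagenr << PAGE_SHIFT;

    for (int i = 0; i < e820.nr_map; i++) {
        const struct e820entry *e = &e820.map[i];

        if (e->type == E820_RAM && addr >= e->addr && addr < e->addr + e->size)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: one RAM region from 1 MB up to 2 GB. */
void e820_boot_setup(void)
{
    e820.nr_map = 1;
    e820.map[0] = (struct e820entry){ 0x100000ULL, 0x7ff00000ULL, E820_RAM };
}

/* Hypothetical helper: model the init section being freed and reused. */
void e820_simulate_init_free(void)
{
    memset(&e820, 0, sizeof(e820));
}

/* Page 0x1000 (the 16 MB mark) is RAM only while the map is intact. */
void e820_demo(void)
{
    e820_boot_setup();
    assert(page_is_ram(0x1000) == 1);

    e820_simulate_init_free();
    assert(page_is_ram(0x1000) == 0);
}
```

Once the map contents are gone, every lookup misses, which matches the
"every call ... to page_is_ram() is failing" symptom seen in crash.ko.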

Comment 7 Dave Anderson 2007-04-26 15:49:15 EDT
Posted patch to Fedora-kernel-list@redhat.com:

 https://www.redhat.com/archives/fedora-kernel-list/2007-April/msg00037.html


   From: Dave Anderson <anderson@redhat.com>
     To: Fedora-kernel-list@redhat.com
     Cc:
Subject: [PATCH FC6] x86_64 page_is_ram() uses __initdata; breaks /dev/crash and /dev/mem restriction
   Date: Thu, 26 Apr 2007 15:40:00 -0400

Somewhere after the 2.6.18 timeframe, Andi Kleen made the
x86_64 e820 map __initdata:

  struct e820map e820 __initdata;

This is fine for upstream x86_64 kernels, because the e820 map never
gets used during runtime.  But because we (RHEL/Fedora) have an x86_64
version of page_is_ram(), it ends up using __init data.

I became aware of this when I got an FC6/crash-utility bugzilla #237383,
filed because the crash utility started failing on later FC6 live systems.
The kernel's /dev/crash module, crash.ko, started failing because it
uses page_is_ram() as a memory access qualifier.

Also, because of Red Hat's restriction on the use of /dev/mem to only
the first 256 RAM pages, the __initdata addition ends up opening the flood
gates for /dev/mem usage.  That's because of this:

  /*
   * devmem_is_allowed() checks to see if /dev/mem access to a certain address is
   * valid. The argument is a physical page number.
   *
   *
   * On x86-64, access has to be given to the first megabyte of ram because that
   * area contains bios code and data regions used by X and dosemu and similar apps.
   * Access has to be given to non-kernel-ram areas as well, these contain the PCI
   * mmio resources as well as potential bios/acpi data regions.
   */
  int devmem_is_allowed(unsigned long pagenr)
  {
          if (pagenr <= 256)
                  return 1;
          if (!page_is_ram(pagenr))
                  return 1;
          return 0;
  }

The function is meant to allow /dev/mem accesses above pagenr 256
only if they are *not* RAM -- but since page_is_ram() is failing,
it inadvertently allows access to any pagenr.
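
The failure mode can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the kernel code: the check from
devmem_is_allowed() is copied as-is, but it takes the RAM predicate as a
parameter so a working and a broken page_is_ram() can be compared.
devmem_is_allowed_with(), ram_check_working(), ram_check_broken() and
devmem_demo() are hypothetical names, and the 2 GB RAM layout is invented.

```c
#include <assert.h>

/* Signature of a page_is_ram()-style predicate. */
typedef int (*ram_check_fn)(unsigned long pagenr);

/* Hypothetical layout: pages above 256 and below 2 GB (0x80000) are RAM. */
int ram_check_working(unsigned long pagenr)
{
    return pagenr > 256 && pagenr < 0x80000;
}

/* Models page_is_ram() after the e820 map is lost: nothing is RAM. */
int ram_check_broken(unsigned long pagenr)
{
    (void)pagenr;
    return 0;
}

/* Same logic as devmem_is_allowed(), with the predicate passed in. */
int devmem_is_allowed_with(ram_check_fn page_is_ram, unsigned long pagenr)
{
    if (pagenr <= 256)
        return 1;               /* low memory is always allowed */
    if (!page_is_ram(pagenr))
        return 1;               /* non-RAM (e.g. MMIO) is allowed */
    return 0;                   /* RAM above the low region is refused */
}

void devmem_demo(void)
{
    /* Working predicate: low page allowed, RAM page refused, MMIO allowed. */
    assert(devmem_is_allowed_with(ram_check_working, 100) == 1);
    assert(devmem_is_allowed_with(ram_check_working, 0x1000) == 0);
    assert(devmem_is_allowed_with(ram_check_working, 0x90000) == 1);

    /* Broken predicate: the same RAM page is now allowed. */
    assert(devmem_is_allowed_with(ram_check_broken, 0x1000) == 1);
}
```

With the broken predicate, the second test in the function can never refuse
a page, so the restriction silently disappears.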

(Interestingly enough, this bug allows the user of the crash
utility to work around the /dev/crash failure by alternatively
using /dev/mem instead!  /dev/crash was only created to begin
with because of the /dev/mem restriction.)

Anyway, reverting back from __initdata fixes the situation for both
/dev/crash and the /dev/mem restriction.

Dave Anderson

(patch is against 2.6.20-1.2944)



--- linux-2.6.20.x86_64/arch/x86_64/kernel/e820.c.orig	2007-04-26 14:38:12.000000000 -0400
+++ linux-2.6.20.x86_64/arch/x86_64/kernel/e820.c	2007-04-26 14:38:24.000000000 -0400
@@ -25,7 +25,7 @@
 #include <asm/bootsetup.h>
 #include <asm/sections.h>
 
-struct e820map e820 __initdata;
+struct e820map e820;
 
 /* 
  * PFN of last memory page.
Comment 8 Dave Anderson 2007-04-26 16:03:09 EDT
Hi Ken,

I'm going to close this as NOTABUG, since the crash command line usage
of /dev/mem is legitimate.  However, when it exists, /dev/crash should
always be used by default, and so the bug is truly in the
Fedora kernel.  I appreciate the report.

Thanks,
  Dave


