Bug 210291

Summary: kdump file size is almost double in size than that of RAM size of the system
Product: [Fedora] Fedora Reporter: IBM Bug Proxy <bugproxy>
Component: kernel Assignee: Neil Horman <nhorman>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhide CC: davej, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2007-04-23 18:58:15 UTC Type: ---

Description IBM Bug Proxy 2006-10-11 12:03:11 UTC
LTC Owner is: ankigarg.com
LTC Originator is: iranna.ankad.com


---Problem Description---
The dump file is almost double the size of the system's RAM.

[root@x370 ~]# cat /proc/meminfo | grep Mem
MemTotal:      3760836 kB
MemFree:       3180944 kB

[root@x370 ~]# du -sh /var/crash/2006-09-22-22\:01/
7.3G    /var/crash/2006-09-22-22:01/

[root@x370 ~]# ls -l /var/crash/2006-09-22-22\:01/
total 7569664
-r-------- 1 root root 8065486848 Sep 22 22:09 vmcore
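
For reference, the ratio behind "almost double" can be worked out from the two figures above. A small Python sketch; note that MemTotal is slightly less than the installed RAM, so the ratio is only approximate:

```python
MEMTOTAL_KB = 3760836        # MemTotal from /proc/meminfo above
VMCORE_BYTES = 8065486848    # vmcore size from 'ls -l' above

# /proc/meminfo reports kB in units of 1024 bytes.
ratio = VMCORE_BYTES / (MEMTOTAL_KB * 1024)
print(f"vmcore is {ratio:.2f}x MemTotal")
```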


 
Contact Information = primary: iranna.ankad.com
 
---uname output---
2.6.17-1.2519.4.26.el5
 
Machine Type = x370
 
---Debugger---
A debugger is not configured
 
---Steps to Reproduce---
1. Configured kdump from the command line:
/sbin/kexec -p --command-line="ro root=LABEL=/ rhgb console=tty0 console=ttyS0,38400n1  irqpoll maxcpus=1" \
    --initrd=/boot/initrd-2.6.17-1.2519.4.26.el5.img \
    /boot/vmlinuz-2.6.17-1.2519.4.26.el5

2. echo 1 > /proc/sys/kernel/sysrq
   echo c > /proc/sysrq-trigger

  
Note:
1. /proc/vmcore also showed a file size of approximately 7.5 GB.
2. In both cases, I could open the dump files using crash, execute some
commands and see valid information.
3. Tried both kexec-tools-1.101-54.el5.i386.rpm (the default in
RHEL5-B1-Refresh) and the latest kexec-tools-1.101-69.el5.i386.rpm.

Tried kdump on an x206 machine, and this time the dump took up less disk space
(478 MB) than the RAM size (2 GB) of the machine.

However, all these dumps can be opened using crash, and valid information can be
retrieved from them.

[root@x206f ~]# grep Mem /proc/meminfo
MemTotal:      2007472 kB
MemFree:       1573736 kB

[root@x206f ~]# df -h       --> before taking dump
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             9.5G  5.8G  3.3G  64% /
/dev/sda1             130M   66M   58M  54% /boot
tmpfs                 981M     0  981M   0% /dev/shm
/dev/sda3             9.5G  482M  8.6G   6% /home

------------after taking dump, vmcore saved into /var/crash/ -----------------

[root@x206f ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             9.5G  6.2G  2.8G  70% /
/dev/sda1             130M   66M   58M  54% /boot
tmpfs                 981M     0  981M   0% /dev/shm
/dev/sda3             9.5G  482M  8.6G   6% /home

[root@x206f 2006-09-26-15:33]# ls -lh
total 478M
-r-------- 1 root root 2.0G Sep 26 15:34 vmcore
[root@x206f 2006-09-26-15:33]# ls -l
total 488480
-r-------- 1 root root 2079392416 Sep 26 15:34 vmcore
[root@x206f 2006-09-26-15:33]# du -sh vmcore
478M    vmcore


(In reply to comment #2)
> Tried kdump on x206 machine and this time I observed the dump size is smaller
> (478MB) than that of RAM size (2 GB)of the machine. 
> 
> However all these dumps can be opened using crash & find some information !!
> 
> [root@x206f ~]# grep Mem /proc/meminfo
> MemTotal:      2007472 kB
> MemFree:       1573736 kB
> 
> [root@x206f ~]# df -h       --> before taking dump
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda2             9.5G  5.8G  3.3G  64% /
> /dev/sda1             130M   66M   58M  54% /boot
> tmpfs                 981M     0  981M   0% /dev/shm
> /dev/sda3             9.5G  482M  8.6G   6% /home
> 
> ------------after taking dump, vmcore saved into /var/crash/ -----------------
> 
> [root@x206f ~]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda2             9.5G  6.2G  2.8G  70% /
> /dev/sda1             130M   66M   58M  54% /boot
> tmpfs                 981M     0  981M   0% /dev/shm
> /dev/sda3             9.5G  482M  8.6G   6% /home
> 

I feel this is not a true indicator of any discrepancy in the vmcore file size,
since the 'ls' and 'du' commands are indicating the right size.

> [root@x206f 2006-09-26-15:33]# ls -lh
> total 478M
   ^^^^^^^^^^ is the allocated disk space, in terms of number of blocks.
  
> -r-------- 1 root root 2.0G Sep 26 15:34 vmcore
                        ^^^^^^ the right size in bytes
  
> [root@x206f 2006-09-26-15:33]# ls -l
> total 488480
> -r-------- 1 root root 2079392416 Sep 26 15:34 vmcore
> [root@x206f 2006-09-26-15:33]# du -sh vmcore
> 478M    vmcore
 ^^^^^^^ This is again the allocated disk space, in number of blocks.
You need to specify the -b option to 'du' to obtain the apparent size in bytes.
The following is what I got on the same machine:

[root@x206f 2006-09-26-16:58]# du -sbh vmcore
2.0G    vmcore
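
The gap between the two figures is consistent with the vmcore being a sparse file: 'ls -l' and 'du -b' report the apparent size, while plain 'du' reports the disk space actually allocated. A minimal Python sketch of that distinction, assuming a Linux filesystem that supports sparse files (the file name and the 64 MiB size are made up for illustration):

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size in bytes, allocated bytes) for a file."""
    st = os.stat(path)
    # st_size is the apparent size, as shown by 'ls -l' and 'du -b'.
    # st_blocks counts the 512-byte units actually allocated on disk,
    # which is what plain 'du' is based on.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sparse-example")  # hypothetical file name
    with open(path, "wb") as f:
        f.truncate(64 * 1024 * 1024)  # extend to 64 MiB without writing data
    apparent, allocated = apparent_and_allocated(path)

print("apparent size :", apparent)
print("allocated size:", allocated)
```

On a filesystem with sparse-file support the allocated size stays far below the apparent size, the same pattern as 478M versus 2.0G above.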

(In reply to comment #0)

Could you provide the output of 'readelf -{h,l,S}' command for the vmcore file
you get? 

(In reply to comment #5)
> (In reply to comment #0)
> 
> Could you provide the output of 'readelf -{h,l,S}' command for the vmcore file
> you get? 

When I ran 'readelf' on the machine on which I took the dump, I got the following
error:

# readelf -h vmcore
readelf: Error: Could not locate 'vmcore'.  System error message: Value too
large for defined data type

But when I transferred the dump to another machine, the output of 'readelf' was:

# readelf -h vmcore
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         6
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

# readelf -l vmcore

Elf file type is CORE (Core file)
Entry point 0x0
There are 6 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  NOTE           0x0000f4 0x00000000 0x00000000 0x00290 0x00290     0
  LOAD           0x000384 0xc0000000 0x00000000 0xa0000 0xa0000 RWE 0
  LOAD           0x0a0384 0xc0100000 0x00100000 0xf00000 0xf00000 RWE 0
  LOAD           0xfa0384 0xc5000000 0x05000000 0x33000000 0x33000000 RWE 0
  LOAD           0x33fa0384 0xffffffff 0x38000000 0x9ffb0580 0x9ffb0580 RWE 0
  LOAD           0xd3f50904 0xffffffff 0x00000000 0x28000000 0x28000000 RWE 0
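
As a cross-check, the end of the last LOAD segment (Offset plus FileSiz) should equal the size of the vmcore file. A small Python sketch using the values copied from the program headers above; the tuple layout and function names are only for illustration:

```python
# (offset, physical address, file size) for each LOAD header listed above
LOAD_SEGMENTS = [
    (0x00000384, 0x00000000, 0x000A0000),
    (0x000A0384, 0x00100000, 0x00F00000),
    (0x00FA0384, 0x05000000, 0x33000000),
    (0x33FA0384, 0x38000000, 0x9FFB0580),
    (0xD3F50904, 0x00000000, 0x28000000),  # PhysAddr as printed by readelf
]


def expected_file_size(segments):
    """End offset of the segment that extends furthest into the file."""
    return max(offset + filesz for offset, _paddr, filesz in segments)


def segments_are_contiguous(segments):
    """True if each segment starts where the previous one ends."""
    return all(
        prev[0] + prev[2] == cur[0]
        for prev, cur in zip(segments, segments[1:])
    )


size = expected_file_size(LOAD_SEGMENTS)
print(size, hex(size))
print(segments_are_contiguous(LOAD_SEGMENTS))
```

The result is 4227139844 bytes, which matches the size that 'ls -l vmcore' reports for this dump in the environment details, so the file layout is consistent with its headers.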

I am working on this issue.

The details of the environment and the above dump are as follows : 

------Meminfo--------------

# cat /proc/meminfo | head
MemTotal:      3434544 kB
MemFree:       2973852 kB
Buffers:         59764 kB
Cached:         336072 kB
SwapCached:          0 kB
Active:         111656 kB
Inactive:       308188 kB
HighTotal:     2621120 kB
HighFree:      2259064 kB
LowTotal:       813424 kB

------Size of vmcore-----

# ls -l vmcore
-r-------- 1 root root 4227139844 Oct  5 10:26 vmcore
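
This size may also explain the readelf failure on the dump machine. 'Value too large for defined data type' is the EOVERFLOW message, which a 32-bit program built without large-file support gets when it stats a file whose size does not fit in a signed 32-bit off_t. Whether that readelf build lacked large-file support is an assumption, not something confirmed in this report; the arithmetic is:

```python
import errno
import os

VMCORE_SIZE = 4227139844      # from 'ls -l vmcore' above
MAX_32BIT_OFF_T = 2**31 - 1   # largest value of a signed 32-bit off_t


def fits_in_32bit_off_t(size):
    """True if a file of this size can be described by a 32-bit off_t."""
    return size <= MAX_32BIT_OFF_T


print(fits_in_32bit_off_t(VMCORE_SIZE))   # False
print(os.strerror(errno.EOVERFLOW))       # message text is platform dependent
```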

------On running crash------

# crash /usr/lib/debug/lib/modules/2.6.17-1.2519.4.26.el5/vmlinux vmcore

crash 4.0-3.1
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/2.6.17-1.2519.4.26.el5/vmlinux
    DUMPFILE: vmcore
        CPUS: 4
        DATE: Thu Oct  5 10:25:01 2006
      UPTIME: 00:24:57
LOAD AVERAGE: 0.07, 0.03, 0.01
       TASKS: 120
    NODENAME: llm19.in.ibm.com
     RELEASE: 2.6.17-1.2519.4.26.el5
     VERSION: #1 SMP Mon Sep 11 17:12:50 EDT 2006
     MACHINE: i686  (3400 Mhz)
      MEMORY: 3.4 GB
       PANIC: "Oops: 0000 [#1]" (check log for details)
         PID: 2210
     COMMAND: "bash"
        TASK: f69e8000  [THREAD_INFO: f44f9000]
         CPU: 2
       STATE: TASK_RUNNING (SYSRQ)
(In reply to comment #6)
-----Output of /proc/iomem

# cat /proc/iomem
00000000-0009d3ff : System RAM
0009d400-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c8fff : Video ROM
000c9000-000ca5ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-d7fb057f : System RAM
  00100000-00387b8a : Kernel code
  00387b8b-004698c7 : Kernel data
  01000000-04ffffff : Crash kernel
d7fb0580-d7fcffff : ACPI Tables
d7fd0000-d7ffffff : reserved
d9000000-dcffffff : PCI Bus #04
  d9000000-daffffff : PCI Bus #06
    daffe000-daffefff : 0000:06:01.1
    dafff000-daffffff : 0000:06:01.0
  db000000-dcffffff : PCI Bus #05
    dcfe0000-dcfeffff : 0000:05:01.1
      dcfe0000-dcfeffff : tg3
    dcff0000-dcffffff : 0000:05:01.0
      dcff0000-dcffffff : tg3
dd000000-deffffff : PCI Bus #02
  defe0000-defeffff : 0000:02:01.0
  deff0000-deffffff : 0000:02:01.0
df000000-df0fffff : PCI Bus #04
  df000000-df0fffff : PCI Bus #06
    df000000-df01ffff : 0000:06:01.0
    df020000-df03ffff : 0000:06:01.1
df100000-df1fffff : PCI Bus #02
  df100000-df1fffff : 0000:02:01.0
df200000-df2003ff : 0000:00:1f.1
df200400-df20040f : 0000:00:1d.4
  df200400-df20040f : i6300ESB timer
f0000000-f7ffffff : PCI Bus #01
  f0000000-f7ffffff : 0000:01:01.0
f8000000-f8ffffff : PCI Bus #01
  f8000000-f800ffff : 0000:01:01.0
  f8020000-f803ffff : 0000:01:01.0
fec00000-ffffffff : reserved
100000000-127ffffff : System RAM
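
To see how much physical RAM this map describes, and how much of it lies above 4 GB, the 'System RAM' lines can be summed. A Python sketch over the three ranges from the listing above (the parsing helper is illustrative and not part of any kdump tool):

```python
IOMEM_SYSTEM_RAM = """\
00000000-0009d3ff : System RAM
00100000-d7fb057f : System RAM
100000000-127ffffff : System RAM
"""

FOUR_GB = 1 << 32


def parse_ranges(text, name="System RAM"):
    """Yield (start, end) pairs, end inclusive, for lines with this name."""
    for line in text.splitlines():
        span, _, label = line.partition(" : ")
        if label.strip() != name:
            continue
        start, end = (int(part, 16) for part in span.strip().split("-"))
        yield start, end


ranges = list(parse_ranges(IOMEM_SYSTEM_RAM))
total = sum(end - start + 1 for start, end in ranges)
above_4g = sum(end - start + 1 for start, end in ranges if start >= FOUR_GB)

print("total System RAM :", total, "bytes")
print("mapped above 4 GB:", hex(above_4g), "bytes")
```

The amount mapped above 4 GB comes out as 0x28000000, the same value as the FileSiz of the last LOAD header in the readelf output.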

I think this is a case of a data structure field overflowing. Look at the
following program header, which is wrong:

LOAD           0xd3f50904 0xffffffff 0x00000000 0x28000000 0x28000000 RWE 0

Actually, its physical address field should have been 0x100000000. It looks like
this system has RAM mapped above 4G while we are trying to generate 32-bit ELF
headers, which is why the issue occurs.

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32

The output from /proc/iomem shows that a chunk of RAM has been mapped at an
address above 4G. Use ELF64 headers for such cases.

100000000-127ffffff : System RAM
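
The overflow can be reproduced in isolation. In ELF32 the p_paddr field of a program header is a 32-bit unsigned value, while in ELF64 it is 64 bits wide. A minimal Python sketch using ctypes to mimic the two field widths (this only illustrates the truncation; it is not the kernel or kexec-tools code):

```python
import ctypes

PHYS_START = 0x100000000  # start of the last System RAM range in /proc/iomem

# ctypes integer types do no overflow checking, so the value is silently
# reduced modulo 2**32, as when a wider value is stored in a 32-bit field.
paddr_elf32 = ctypes.c_uint32(PHYS_START).value
paddr_elf64 = ctypes.c_uint64(PHYS_START).value

print(hex(paddr_elf32))  # 0x0
print(hex(paddr_elf64))  # 0x100000000
```

The 32-bit result, 0x0, is the PhysAddr that readelf printed for the last LOAD header.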

Comment 1 Dave Jones 2006-10-11 23:47:19 UTC
*** Bug 210292 has been marked as a duplicate of this bug. ***

Comment 2 Neil Horman 2006-10-12 11:27:59 UTC
Given that /proc/vmcore is a file in /proc, I'm not sure that the size of the file is
going to be 100% reliable.  Did you try copying the file to disk, to see if the
saved version was the same size?  Also, I note that you aren't using the kdump
infrastructure to capture your core.  Can you reproduce this using the actual
kdump configuration file and initscripts?  This may be nothing more than a
cosmetic error in the display of the proc file size.

Comment 3 Dave Jones 2007-04-23 18:58:15 UTC
closing due to inactivity.