LTC Owner is: ankigarg.com
LTC Originator is: iranna.ankad.com

---Problem Description---
The dump file size is almost double the RAM size.

[root@x370 ~]# cat /proc/meminfo | grep Mem
MemTotal:      3760836 kB
MemFree:       3180944 kB

[root@x370 ~]# du -sh /var/crash/2006-09-22-22\:01/
7.3G    /var/crash/2006-09-22-22:01/

[root@x370 ~]# ls -l /var/crash/2006-09-22-22\:01/
total 7569664
-r-------- 1 root root 8065486848 Sep 22 22:09 vmcore

Contact Information = primary: iranna.ankad.com

---uname output---
2.6.17-1.2519.4.26.el5

Machine Type = x370

---Debugger---
A debugger is not configured

---Steps to Reproduce---
1. configured kdump from command line:
/sbin/kexec -p --command-line="ro root=LABEL=/ rhgb console=tty0 console=ttyS0,38400n1 irqpoll maxcpus=1" --initrd=/boot/initrd-2.6.17-1.2519.4.26.el5.img /boot/vmlinuz-2.6.17-1.2519.4.26.el5

2. echo 1 > /proc/sys/kernel/sysrq
   echo c > /proc/sysrq-trigger

Note:
1. Also noticed /proc/vmcore file size approx. 7.5GB.
2. In both cases, I could open the dump files using crash, execute some commands, and see valid information.
3. Tried both kexec-tools-1.101-54.el5.i386.rpm (default in RHEL5-B1-Refresh) & the latest kexec-tools-1.101-69.el5.i386.rpm

Tried kdump on an x206 machine, and this time I observed that the dump size (478MB) is smaller than the RAM size (2 GB) of the machine.

However, all these dumps can be opened using crash & show some information !!

[root@x206f ~]# grep Mem /proc/meminfo
MemTotal:      2007472 kB
MemFree:       1573736 kB

[root@x206f ~]# df -h     --> before taking dump
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             9.5G  5.8G  3.3G  64% /
/dev/sda1             130M   66M   58M  54% /boot
tmpfs                 981M     0  981M   0% /dev/shm
/dev/sda3             9.5G  482M  8.6G   6% /home

------------after taking dump, vmcore saved into /var/crash/ -----------------

[root@x206f ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             9.5G  6.2G  2.8G  70% /
/dev/sda1             130M   66M   58M  54% /boot
tmpfs                 981M     0  981M   0% /dev/shm
/dev/sda3             9.5G  482M  8.6G   6% /home

[root@x206f 2006-09-26-15:33]# ls -lh
total 478M
-r-------- 1 root root 2.0G Sep 26 15:34 vmcore

[root@x206f 2006-09-26-15:33]# ls -l
total 488480
-r-------- 1 root root 2079392416 Sep 26 15:34 vmcore

[root@x206f 2006-09-26-15:33]# du -sh vmcore
478M    vmcore

(In reply to comment #2)
> Tried kdump on x206 machine and this time I observed the dump size is smaller
> (478MB) than that of RAM size (2 GB)of the machine.
>
> However all these dumps can be opened using crash & find some information !!
>
> [root@x206f ~]# grep Mem /proc/meminfo
> MemTotal:      2007472 kB
> MemFree:       1573736 kB
>
> [root@x206f ~]# df -h     --> before taking dump
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda2             9.5G  5.8G  3.3G  64% /
> /dev/sda1             130M   66M   58M  54% /boot
> tmpfs                 981M     0  981M   0% /dev/shm
> /dev/sda3             9.5G  482M  8.6G   6% /home
>
> ------------after taking dump, vmcore saved into /var/crash/ -----------------
>
> [root@x206f ~]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda2             9.5G  6.2G  2.8G  70% /
> /dev/sda1             130M   66M   58M  54% /boot
> tmpfs                 981M     0  981M   0% /dev/shm
> /dev/sda3             9.5G  482M  8.6G   6% /home
>

I feel this is not a true indicator of any discrepancy in the vmcore file size since the 'ls' and 'du' command are indicating the right size.

> [root@x206f 2006-09-26-15:33]# ls -lh
> total 478M
  ^^^^^^^^^^ is the size in terms of number of blocks.
> -r-------- 1 root root 2.0G Sep 26 15:34 vmcore
                         ^^^^^^ the right size in bytes

> [root@x206f 2006-09-26-15:33]# ls -l
> total 488480
> -r-------- 1 root root 2079392416 Sep 26 15:34 vmcore

> [root@x206f 2006-09-26-15:33]# du -sh vmcore
> 478M    vmcore
  ^^^^^^^ This is again in number of blocks

You need to specify the -b option to 'du' to obtain the size in bytes. The following is what I got on the same machine:

[root@x206f 2006-09-26-16:58]# du -sbh vmcore
2.0G    vmcore

(In reply to comment #0)

Could you provide the output of the 'readelf -{h,l,S}' command for the vmcore file you get?

(In reply to comment #5)
> (In reply to comment #0)
>
> Could you provide the output of 'readelf -{h,l,S}' command for the vmcore file
> you get?

When I ran 'readelf' on the machine on which I took the dump, I got the following error:

# readelf -h vmcore
readelf: Error: Could not locate 'vmcore'.  System error message: Value too large for defined data type

But when I transferred the dump to another machine, the output of 'readelf' was:

# readelf -h vmcore
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         6
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

# readelf -l vmcore

Elf file type is CORE (Core file)
Entry point 0x0
There are 6 program headers, starting at offset 52

Program Headers:
  Type  Offset      VirtAddr    PhysAddr    FileSiz     MemSiz      Flg  Align
  NOTE  0x0000f4    0x00000000  0x00000000  0x00290     0x00290          0
  LOAD  0x000384    0xc0000000  0x00000000  0xa0000     0xa0000     RWE  0
  LOAD  0x0a0384    0xc0100000  0x00100000  0xf00000    0xf00000    RWE  0
  LOAD  0xfa0384    0xc5000000  0x05000000  0x33000000  0x33000000  RWE  0
  LOAD  0x33fa0384  0xffffffff  0x38000000  0x9ffb0580  0x9ffb0580  RWE  0
  LOAD  0xd3f50904  0xffffffff  0x00000000  0x28000000  0x28000000  RWE  0

I am working on this issue.

The details of the environment and the above dump are as follows:

------Meminfo--------------
# cat /proc/meminfo | head
MemTotal:      3434544 kB
MemFree:       2973852 kB
Buffers:         59764 kB
Cached:         336072 kB
SwapCached:          0 kB
Active:         111656 kB
Inactive:       308188 kB
HighTotal:     2621120 kB
HighFree:      2259064 kB
LowTotal:       813424 kB

------Size of vmcore-----
# ls -l vmcore
-r-------- 1 root root 4227139844 Oct  5 10:26 vmcore

------On running crash------
# crash /usr/lib/debug/lib/modules/2.6.17-1.2519.4.26.el5/vmlinux vmcore

crash 4.0-3.1
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005  Fujitsu Limited
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

      KERNEL: /usr/lib/debug/lib/modules/2.6.17-1.2519.4.26.el5/vmlinux
    DUMPFILE: vmcore
        CPUS: 4
        DATE: Thu Oct  5 10:25:01 2006
      UPTIME: 00:24:57
LOAD AVERAGE: 0.07, 0.03, 0.01
       TASKS: 120
    NODENAME: llm19.in.ibm.com
     RELEASE: 2.6.17-1.2519.4.26.el5
     VERSION: #1 SMP Mon Sep 11 17:12:50 EDT 2006
     MACHINE: i686  (3400 Mhz)
      MEMORY: 3.4 GB
       PANIC: "Oops: 0000 [#1]" (check log for details)
         PID: 2210
     COMMAND: "bash"
        TASK: f69e8000  [THREAD_INFO: f44f9000]
         CPU: 2
       STATE: TASK_RUNNING (SYSRQ)

(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #0)
> >
> > Could you provide the output of 'readelf -{h,l,S}' command for the vmcore file
> > you get?
>
> When I ran 'readelf' on the machine on which I took the dump, got the following
> error :
>
> # readelf -h vmcore
> readelf: Error: Could not locate 'vmcore'.  System error message: Value too
> large for defined data type
>
> But when tranferred the dump to another machine, the output of 'readelf' is :
>
> # readelf -h vmcore
> ELF Header:
>   Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
>   Class:                             ELF32
>   Data:                              2's complement, little endian
>   Version:                           1 (current)
>   OS/ABI:                            UNIX - System V
>   ABI Version:                       0
>   Type:                              CORE (Core file)
>   Machine:                           Intel 80386
>   Version:                           0x1
>   Entry point address:               0x0
>   Start of program headers:          52 (bytes into file)
>   Start of section headers:          0 (bytes into file)
>   Flags:                             0x0
>   Size of this header:               52 (bytes)
>   Size of program headers:           32 (bytes)
>   Number of program headers:         6
>   Size of section headers:           0 (bytes)
>   Number of section headers:         0
>   Section header string table index: 0
>
> # readelf -l vmcore
>
> Elf file type is CORE (Core file)
> Entry point 0x0
> There are 6 program headers, starting at offset 52
>
> Program Headers:
>   Type  Offset      VirtAddr    PhysAddr    FileSiz     MemSiz      Flg  Align
>   NOTE  0x0000f4    0x00000000  0x00000000  0x00290     0x00290          0
>   LOAD  0x000384    0xc0000000  0x00000000  0xa0000     0xa0000     RWE  0
>   LOAD  0x0a0384    0xc0100000  0x00100000  0xf00000    0xf00000    RWE  0
>   LOAD  0xfa0384    0xc5000000  0x05000000  0x33000000  0x33000000  RWE  0
>   LOAD  0x33fa0384  0xffffffff  0x38000000  0x9ffb0580  0x9ffb0580  RWE  0
>   LOAD  0xd3f50904  0xffffffff  0x00000000  0x28000000  0x28000000  RWE  0
>
> I am working at this issue.

-----Output of /proc/iomem
# cat /proc/iomem
00000000-0009d3ff : System RAM
0009d400-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c8fff : Video ROM
000c9000-000ca5ff : Adapter ROM
000f0000-000fffff : System ROM
00100000-d7fb057f : System RAM
  00100000-00387b8a : Kernel code
  00387b8b-004698c7 : Kernel data
  01000000-04ffffff : Crash kernel
d7fb0580-d7fcffff : ACPI Tables
d7fd0000-d7ffffff : reserved
d9000000-dcffffff : PCI Bus #04
  d9000000-daffffff : PCI Bus #06
    daffe000-daffefff : 0000:06:01.1
    dafff000-daffffff : 0000:06:01.0
  db000000-dcffffff : PCI Bus #05
    dcfe0000-dcfeffff : 0000:05:01.1
      dcfe0000-dcfeffff : tg3
    dcff0000-dcffffff : 0000:05:01.0
      dcff0000-dcffffff : tg3
dd000000-deffffff : PCI Bus #02
  defe0000-defeffff : 0000:02:01.0
  deff0000-deffffff : 0000:02:01.0
df000000-df0fffff : PCI Bus #04
  df000000-df0fffff : PCI Bus #06
    df000000-df01ffff : 0000:06:01.0
    df020000-df03ffff : 0000:06:01.1
df100000-df1fffff : PCI Bus #02
  df100000-df1fffff : 0000:02:01.0
df200000-df2003ff : 0000:00:1f.1
df200400-df20040f : 0000:00:1d.4
  df200400-df20040f : i6300ESB timer
f0000000-f7ffffff : PCI Bus #01
  f0000000-f7ffffff : 0000:01:01.0
f8000000-f8ffffff : PCI Bus #01
  f8000000-f800ffff : 0000:01:01.0
  f8020000-f803ffff : 0000:01:01.0
fec00000-ffffffff : reserved
100000000-127ffffff : System RAM

I think this is a data structure overflow case. Look at the following program header, which is wrong.

  LOAD  0xd3f50904  0xffffffff  0x00000000  0x28000000  0x28000000  RWE  0

Its physical address field should have been 0x100000000. It looks like this system has RAM mapped above 4G while we are generating 32-bit ELF headers, which is why the issue occurs.

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32

Output from /proc/iomem, showing that a chunk of RAM is mapped at an address above 4G:

100000000-127ffffff : System RAM

Use ELF64 headers for such cases.
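
The overflow described above can be reproduced with plain arithmetic. The following is a minimal sketch, not the kexec-tools code: it takes the three System RAM ranges from the /proc/iomem output in this report and shows what happens when a start address is squeezed into a 32-bit field. The helper names (truncate_to_32_bits, needs_elf64) are hypothetical and exist only for illustration.

```python
ELF32_MAX = 0xFFFFFFFF  # largest value a 32-bit ELF address field can hold

# System RAM ranges (inclusive), copied from the /proc/iomem output in this report.
SYSTEM_RAM = [
    (0x00000000, 0x0009D3FF),
    (0x00100000, 0xD7FB057F),
    (0x100000000, 0x127FFFFFF),
]


def truncate_to_32_bits(value):
    """Keep only the low 32 bits, as storing into a 32-bit field would."""
    return value & ELF32_MAX


def needs_elf64(ranges):
    """Return True if any range ends above the 32-bit addressable limit."""
    return any(end > ELF32_MAX for _start, end in ranges)


for start, end in SYSTEM_RAM:
    size = end - start + 1
    print(
        f"range {start:#x}-{end:#x}: size {size:#x}, "
        f"32-bit physical address {truncate_to_32_bits(start):#010x}"
    )

print("ELF64 headers needed:", needs_elf64(SYSTEM_RAM))
```

For the last range, the size works out to 0x28000000 and the truncated start address to 0x00000000, which matches the FileSiz and PhysAddr of the suspicious LOAD header quoted above.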
*** Bug 210292 has been marked as a duplicate of this bug. ***
Given that /proc/vmcore is a file in /proc, I'm not sure that its reported size is going to be 100% reliable. Did you try copying the file to disk, to see if the saved version was the same size? Also, I note that you aren't using the kdump infrastructure to capture your core. Can you reproduce this using the actual kdump configuration file and initscripts? This may be nothing more than a cosmetic error in the displayed size of the proc file.
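
One way to cross-check a saved vmcore's size is to compare it with what its own program headers imply. The sketch below is an illustration under stated assumptions: the (offset, file size) pairs are copied from the 'readelf -l' output earlier in this report, and the helper names (expected_core_size, apparent_and_allocated) are hypothetical. It also assumes st_blocks is counted in 512-byte units, which holds on Linux.

```python
import os

# (file offset, file size) for each program header, copied from the
# 'readelf -l vmcore' output quoted earlier in this report.
PROGRAM_HEADERS = [
    (0x000000F4, 0x00000290),  # NOTE
    (0x00000384, 0x000A0000),  # LOAD
    (0x000A0384, 0x00F00000),  # LOAD
    (0x00FA0384, 0x33000000),  # LOAD
    (0x33FA0384, 0x9FFB0580),  # LOAD
    (0xD3F50904, 0x28000000),  # LOAD
]


def expected_core_size(headers):
    """File size implied by the headers: the end of the furthest segment."""
    return max(offset + size for offset, size in headers)


def apparent_and_allocated(path):
    """Return (apparent size in bytes, bytes actually allocated on disk).

    Assumes st_blocks is in 512-byte units (true on Linux). A sparse file
    has an allocated size smaller than its apparent size, which is the
    difference between 'ls -l' and plain 'du' seen in this report.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


print("size implied by program headers:", expected_core_size(PROGRAM_HEADERS))
```

For the headers above, the implied size is 4227139844 bytes, the same value 'ls -l vmcore' reported for that dump. Calling apparent_and_allocated("vmcore") on the saved copy would show whether it is sparse.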
Closing due to inactivity.