Bug 222701 - backtrace failed with "Cannot access memory" error when debugging large core file.
Keywords:
Status: CLOSED DUPLICATE of bug 224243
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2007-01-15 20:13 UTC by Ryuji Hironaga
Modified: 2007-11-17 01:14 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-02-05 22:25:57 UTC
Target Upstream Version:
Embargoed:



Description Ryuji Hironaga 2007-01-15 20:13:07 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)

Description of problem:
My program failed a malloc() call and then called abort(), which produced a core file. When I tried to get a backtrace from that core file with gdb, I hit this problem.

I can reproduce the problem with a simple program.

I found a similar problem described in an erratum, but this gdb already includes that fix.
http://www.jp.redhat.com/support/errata/RHBA/RHBA-2006-0429J.html

---
$ gdb -v
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
$ rpm -qa | grep gdb
gdbm-1.8.0-24
gdbm-devel-1.8.0-24
gdb-6.3.0.0-1.132.EL4


Version-Release number of selected component (if applicable):
gdb-6.3.0.0-1.132.EL4

How reproducible:
Always


Steps to Reproduce:
1. Compile the following program
$ cat t.c
#include <stdlib.h>

int
main()
{
        for (;;) {
                if (malloc(4096) == NULL)
                        abort();
        }
}
$ cc -g -o t t.c

2. Run the program
$ ./t 
Aborted (core dumped)
$ ls -l
total 2096180
-rw-------  1 xxxx xxxx 2148257792 Jan 15 14:44 core.1168890218.5018
-rwxrwxr-x  1 xxxx xxxx       5926 Jan 15 14:43 t
-rw-rw-r--  1 xxxx xxxx        140 Jan 15 14:36 t.c

3. Try to show backtrace


Actual Results:
$ gdb -q ./t core.1168890218.5018
Using host libthread_db library "/lib/tls/libthread_db.so.1".

warning: exec file is newer than core file.
Core was generated by `./t'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x0084e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x0084e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
Cannot access memory at address 0xbff057a4

The address seems to be the value of esp.

(gdb) info reg
eax            0x0      0
ecx            0x139a   5018
edx            0x6      6
ebx            0x139a   5018
esp            0xbff057a4       0xbff057a4
ebp            0xbff057b8       0xbff057b8
esi            0x139a   5018
edi            0x992ff4 10039284
eip            0x84e7a2 0x84e7a2
eflags         0x246    582
cs             0x73     115
ss             0x7b     123
ds             0xc02d007b       -1070792581
es             0x7b     123
fs             0x0      0
gs             0x33     51


Expected Results:
gdb shows the correct backtrace.

Additional info:
1. I tried the latest gdb (gdb-6.6.tar.gz) but got the same result.

2. When I run the program under gdb instead of loading the core file, I can display the backtrace.

$ gdb -q ./t
Using host libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) run
Starting program: /home/hydragui/cc/t

Program received signal SIGABRT, Aborted.
0x0084e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x0084e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x008937a5 in raise () from /lib/tls/libc.so.6
#2  0x00895209 in abort () from /lib/tls/libc.so.6
#3  0x080483d5 in main () at t.c:8
(gdb)

3. strace shows that gdb tries to read beyond the end of the core file.

$ strace gdb -q ./t core.1168890218.5018
... snip ...
open("/home/hydragui/cc/core.1168890218.5018", O_RDONLY|O_LARGEFILE) = 5
fstat64(5, {st_mode=S_IFREG|0600, st_size=2148257792, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d34000
_llseek(5, 8392704, [8392704], SEEK_SET) = 0
read(5, "\370D\206\0\3\0\0\0PF\206\0\0\0\0\0PF\206\0\0\0\0\0\0\0"..., 468) = 468
_llseek(5, 3215839232, [3215839232], SEEK_SET) = 0
read(5, "", 4096)                       = 0
_llseek(5, 1956, [3215841188], SEEK_CUR) = 0
read(5, "", 4096)                       = 0
write(2, "Cannot access memory at address "..., 42Cannot access memory at address 0xbff057a4) = 42

Comment 1 Jan Kratochvil 2007-01-23 16:07:48 UTC
This is a kernel problem: the core file has been limited to 2GB
while the i386 process address space is 3GB.
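
The numbers in the strace output from the description can be checked against this explanation. The following is a minimal sketch, not part of the original report; the helper name bytes_past_eof and the macro names are hypothetical, and the constants are copied from the fstat64() and _llseek() lines above.

```c
#include <stdint.h>

/* Values copied from the strace output in the description. */
#define CORE_FILE_SIZE  2148257792LL  /* st_size reported by fstat64() */
#define GDB_SEEK_OFFSET 3215839232LL  /* _llseek() offset before the failing read */
#define TWO_GIB         (1LL << 31)   /* 2147483648 bytes */

/*
 * Hypothetical helper: how many bytes seek_offset lies past the end of
 * a file of file_size bytes, or 0 if the offset is inside the file.
 */
static int64_t bytes_past_eof(int64_t file_size, int64_t seek_offset)
{
    return seek_offset > file_size ? seek_offset - file_size : 0;
}
```

With these values, the core file is only 774144 bytes larger than 2GB, and bytes_past_eof(CORE_FILE_SIZE, GDB_SEEK_OFFSET) gives 1067581440, so the offset gdb seeks to is roughly 1GB past the end of the file. That matches the read() calls returning 0 bytes.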


Comment 4 Ryuji Hironaga 2007-01-24 23:06:43 UTC
The problem seems to be resolved by the following change to the kernel.

--- linux-2.6.9/fs/binfmt_elf.c.orig	2007-01-24 12:10:19.000000000 -0500
+++ linux-2.6.9/fs/binfmt_elf.c	2007-01-24 12:13:09.000000000 -0500
@@ -1146,7 +1146,7 @@
 	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
 }
 
-static int dump_seek(struct file *file, off_t off)
+static int dump_seek(struct file *file, loff_t off)
 {
 	if (file->f_op->llseek) {
 		if (file->f_op->llseek(file, off, 0) != off)


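When the core file size exceeds 2GB, dump_seek() always returns 0, even if
llseek succeeds. off_t is a 32-bit signed type on i386, so the off argument
cannot hold an offset of 2GB or more.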
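
The effect of passing a large offset through a 32-bit signed parameter can be illustrated in user space. This is only a sketch under stated assumptions, not kernel code: the function names narrow_to_off32 and fits_in_off32 are hypothetical, and the model assumes two's-complement wraparound, which is how a 32-bit off_t behaves on i386.

```c
#include <stdint.h>

/*
 * Hypothetical model: the value a 32-bit signed offset would hold after
 * a 64-bit offset is stored in it (two's-complement wraparound assumed).
 */
static int64_t narrow_to_off32(int64_t offset)
{
    uint32_t low = (uint32_t)offset;        /* keep only the low 32 bits */

    if (low >= 0x80000000u)                 /* sign bit set: value is negative */
        return (int64_t)low - 4294967296LL;
    return (int64_t)low;
}

/* Returns 1 if the offset survives the narrowing unchanged, 0 otherwise. */
static int fits_in_off32(int64_t offset)
{
    return narrow_to_off32(offset) == offset;
}
```

For example, narrow_to_off32(2148257792LL), using the size of the core file from the description, yields -2146709504, so fits_in_off32() returns 0 for it. A comparison of the form "result != off" can therefore never match a real file position above 2GB when off is a 32-bit type, which is why widening the parameter to loff_t helps.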


Comment 5 Linda Wang 2007-02-05 22:25:57 UTC

*** This bug has been marked as a duplicate of 224243 ***

