Bug 147436

Summary: /proc memory read for thread does not match PTRACE
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: ia64
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Reporter: Jeff Johnston <jjohnstn>
Assignee: Nobuhiro Tachino <ntachino>
CC: aimamura, bennet, davej, dff, ezannoni, halligan, jbaron, nagahama, ntachino, roland, tao, tburke, tuchida
Doc Type: Bug Fix
Last Closed: 2006-11-21 16:44:01 UTC
Bug Blocks: 116894, 145309, 164220
Attachments:
Description    Flags
test case      none
executable     none

Description Jeff Johnston 2005-02-07 22:41:43 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4)
Gecko/20030624 Netscape/7.1

Description of problem:
Customer has reported error with gdb: Bugzilla 145309

The customer has a threaded application which they debug and then dump
with gcore.  The gdb gcore operation reads the process memory through
/proc to write the core file.  On ia64, the parameter registers
(starting at r32) are stored in the backing store.  When the core file
is written, the backing store contents for the given thread are wrong.
This has been pinpointed to the read of the /proc memory file.  In one
scenario, gdb was opening /proc/xxxx/mem where xxxx was the lwp number.
This open succeeded, but the data read was incorrect.  I do not know
whether the open and read should have succeeded, since the lwp is not
directly visible in /proc.  By experimenting with the gdb code, I
confirmed that the data returned is the same as when opening
/proc/yyyy/mem or /proc/yyyy/task/xxxx/mem, where yyyy is the main
process pid and xxxx is the lwp of the thread in question.
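
As an illustration of the three paths compared above, here is a minimal
sketch in Python.  It does not reproduce the ia64 problem; it only shows
that, on a Linux system where a process may read its own /proc mem
files, the three paths expose the same shared address space.  The helper
names (read_mem, read_via_all_paths) are made up for this example and
are not taken from gdb.

```python
import ctypes
import os
import threading


def read_mem(path, addr, size):
    """Read `size` bytes at virtual address `addr` from a /proc mem file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, size, addr)
    finally:
        os.close(fd)


def read_via_all_paths(size=16):
    """Read one known buffer through the three /proc paths from the report.

    Returns (expected_bytes, {path: bytes_read}).
    """
    # A buffer with known contents living in this process's memory.
    buf = (ctypes.c_ubyte * size)(*range(size))
    addr = ctypes.addressof(buf)
    pid = os.getpid()  # "yyyy" in the report: the main process pid

    tid_box = []
    ready = threading.Event()
    done = threading.Event()

    def worker():
        # "xxxx" in the report: the lwp (kernel thread id) of this thread.
        tid_box.append(threading.get_native_id())
        ready.set()
        done.wait()  # stay alive while the main thread reads

    t = threading.Thread(target=worker)
    t.start()
    ready.wait()
    tid = tid_box[0]
    try:
        paths = [
            f"/proc/{tid}/mem",
            f"/proc/{pid}/mem",
            f"/proc/{pid}/task/{tid}/mem",
        ]
        return bytes(buf), {p: read_mem(p, addr, size) for p in paths}
    finally:
        done.set()
        t.join()


if __name__ == "__main__":
    expected, results = read_via_all_paths()
    for path, data in results.items():
        print(path, "matches" if data == expected else "DIFFERS")
```

This assumes Python 3.8 or later (for threading.get_native_id) and a
/proc mount that permits the reads.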

The data read via the /proc method does not match the data returned
via PTRACE PT_READ_I.  The data returned by PTRACE is correct.  However,
using PTRACE PT_READ_I for gcore is unacceptable because of the amount
of data that has to be transferred.
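
To give a rough sense of the size concern: on Linux, PT_READ_I
(PTRACE_PEEKTEXT) returns a single word per request, so dumping a region
takes one request per word.  The sketch below only does that arithmetic.
The function name is made up for this example, and the 8-byte word size
is an assumption that fits a 64-bit target such as ia64.

```python
def peek_calls_needed(nbytes, word_size=8):
    """Number of one-word ptrace peek requests needed to cover nbytes."""
    if nbytes < 0 or word_size <= 0:
        raise ValueError("nbytes must be >= 0 and word_size must be > 0")
    # Ceiling division: a partial trailing word still costs one request.
    return -(-nbytes // word_size)


if __name__ == "__main__":
    # 16384 bytes is the section size that appears in the gcore warning.
    print(peek_calls_needed(16384))    # 2048
    print(peek_calls_needed(1 << 20))  # 131072 for 1 MiB
```

A read from a /proc mem file, by contrast, can return a whole section in
one call.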

Version-Release number of selected component (if applicable):
kernel-2.6.8-1.528.2.10

How reproducible:
Always

Steps to Reproduce:
1. gdb -nw a.out
2. b 49
3. run
4. thread 2  (switches to thread 2)
5. info register bof  (beginning of current frame)
6. x/g $bof
7. gcore mycore
8. quit gdb
9. rerun gdb -nw a.out mycore
10. thread 2
11. info register bof
12. x/g $bof


Note the value at $bof.  It differs depending on whether gdb reads it
from the live process or from the core file.

Actual Results:
-bash-3.00$ gdb -nw a.out
GNU gdb Red Hat Linux (6.3.0.0-0.14rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "ia64-redhat-linux-gnu"...Using host
libthread_db library "/lib/tls/libthread_db.so.1".

(gdb) b 49
Breakpoint 1 at 0x4000000000000c21: file thread.c, line 49.
(gdb) run
Starting program: /home/jjohnstn/sigsegv-problem/a.out 
[Thread debugging using libthread_db enabled]
[New Thread 2305843009213969824 (LWP 17811)]
TEST START
[New Thread 2305843009227241664 (LWP 17814)]
THREAD-A START
[New Thread 2305843009237727424 (LWP 17815)]
THREAD-B START
[Switching to Thread 2305843009237727424 (LWP 17815)]

Breakpoint 1, threadB (tname=0x4000000000000f28) at thread.c:49
49	     sleep(10);
(gdb) thread 2
[Switching to thread 2 (Thread 2305843009227241664 (LWP 17814))]#0  0xa000000000010641 in ?? ()
(gdb) info register $bof
bof            0x20000000002ec148	2305843009216758088
(gdb) x/g $bof
0x20000000002ec148:	0x2000000000ceadf0
(gdb) gcore mycore
warning: Memory read failed for corefile section, 16384 bytes at
0x0000000000000000

Saved corefile mycore
(gdb) quit
The program is running.  Exit anyway? (y or n) y
-bash-3.00$ gdb -nw a.out mycore
GNU gdb Red Hat Linux (6.3.0.0-0.14rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "ia64-redhat-linux-gnu"...Using host
libthread_db library "/lib/tls/libthread_db.so.1".

Core was generated by `/home/jjohnstn/sigsegv-problem/a.out'.
Program terminated with signal 5, Trace/breakpoint trap.

warning: svr4_current_sos: Can't read pathname for load map:
Input/output error

Reading symbols from /lib/tls/libpthread.so.0...done.
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/tls/libc.so.6.1...done.
Loaded symbols for /lib/tls/libc.so.6.1
Reading symbols from /lib/ld-linux-ia64.so.2...done.
Loaded symbols for /lib/ld-linux-ia64.so.2
#0  threadB (tname=0x4000000000000f28) at thread.c:49
49	     sleep(10);
(gdb) thread 2
[Switching to thread 2 (process 17820)]#0  0xa000000000010641 in ?? ()
(gdb) info register bof
bof            0x20000000002ec148	2305843009216758088
(gdb) x/g $bof
0x20000000002ec148:	0x4000000000000f72

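Note that the value stored at $bof in the core file (0x4000000000000f72)
differs from the value in the running program (0x2000000000ceadf0),
although $bof itself is the same.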

Expected Results:  The data at $bof should be consistent between the
live process and the corefile.  The core file is wrong because
linux_proc_xfer_memory read incorrect data from the location at $bof.
 The correct live value was read via PTRACE PT_READ_I.

Comment 1 Jeff Johnston 2005-02-07 22:42:57 UTC
Created attachment 110760 [details]
test case

Comment 2 Jeff Johnston 2005-02-07 22:44:16 UTC
Created attachment 110761 [details]
executable

Comment 4 linux-psi@ml.soft.fujitsu.com 2005-02-14 04:48:48 UTC
Add Fujitsu Japan Support team

Comment 8 L3support 2005-02-21 09:14:45 UTC
Add L3-support

Comment 12 JoAnne K. Halligan 2005-03-02 21:50:47 UTC
This issue did not make U1 due to other higher priority issues. If the Fujitsu
team has a proposed resolution, they are welcome to make that recommendation or
submit any proposed fix before the U2 cut off.

Comment 28 Jason Baron 2005-04-06 20:52:34 UTC
Where can I get the gdb version used in comment #1? The current RHEL 4 gdb is
segfaulting in step 10.

Comment 64 Daniel Riek 2006-11-21 16:31:24 UTC
PM NAK as the underlying partner request was closed and is considered fixed.

Comment 65 RHEL Program Management 2006-11-21 16:44:02 UTC
Product Management has reviewed and declined this request.  You may appeal this
decision by reopening this request.