Bug 185591 - gdb corrupt stack when attaching to process compiled with -m32 on x86_64
Summary: gdb corrupt stack when attaching to process compiled with -m32 on x86_64
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora Legacy
Classification: Retired
Component: gdb
Version: fc3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Fedora Legacy Bugs
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-03-15 23:52 UTC by tom huang
Modified: 2007-04-18 17:39 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-17 19:28:17 UTC
Embargoed:


Attachments
Test on another x86_64 (1.67 KB, text/plain)
2006-03-16 22:24 UTC, David Eisenstein
execution log with kernel-2.6.9-22.0.2.EL and gdb-6.3.0.0-0.30.1 (1.86 KB, text/plain)
2006-03-17 21:27 UTC, tom huang

Description tom huang 2006-03-15 23:52:23 UTC
Description of problem:
gdb shows a corrupt backtrace when attaching to a running process compiled
with -m32 on a Dell PE2850 (x86_64). The backtrace is fine when -m32 is not
supplied.

Version-Release number of selected component (if applicable):
kernel-2.6.12-1.1381_FC3smp
gdb-6.1post-1.20040607.43.0.1
glibc-2.3.6-0.fc3.1
gcc-3.4.4-2.fc3

How reproducible:
With the following test program, test.c:

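#include <stdio.h>
#include <unistd.h>     /* sleep() */

int main(void)
{
        int     i=0;

        /* Print a counter once per second so gdb can attach while the
           process is blocked in sleep(). */
        while(++i)
        {
                printf("%d\n", i);
                sleep(1);
        }
        return 0;
}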

Steps to Reproduce:
1. gcc -g -m32 test.c
2. ./a.out
3. Attach gdb to the running a.out
  
Actual results:
snap@tp105:/usr/snap 220 % gdb - 31414
GNU gdb Red Hat Linux (6.1post-1.20040607.43.0.1rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...-: No such file or 
directory.

Attaching to process 31414
Reading symbols from /usr/snap/tmp/test/a.out...done.
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0xffffe410 in ?? ()
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0xffffc208 in ?? ()
#2  0xf7fceff4 in ?? () from /lib/tls/libc.so.6
#3  0xffffc064 in ?? ()
#4  0xf7f32590 in __nanosleep_nocancel () from /lib/tls/libc.so.6
#5  0xf7f323bc in sleep () from /lib/tls/libc.so.6
Previous frame inner to this frame (corrupt stack?)
(gdb)

Expected results:

GNU gdb Red Hat Linux (6.1post-1.20040607.43.0.1rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...-: No such file or 
directory.

Attaching to process 31456
Reading symbols from /usr/snap/tmp/test/a.out...done.
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
Reading symbols from /lib64/tls/libc.so.6...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00002aaaaac4fd32 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x00002aaaaac4fd32 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
#1  0x00002aaaaac4fbd0 in sleep () from /lib64/tls/libc.so.6
#2  0x0000000000400534 in main () at test.c:10
(gdb)

Additional info:

The expected result is produced by attaching gdb to an a.out that is compiled
without -m32.
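
Since the two runs differ only in whether -m32 was given, it can help to confirm which a.out is the 32-bit build before attaching. The following is a minimal sketch, not part of the original report: it inspects the ELF identification bytes at the start of a file. The function name elf_class_bits is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset and values of the class byte in the ELF identification (e_ident). */
enum { ELF_IDENT_CLASS = 4, ELF_CLASS_32 = 1, ELF_CLASS_64 = 2 };

/*
 * Return 32 or 64 for the ELF class encoded in the first bytes of a file,
 * or 0 if the buffer is too short, is not ELF, or has an unknown class.
 * (Hypothetical helper, for illustration only.)
 */
int elf_class_bits(const unsigned char *ident, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(ident != NULL);
    if (len <= (size_t)ELF_IDENT_CLASS ||
        memcmp(ident, magic, sizeof magic) != 0)
        return 0;
    if (ident[ELF_IDENT_CLASS] == ELF_CLASS_32)
        return 32;
    if (ident[ELF_IDENT_CLASS] == ELF_CLASS_64)
        return 64;
    return 0;
}
```

To use it, read the first five bytes of a.out (for example with fread) and pass them in. A result of 32 would correspond to the -m32 build that shows the broken backtrace.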

Comment 1 Pekka Savola 2006-03-16 05:40:37 UTC
Does the problem exist in the gdb shipped in later FC releases as well?  Could
you test the latest rawhide and/or the upcoming FC5?



Comment 2 tom huang 2006-03-16 17:47:17 UTC
(In reply to comment #1)
> Does the problem exist in the gdb shipped in later FC releases as well?  Could
> you test the latest rawhide and/or the upcoming FC5?

I have tested with gdb-6.3.0.0-1.84.x86_64. The stack is still reported as
corrupt. However, there is now a warning about the VSYSCALL page, and I have
no idea how to address it. I have also tested with the non-SMP version of
kernel-2.6.12-1.1381_FC3; the result is the same. I could not test with the
gdb shipped with FC5, as it requires glibc-2.4. The following is the output
from gdb-6.3.0.0-1.84.x86_64:

snap@tp105:/usr/snap/tmp/test 354 % gdb - 4604
GNU gdb Red Hat Linux (6.3.0.0-1.84rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...-: No such file or 
directory.

Attaching to process 4604

warning: The current VSYSCALL page code requires an existing execuitable.
Use "add-symbol-file-from-memory" to load the VSYSCALL page by hand
Reading symbols from /usr/snap/tmp/test/a.out...done.
Using host libthread_db library "/lib64/tls/libthread_db.so.1".
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0xffffe410 in ?? ()
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0xffffcee8 in ?? ()
#2  0xf7fceff4 in ?? () from /lib/tls/libc.so.6
#3  0xffffcd44 in ?? ()
#4  0xf7f32590 in __nanosleep_nocancel () from /lib/tls/libc.so.6
#5  0xf7f323bc in sleep () from /lib/tls/libc.so.6
Previous frame inner to this frame (corrupt stack?)
(gdb) help add-symbol-file-from-memory
Load the symbols out of memory from a dynamically loaded object file.
Give an expression for the address of the file's shared object file header.
(gdb)
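
The help text above asks for the address of the object's header. On Linux the kernel passes the address of the vsyscall page's ELF header to each process in its auxiliary vector, under the type AT_SYSINFO_EHDR (33 in <elf.h>), and the vector can be read from /proc/<pid>/auxv. The following is a rough sketch, not something tried in this report, of locating that entry. It assumes the file holds 32-bit little-endian type/value pairs for a -m32 process, and the function name find_sysinfo_ehdr32 is hypothetical. Whether loading the page by hand fixes the backtrace is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AT_SYSINFO_EHDR from <elf.h>: address of the vsyscall/vDSO ELF header. */
#define AUXV_SYSINFO_EHDR 33u
/* AT_NULL terminates the auxiliary vector. */
#define AUXV_NULL 0u

/* Decode one little-endian 32-bit word (x86 byte order). */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Scan the auxiliary vector of a 32-bit process (pairs of 32-bit
 * type/value words) for AT_SYSINFO_EHDR.  Returns 1 and stores the
 * address on success, 0 if the entry is absent.
 * (Hypothetical helper, for illustration only.)
 */
int find_sysinfo_ehdr32(const unsigned char *auxv, size_t len, uint32_t *addr)
{
    size_t off;

    assert(auxv != NULL && addr != NULL);
    for (off = 0; off + 8 <= len; off += 8) {
        uint32_t type = read_le32(auxv + off);
        uint32_t value = read_le32(auxv + off + 4);

        if (type == AUXV_NULL)
            break;
        if (type == AUXV_SYSINFO_EHDR) {
            *addr = value;
            return 1;
        }
    }
    return 0;
}
```

The buffer would come from reading /proc/<pid>/auxv of the attached process. The resulting address is what one would try as the argument to add-symbol-file-from-memory.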

Comment 3 David Eisenstein 2006-03-16 22:24:47 UTC
Created attachment 126256 [details]
Test on another x86_64

I cannot reproduce the error you see.  I tried your test program on an x86_64
system in a chrooted FC3 environment.  See the attachment for the results.

A few questions:
  * Are your system's packages updated to the most recent versions?
  * Are you running in any kind of SElinux environment?  If so, have you
    tried turning it off?
  * What is the setting of your kernel vdso variable, that is,
    /proc/sys/kernel/vdso?  If it's not 0, I wonder whether running
      # echo 0 >/proc/sys/kernel/vdso
    would make a difference.
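
The vdso setting asked about in the last question can also be checked from a program, which makes it possible to tell "the file does not exist on this kernel" apart from "the file exists and holds a nonzero value". This is a minimal sketch, and the function name read_proc_int is hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single integer from a /proc/sys style file.
 * Returns 1 and stores the value on success, 0 if the file is missing,
 * unreadable, or does not start with an integer.
 * (Hypothetical helper, for illustration only.)
 */
int read_proc_int(const char *path, int *value)
{
    FILE *fp;
    int ok;

    assert(path != NULL && value != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;
    ok = (fscanf(fp, "%d", value) == 1);
    fclose(fp);
    return ok;
}
```

Calling read_proc_int("/proc/sys/kernel/vdso", &v) and getting 0 back would mean that the running kernel does not expose the setting at all.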

This might be both a gdb and a kernel problem.  There are at least a couple
of similar Bugzilla bugs that were worked on for RHEL 4:
   * Bug 146087 - Can't debug 32 bit apps running on x86_64
   * Bug 146803 - 32bit gdb doesn't work on x84_64

Bear in mind that the kernel in question for Bug 146803 is 2.6.9-something...

Comment 4 tom huang 2006-03-16 23:56:49 UTC
(In reply to comment #3)
Your result doesn't look right either, although it is not identical to what I
see. I would expect bt to show function names with line numbers, such as
(gdb) bt
#0  0x00002aaaaac4fd32 in __nanosleep_nocancel () from /lib64/tls/libc.so.6
#1  0x00002aaaaac4fbd0 in sleep () from /lib64/tls/libc.so.6
#2  0x0000000000400534 in main () at test.c:10
(gdb)

Regarding your questions:
1. I have updated all possible components from the most recent FC3 updates,
such as kernel, gdb, gcc, glibc. What else am I missing?
2. SElinux is off.
3. There is no /proc/sys/kernel/vdso on my computer. I cannot create one even 
as root - the file system is not writable. How may I get it defined?

A question:
This bug looks similar to Bug 146087 and Bug 166083, which seem to have been
fixed. Why does it still show up in the latest FC3 release?


Comment 5 David Eisenstein 2006-03-17 19:28:17 UTC
Regarding your point 3) in comment 4:

My bad, I think only the FC2 kernels have /proc/sys/kernel/vdso -- maybe the
original FC3 kernels (2.6.9) had it too.  But the latest FC3 kernel as you 
know is 2.6.12, so the kernel developers must have done away with it.

It is possible that the kernel portion of this problem was fixed in FC3's
upgrade to 2.6.12, but I rather doubt it.  The excerpt from my log file was
actually produced on an RHEL 4 (CentOS 4, to be precise) kernel
(kernel-2.6.9-22.0.2.EL), the latest RHEL 4 kernel, which includes fixes for
this very problem.  The difference may be the kernel, because I'm not getting
the error you get when I run it, and as far as I can tell, we're both using
the latest FC3 gdb (x86_64) binaries.

The GDB portion?  Well, the bug was reported at the time RHEL 4 was a release
candidate (Feb. '05), and it looks like it wasn't officially fixed until a
few months later, in May.  (http://rhn.redhat.com/errata/RHBA-2005-187.html).

> Why does it still show up in the latest FC3 release?

I don't know.  Perhaps because no bug was filed for this problem against
FC3 at the time.

The Fedora Legacy Project focuses on security problems, so we really don't
have a lot of resources to invest in fixing this for FC3.  We're up to our
ears in fixing security-related problems.

So, a few things I might suggest.  (1) You may want to try RHEL 4's gdb,
building that version from source if you are willing, to see whether it
ameliorates the problem; it should at any rate have the gdb fixes.  It may
also be available in binary form from an alternative open source supplier,
but compiling from source on the system you are currently running will
guarantee compatibility with your FC3 libraries and other applications.

Or (2), see if you can find out whether there is a patch available for the
2.6.12 kernel for this issue.  The kernel patch mentioned in Bug 146803
is for the 2.6.9 kernel (the RHEL 4 kernel), and I don't believe it
will apply to the 2.6.12 kernel.  If you do find such a patch, please
let us know in this or a new bug report (you can reopen this one if you
want), so that we can put it in the queue for inclusion in FC3's kernel the
next time we update it for security issues.

Or (3), you could try building and installing RHEL 4's
kernel-2.6.9-22.0.2.EL on FC3, and see if that helps.

Hope this helps.  Sorry we can't do more.  I am closing this bug CANTFIX
for now, but you are welcome to reopen it if you find something that can
help us solve this problem that we haven't found.  Thanks.

Comment 6 tom huang 2006-03-17 21:27:33 UTC
Created attachment 126290 [details]
execution log with kernel-2.6.9-22.0.2.EL and gdb-6.3.0.0-0.30.1

(In reply to comment #5)
I got the kernel and gdb as you suggested.
kernel-2.6.9-22.0.2.EL
gdb-6.3.0.0-0.30.1

However, the problem does not go away; see the attachment. It has the same
symptom as Bug 146087, i.e., you get the correct backtrace after a few stepi
commands. I understand you are not fixing the FC3 kernel. But PLEASE help find
a working kernel/gdb combination. Thanks!!!

Comment 7 David Eisenstein 2006-03-18 04:19:31 UTC
I'd get the latest RHEL 4 gdb:

http://mirrors.kernel.org/redhat/redhat/linux/updates/enterprise/4AS/en/os/SRPMS/gdb-6.3.0.0-1.96.src.rpm

Build it and give it a try.

Comment 8 tom huang 2006-03-25 15:23:46 UTC
(In reply to comment #7)
Thanks for the information. However, no luck! I have also tried CentOS 4.3,
and the bug is there as well. It seems to me that this bug has not been
fixed properly. I am going to file a bug report with CentOS. In the meantime,
I would appreciate it if you could let me know if you hear any news in this
regard. Thanks.

