Bug 231090 - LSPP: getattr causes python Segfault
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: python
Version: 5.0
Hardware: ppc64
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: ---
Assignee: Miloslav Trmač
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks: 234654
Reported: 2007-03-06 01:45 UTC by Kylene J Hall
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-04-11 00:33:21 UTC
Target Upstream Version:
Embargoed:


Attachments
Test case (2.45 KB, application/x-tar)
2007-03-06 01:45 UTC, Kylene J Hall
partial strace output. (22.41 KB, text/plain)
2007-03-06 17:24 UTC, Kylene J Hall
strace output on s390x (47.63 KB, text/plain)
2007-04-05 17:10 UTC, Klaus Kiwi (Old account no longer used)
v2 of the testcase (2.57 KB, application/gzip)
2007-04-05 17:23 UTC, Klaus Kiwi (Old account no longer used)
Fix semctl usage (1.10 KB, patch)
2007-04-11 00:28 UTC, Miloslav Trmač
Don't reference uninitialized variables in the Python code (797 bytes, patch)
2007-04-11 00:31 UTC, Miloslav Trmač

Description Kylene J Hall 2007-03-06 01:45:50 UTC
Description of problem:
I have no idea what component to open this bug against. We have a test case
that uses Python to call C functions that call getattr and setattr on a
semaphore. The setattr works perfectly and the getattr chokes. I am including
a test case below that doesn't use our framework but approximates our
environment. The functions that call getattr and setattr are identical,
except that in the getattr case the parts that make the function segfault even
sooner are commented out and labeled with KEY LINE or KEY PARTS.

Version-Release number of selected component (if applicable):
python-2.4.3-19.el5

How reproducible:
Always on PPC64

Steps to Reproduce:
1. untar the attached test
2. make
3. run with ./test.py
  
Actual results:
Setattr tests work fine.
Getattr tests currently segfault at the end of the function, but uncommenting
any of the KEY areas causes the segfault to happen sooner (the trigger seems to
be setting any variable; the code originally called another function and never
got into that function).

Expected results:
Neither setattr nor getattr tests should segfault (with everything uncommented)

Additional info:

Comment 1 Kylene J Hall 2007-03-06 01:45:50 UTC
Created attachment 149313
Test case

Comment 2 Jeremy Katz 2007-03-06 02:04:28 UTC
Do you have a backtrace from the segfault and an strace of when it occurs?

Comment 3 Kylene J Hall 2007-03-06 17:22:27 UTC
Here is the backtrace I have so far (I am going to rebuild the python rpm to get
an unstripped version and try again with that). The strace output is attached.
The strace won't complete and can't be interrupted with Ctrl-C. It is CPU bound,
running at 96%.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 268380176 (LWP 1883)]
0x0fb73ad4 in PyCode_Addr2Line () from /usr/lib/libpython2.4.so.1.0
(gdb) bt
#0  0x0fb73ad4 in PyCode_Addr2Line () from /usr/lib/libpython2.4.so.1.0
#1  0x0fb9bb98 in PyTraceBack_Here () from /usr/lib/libpython2.4.so.1.0
#2  0x0fb6d514 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#3  0x0fb70ce0 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#4  0x0fb727fc in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#5  0x0fb728e4 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0
#6  0x0fb952fc in Py_CompileString () from /usr/lib/libpython2.4.so.1.0
#7  0x0fb96dac in PyRun_SimpleFileExFlags () from /usr/lib/libpython2.4.so.1.0
#8  0x0fb97600 in PyRun_AnyFileExFlags () from /usr/lib/libpython2.4.so.1.0
#9  0x0fb9ec7c in Py_Main () from /usr/lib/libpython2.4.so.1.0
#10 0x100016b4 in main ()


Comment 4 Kylene J Hall 2007-03-06 17:24:38 UTC
Created attachment 149360
partial strace output.

See previous comment for details about the strace output.

Comment 5 Kylene J Hall 2007-03-06 21:11:18 UTC
I was not able to get any more info from a rebuilt python because I get this
error when I attempt to run my code built with the rebuilt python:

Traceback (most recent call last):
  File "./test.py", line 2, in ?
    import c_test
  File "/root/test/c_test.py", line 5, in ?
    import _c_test
ImportError: /root/test/_c_test.so: R_PPC_REL24 relocation at 0x0fc04e38 for symbol `strlen' out of range


python was rebuilt from the src.rpm with rpmbuild -bb --target=ppc python.spec

Comment 6 Kylene J Hall 2007-03-08 16:57:02 UTC
Any updates?

Comment 7 Kylene J Hall 2007-03-12 20:29:48 UTC
[root/abat_r/SystemLow@hvracer3 mls]# rpm -V python
[root/abat_r/SystemLow@hvracer3 mls]# echo $?
0
[root/abat_r/SystemLow@hvracer3 mls]# rpm -V attr
[root/abat_r/SystemLow@hvracer3 mls]# echo $?
0
[root/abat_r/SystemLow@hvracer3 mls]# rpm -V libattr
[root/abat_r/SystemLow@hvracer3 mls]# echo $?
0


Comment 8 Kylene J Hall 2007-03-19 23:13:53 UTC
This was not fixed by updating to the latest lspp packages including the lspp.69
kernel.

Comment 9 Irina Boverman 2007-03-21 18:46:09 UTC
Per 3/20/2007 meeting with IBM, this bug is urgent, since it blocks IBM's
testing effort. 

Comment 10 George C. Wilson 2007-03-26 20:30:39 UTC
Can this be retested on the 70 kernel?

Comment 11 George C. Wilson 2007-04-02 20:17:25 UTC
klausk, please take a look at this.

Comment 12 Klaus Kiwi (Old account no longer used) 2007-04-02 20:51:37 UTC
I don't know whether this means the bug is fixed or not; I'll take a look at the
testcase later.

This is what I'm getting using the testcase in comment #1:
[root/sysadm_r/SystemLow@js21racer1 test]# make
gcc -o _test.o -c _test.c
swig -python _test.i
gcc -o _test_wrap.o -c -I/usr/include/python2.4 _test_wrap.c
gcc -shared -o _c_test.so _test_wrap.o _test.o
[root/sysadm_r/SystemLow@js21racer1 test]# ls
c_test.py  _c_test.so  Makefile  _test.c  _test.h  _test.i  _test.o  test.py 
_test_wrap.c  _test_wrap.o
[root/sysadm_r/SystemLow@js21racer1 test]# python test.py
Traceback (most recent call last):
  File "test.py", line 2, in ?
    import c_test
  File "/root/test/c_test.py", line 5, in ?
    import _c_test
ImportError: /root/test/_c_test.so: cannot restore segment prot after reloc:
Permission denied
[root/sysadm_r/SystemLow@js21racer1 test]#  

Comment 13 Klaus Kiwi (Old account no longer used) 2007-04-02 21:53:19 UTC
I thought I was doing something wrong (like compiling a 64 bit library for a
32-bit python), but I double checked and everything seems ok.

I get an AVC when trying to execute the testcase in enforcing mode:
type=AVC msg=audit(1175549201.361:456): avc:  denied  { execmod } for  pid=3527
comm="python" name="_c_test.so" dev=dm-0 ino=360692
scontext=abat_u:abat_r:abat_t:s0-s15:c0.c1023
tcontext=abat_u:object_r:sysadm_home_dir_t:s0 tclass=file
type=SYSCALL msg=audit(1175549201.361:456): arch=14 syscall=125 success=no
exit=-13 a0=f980000 a1=10000 a2=5 a3=ffd0054 items=0 ppid=3319 pid=3527 auid=503
uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts1 comm="python"
exe="/usr/bin/python" subj=abat_u:abat_r:abat_t:s0-s15:c0.c1023 key=(null)
type=AVC_PATH msg=audit(1175549201.361:456):  path="/root/test/_c_test.so"


*and* I get a segfault when trying the same thing in permissive mode:
[root/abat_r/SystemLow@js21racer1 test]# ./test.py
The following works:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  65538 1
Arg[2]:  0 1
Arg[3]:  257 1
In local process_native_result
FLAG:  0
RECREATE:  False
OP_PERMITTED
FLAG:  101
DESC:  Passed
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success

The following doesn't work:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  65538 1
Arg[2]:  0 1
Arg[3]:  258 1
In local process_native_result
FLAG:  0
OP_PERMITTED
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success
Segmentation fault


---same thing inside a gdb session--sorry no python dbg symbols available-----
[root/abat_r/SystemLow@js21racer1 test]# make clean
rm _test_wrap.c test.pyc _test.o _test_wrap.o c_test.py c_test.pyc _c_test.so
rm: cannot remove `test.pyc': No such file or directory
make: *** [clean] Error 1
[root/abat_r/SystemLow@js21racer1 test]# ./test.py
[root/abat_r/SystemLow@js21racer1 test]# CFLAGS='-D__PPC64 -D__PPC -m32 -g' make
gcc -Wall -D__PPC64 -D__PPC -m32 -g -o _test.o -c _test.c
swig -python -Wall _test.i
gcc -Wall -o _test_wrap.o -D__PPC64 -D__PPC -m32 -g -c -I/usr/include/python2.4
_test_wrap.c
gcc -Wall -shared -D__PPC64 -D__PPC -m32 -g -o _c_test.so _test_wrap.o _test.o
[root/abat_r/SystemLow@js21racer1 test]# gdb python
GNU gdb Red Hat Linux (6.5-16.el5rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "ppc64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) run test.py
Starting program: /usr/bin/python test.py
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 268380544 (LWP 3575)]
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
The following works:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  98307 1
Arg[2]:  0 1
Arg[3]:  257 1
In local process_native_result
FLAG:  0
RECREATE:  False
OP_PERMITTED
FLAG:  101
DESC:  Passed
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success

The following doesn't work:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  98307 1
Arg[2]:  0 1
Arg[3]:  258 1
In local process_native_result
FLAG:  0
OP_PERMITTED
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 268380544 (LWP 3575)]
0x0fb63ad4 in PyCode_Addr2Line () from /usr/lib/libpython2.4.so.1.0
(gdb) bt
#0  0x0fb63ad4 in PyCode_Addr2Line () from /usr/lib/libpython2.4.so.1.0
#1  0x0fb8bb98 in PyTraceBack_Here () from /usr/lib/libpython2.4.so.1.0
#2  0x0fb5d514 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#3  0x0fb60ce0 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0
#4  0x0fb627fc in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0
#5  0x0fb628e4 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0
#6  0x0fb852fc in Py_CompileString () from /usr/lib/libpython2.4.so.1.0
#7  0x0fb86dac in PyRun_SimpleFileExFlags () from /usr/lib/libpython2.4.so.1.0
#8  0x0fb87600 in PyRun_AnyFileExFlags () from /usr/lib/libpython2.4.so.1.0
#9  0x0fb8ec7c in Py_Main () from /usr/lib/libpython2.4.so.1.0
#10 0x100016b4 in main ()
(gdb)  

Comment 14 Klaus Kiwi (Old account no longer used) 2007-04-03 20:45:56 UTC
This is blocking the LSPP evaluation effort. Can we have the -debug package so
we can provide a better understanding of this segfault?

RH, could you please raise this BZ's severity? I would also like to be assigned
as the owner of this bug, as Kylene won't be able to follow it for some time.

Thanks

Comment 15 Steve Grubb 2007-04-03 21:21:09 UTC
I suspect the test case has some problem. I can't get it to run on i686 correctly:

OP_PERMITTED
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success
Traceback (most recent call last):
  File "./test.py", line 147, in ?
    getattr_sem(id)
  File "./test.py", line 75, in getattr_sem
    return (op_flag, description) #, audit_rec)
UnboundLocalError: local variable 'description' referenced before assignment


As for the AVC, this program is doing something naughty and has to be labeled
for selinux to allow it to run the way it wants to: 

chcon -t textrel_shlib_t   _c_test.so

I cannot change who reported the bug. Hope this helps.

Comment 16 Klaus Kiwi (Old account no longer used) 2007-04-05 17:08:49 UTC
Steve,

I don't have an i686 here, but I'm seeing similar behavior on s390x
(see the attachment below).

Comment 17 Klaus Kiwi (Old account no longer used) 2007-04-05 17:10:02 UTC
Created attachment 151784
strace output on s390x

Comment 18 Klaus Kiwi (Old account no longer used) 2007-04-05 17:23:06 UTC
Created attachment 151787
v2 of the testcase

Please check if this addresses the problem you were seeing

Comment 19 Klaus Kiwi (Old account no longer used) 2007-04-05 17:29:06 UTC
with the python-debuginfo package (ppc64, testcase v2):

(gdb) run test.py
Starting program: /usr/bin/python test.py
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 268380544 (LWP 26206)]
The following works:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  2818051 1
Arg[2]:  0 1
Arg[3]:  257 1
In local process_native_result
FLAG:  0
RECREATE:  False
OP_PERMITTED
FLAG:  101
DESC:  Passed
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success

The following doesn't work:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  2818051 1
Arg[2]:  0 1
Arg[3]:  258 1
In local process_native_result
FLAG:  0
OP_PERMITTED
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 268380544 (LWP 26206)]
PyEval_EvalFrame (f=0x10058540) at Python/ceval.c:2867
2867            if (frame->f_exc_type != NULL) {
(gdb) bt
#0  PyEval_EvalFrame (f=0x10058540) at Python/ceval.c:2867
#1  0x24222422 in ?? ()
#2  0x0fb427fc in PyEval_EvalCodeEx (co=0xf7f4f2e0, globals=<value optimized
out>, locals=<value optimized out>, args=0x0,
    argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at
Python/ceval.c:2736
#3  0x84222428 in ?? ()
#4  0x0fb428e4 in PyEval_EvalCode (co=0xf7f4ec60, globals=0xfbd8ebc,
locals=0x2d) at Python/ceval.c:484
#5  0x0fb652fc in run_node (n=<value optimized out>, filename=<value optimized
out>, globals=0xfbd8ebc, locals=0x2d,
    flags=<value optimized out>) at Python/pythonrun.c:1265
#6  0x0fb66dac in PyRun_SimpleFileExFlags (fp=<value optimized out>,
filename=0xffabfc3c "test.py", closeit=1, flags=0xffabf818)
    at Python/pythonrun.c:860
#7  0x0fb67600 in PyRun_AnyFileExFlags (fp=0xf7f4ec60, filename=0xfbd8ebc "",
closeit=1, flags=0x6) at Python/pythonrun.c:664
#8  0x0fb6ec7c in Py_Main (argc=2, argv=0xffabfb04) at Modules/main.c:493
#9  0x22000422 in ?? ()
#10 0x100016b4 in main ()
(gdb)   

Thanks,

 Klaus K.   

Comment 20 George C. Wilson 2007-04-09 20:27:24 UTC
sgrubb says: haven't been able to reproduce this one. One problem is that it
crashes on i386, but that is not a platform in the TOE. Hard to tell if this is
a kernel, python, or testcase bug. Crashes in a slightly different way on arch
it is supposed to work on.

Klaus Kiwi, please provide an strace. IBM, can you help look at this?

Linda Knippers: I tried v2 of the test on i386 and it worked. Can you try again
with that?

Comment 21 Miloslav Trmač 2007-04-09 22:03:09 UTC
The library should be compiled with -fPIC, then textrel_shlib_t won't be necessary.

Comment 22 George C. Wilson 2007-04-10 15:42:59 UTC
Klaus Kiwi, the thought is that this is still a testcase problem. Nobody else is
reporting this problem on any other platform. Can you or somebody from
the IBM side look at this? Can you check on #ppc64?

Red Hat, please tell us specifically what is questionable about the testcase.
We want to help.

Comment 23 Klaus Kiwi (Old account no longer used) 2007-04-10 18:44:42 UTC
(forgive me if the below sounds foolish):

1) Why is the header file /usr/include/asm-generic/ipc.h different between
the x86_64/i386 and ppc archs?

2) Why can't I include the i386 <asm-generic/ipc.h> without compile errors?



Comment 24 Klaus Kiwi (Old account no longer used) 2007-04-10 18:48:29 UTC
I'm trying to get this code to compile on my local i386 system, without success.

The first thing is that I can't include the <asm-generic/ipc.h> file without
running into errors.

Things get worse when I try to define the symbols used in asm-generic/ipc.h
myself in my _test.h file (mimicked from the ipc.h header) and compile: swig
generates code with many errors in this case.

Can anyone help me get this testcase to run on i386?

Comment 25 Linda Knippers 2007-04-10 18:56:48 UTC
klausk, I built and ran the v2 test on an i386 system without any
problems, at least I think I did.  I typed 'make' and then
ran test.py as below.   I'm not sure what the results should
look like.  Are you seeing something different?

[root/lspp_test_r/SystemLow@kipper test]# make
gcc  -Wall -o _test.o -c _test.c
swig -Wall -python _test.i
gcc  -Wall -o _test_wrap.o -c -I/usr/include/python2.4 _test_wrap.c
gcc -Wall  -shared -o _c_test.so _test_wrap.o _test.o
chcon -t textrel_shlib_t _c_test.so
[root/lspp_test_r/SystemLow@kipper test]# ./test.py
The following works:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  32769 1
Arg[2]:  0 1
Arg[3]:  257 1
In local process_native_result
FLAG:  0
RECREATE:  False
OP_PERMITTED
FLAG:  101
DESC:  Passed
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success

The following doesn't work:

Result:  0
Error:  0
Syscall:  117
Arg[0]:  3 1
Arg[1]:  32769 1
Arg[2]:  0 1
Arg[3]:  258 1
In local process_native_result
FLAG:  0
OP_PERMITTED
Audit arg setup
Before gen_audit_sys_record_tuple 0
After gen_audit_sys_record_tuple for success


Comment 26 Miloslav Trmač 2007-04-11 00:28:19 UTC
Created attachment 152188
Fix semctl usage

The test case is invalid. As you can see in (man 3p semctl), the fourth
argument of semctl() must be a "union semun". The testcase passes "buf", a
pointer, instead.
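
For illustration only, here is a minimal sketch of a call that matches the man page: the "struct semid_ds" pointer is stored in a "union semun" and the union itself is passed as the fourth argument. It assumes Linux with glibc, where the caller has to define "union semun". The helper names sem_getattr() and sem_setattr() are hypothetical; they are not taken from the attached test case or from the patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: Linux/glibc, where the caller defines union semun itself. */
union semun {
    int val;                /* value for SETVAL */
    struct semid_ds *buf;   /* buffer for IPC_STAT and IPC_SET */
    unsigned short *array;  /* array for GETALL and SETALL */
};

/* Hypothetical helper: read the attributes of a semaphore set into *out.
 * Returns 0 on success, -1 with errno set on failure. */
int sem_getattr(int semid, struct semid_ds *out)
{
    union semun arg;

    assert(out != NULL);
    memset(out, 0, sizeof *out);   /* start from a known state */
    arg.buf = out;                 /* wrap the pointer in the union */
    return semctl(semid, 0, IPC_STAT, arg);   /* pass the union, not the pointer */
}

/* Hypothetical helper: apply the owner and mode fields taken from *in. */
int sem_setattr(int semid, struct semid_ds *in)
{
    union semun arg;

    assert(in != NULL);
    arg.buf = in;
    return semctl(semid, 0, IPC_SET, arg);
}
```

A caller would create a set with semget(), call sem_getattr() to fill a "struct semid_ds", optionally change sem_perm.mode and call sem_setattr(), and finally remove the set with IPC_RMID.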

On many architectures both calls compile to the same code, but not on ppc.  On
ppc, structs and unions are passed by "implicit reference", as a pointer to the
struct/union.  So, the "buf" pointer is interpreted as a "union semun *", which
is dereferenced to read "((union semun *)buf)->buf", the first word of "buf",  
as a "struct semid_ds *" pointer.  Because buf is uninitialized, the word
contains some value left on the stack from previous function calls; the value
happened to point to internal Python data structures, corrupting them.
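
The misinterpretation described above can be modeled in portable C without calling semctl() at all. This is only a sketch under stated assumptions: struct fake_semid_ds, union fake_semun and buf_seen_by_callee() are hypothetical stand-ins, and the callee is written to mimic an ABI that passes the union by implicit reference, as the comment says ppc does. It does not reproduce the real ppc calling convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct semid_ds; only its first word matters. */
struct fake_semid_ds {
    uintptr_t first_word;
    long rest[8];
};

/* Same shape as union semun, using the stand-in struct. */
union fake_semun {
    int val;
    struct fake_semid_ds *buf;
    unsigned short *array;
};

/* Models a callee that receives the union by implicit reference: it reads a
 * pointer-sized "buf" member from wherever arg_ref points. */
uintptr_t buf_seen_by_callee(const void *arg_ref)
{
    struct fake_semid_ds *buf;

    assert(arg_ref != NULL);
    memcpy(&buf, arg_ref, sizeof buf);   /* read the union's buf member */
    return (uintptr_t)buf;
}
```

When the caller passes the address of a properly filled union, the callee sees the address of the struct. When the caller passes the struct pointer itself, the callee sees whatever happens to be stored in the struct's first word and would then write through that value.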

This patch fixes the semctl () calls.

Comment 27 Miloslav Trmač 2007-04-11 00:31:02 UTC
Created attachment 152189
Don't reference uninitialized variables in the Python code

Fixing the C code reveals further bugs in the Python module.  After applying
this patch, the test doesn't crash any more and completes successfully.

Comment 28 Klaus Kiwi (Old account no longer used) 2007-04-11 16:20:48 UTC
Miloslav, thank you for showing us the misuse in our testcase. Everything seems
fine now!
 Klaus

