Bug 674206 - python ctypes segmentation fault
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: python
Version: 6.0
Hardware: ppc64 Linux
Priority: high
Severity: high
Target Milestone: rc
Assigned To: Dave Malcolm
QA Contact: BaseOS QE - Apps

Reported: 2011-01-31 18:50 EST by Jeff Bastian
Modified: 2011-02-07 18:17 EST (History)
Doc Type: Bug Fix
Last Closed: 2011-02-07 14:38:03 EST

Attachments:
python core dump (1.32 MB, application/x-bzip2)
2011-01-31 18:53 EST, Jeff Bastian

Description Jeff Bastian 2011-01-31 18:50:46 EST
Description of problem:
Python crashes with a segmentation fault when using the ctypes module on RHEL 6 ppc64.  The same script works fine on x86_64 and i686, so the problem appears to be ppc64-specific.


[root@localhost ~]# cat a.py
#!/usr/bin/python

import ctypes
libc = ctypes.cdll.LoadLibrary('libc.so.6')
libc.malloc.restype = ctypes.c_void_p
memarr = libc.malloc(1024)
libc.free(memarr)

[root@localhost ~]# ./a.py 
Segmentation fault (core dumped)

Version-Release number of selected component (if applicable):
python-2.6.5-3.el6.ppc64

How reproducible:
every time

Steps to Reproduce:
1. create a.py as shown above and run it
  
Actual results:
python seg faults

Expected results:
python doesn't crash

Additional info:
Comment 1 Jeff Bastian 2011-01-31 18:53:14 EST
Created attachment 476290 [details]
python core dump

A backtrace from the attached core:

(gdb) bt
#0  __libc_free (mem=0xa5beec0) at malloc.c:3709
#1  0x00000fff7bd784b4 in .ffi_call_LINUX64 () at src/powerpc/linux64.S:103
#2  0x00000fff7bd78390 in ffi_call (cif=0xfffd4ccdc00, 
    fn=<value optimized out>, rvalue=<value optimized out>, avalue=
    0xfffd4ccdb90) at src/powerpc/ffi.c:910
#3  0x00000fff7bdaa16c in _call_function_pointer (pProc=
    @0xfff82a84280: 0xfff82965b80 <__libc_free>, 
    argtuple=<value optimized out>, flags=<value optimized out>, 
    argtypes=<value optimized out>, restype=
    <_ctypes.SimpleType at remote 0x1000a5c8260>, checker=0x0)
    at /usr/src/debug/Python-2.6.5/Modules/_ctypes/callproc.c:816
#4  _CallProc (pProc=@0xfff82a84280: 0xfff82965b80 <__libc_free>, 
    argtuple=<value optimized out>, flags=<value optimized out>, 
    argtypes=<value optimized out>, restype=
    <_ctypes.SimpleType at remote 0x1000a5c8260>, checker=0x0)
    at /usr/src/debug/Python-2.6.5/Modules/_ctypes/callproc.c:1163
#5  0x00000fff7bda0e08 in CFuncPtr_call (self=0xfff82886e20, 
    inargs=<value optimized out>, kwds=0x0)
    at /usr/src/debug/Python-2.6.5/Modules/_ctypes/_ctypes.c:3860
#6  0x00000fff82ccb59c in PyObject_Call (func=
    <_FuncPtr(__name__='free') at remote 0xfff82886e20>, 
    arg=<value optimized out>, kw=<value optimized out>)
    at Objects/abstract.c:2492
#7  0x00000fff82d9cc70 in do_call (f=
    Frame 0x1000a533580, for file ./a.py, line 7, in <module> (), 
    throwflag=<value optimized out>) at Python/ceval.c:3968
#8  call_function (f=
    Frame 0x1000a533580, for file ./a.py, line 7, in <module> (), 
    throwflag=<value optimized out>) at Python/ceval.c:3773
#9  PyEval_EvalFrameEx (f=
    Frame 0x1000a533580, for file ./a.py, line 7, in <module> (), 
    throwflag=<value optimized out>) at Python/ceval.c:2412
#10 0x00000fff82d9fff0 in PyEval_EvalCodeEx (co=0xfff826a23f0, 
    globals=<value optimized out>, locals=<value optimized out>, 
    args=<value optimized out>, argcount=<value optimized out>, 
    kws=<value optimized out>, kwcount=<value optimized out>, defs=0x0, 
    defcount=0, closure=0x0) at Python/ceval.c:3000
#11 0x00000fff82da00c0 in PyEval_EvalCode (co=<value optimized out>, 
    globals=<value optimized out>, locals=<value optimized out>)
    at Python/ceval.c:541
#12 0x00000fff82dc388c in run_mod (mod=<value optimized out>, 
    filename=<value optimized out>, globals=
    {'memarr': 1099685424832, '__builtins__': <module at remote 0xfff82876868>, '__file__': './a.py', '__package__': None, 'libc': <CDLL(_FuncPtr=<_ctypes.CFuncPtrType at remote 0x1000a5c0260>, malloc=<_FuncPtr(__name__='malloc') at remote 0xfff82886d50>, free=<_FuncPtr(__name__='free') at remote 0xfff82886e20>, _handle=17590088062792, _name='libc.so.6') at remote 0xfff826ba690>, 'ctypes': <module at remote 0xfff826b8ad0>, '__name__': '__main__', '__doc__': None}, locals=
    {'memarr': 1099685424832, '__builtins__': <module at remote 0xfff82876868>, '__file__': './a.py', '__package__': None, 'libc': <CDLL(_FuncPtr=<_ctypes.CFuncPtrType at remote 0x1000a5c0260>, malloc=<_FuncPtr(__name__='malloc') at remote 0xfff82886d50>, free=<_FuncPtr(__name__='free') at remote 0xfff82886e20>, _handle=17590088062792, _name='libc.so.6') at remote 0xfff826ba690>, 'ctypes': <module at remote 0xfff826b8ad0>, '__name__': '__main__', '__doc__': None}, 
    flags=<value optimized out>, arena=<value optimized out>)
    at Python/pythonrun.c:1339
#13 0x00000fff82dc39e8 in PyRun_FileExFlags (fp=0x1000a4eb180, filename=
    0xfffd4ccf7ff "./a.py", start=<value optimized out>, globals=
    {'memarr': 1099685424832, '__builtins__': <module at remote 0xfff82876868>, '__file__': './a.py', '__package__': None, 'libc': <CDLL(_FuncPtr=<_ctypes.CFuncPtrType at remote 0x1000a5c0260>, malloc=<_FuncPtr(__name__='malloc') at remote 0xfff82886d50>, free=<_FuncPtr(__name__='free') at remote 0xfff82886e20>, _handle=17590088062792, _name='libc.so.6') at remote 0xfff826ba690>, 'ctypes': <module at remote 0xfff826b8ad0>, '__name__': '__main__', '__doc__': None}, locals=
    {'memarr': 1099685424832, '__builtins__': <module at remote 0xfff82876868>, '__file__': './a.py', '__package__': None, 'libc': <CDLL(_FuncPtr=<_ctypes.CFuncPtrType at remote 0x1000a5c0260>, malloc=<_FuncPtr(__name__='malloc') at remote 0xfff82886d50>, free=<_FuncPtr(__name__='free') at remote 0xfff82886e20>, _handle=17590088062792, _name='libc.so.6') at remote 0xfff826ba690>, 'ctypes': <module at remote 0xfff826b8ad0>, '__name__': '__main__', '__doc__': None}, 
    closeit=<value optimized out>, flags=0xfffd4cce5dc)
    at Python/pythonrun.c:1325
#14 0x00000fff82dc5658 in PyRun_SimpleFileExFlags (fp=0x1000a4eb180, filename=
    0xfffd4ccf7ff "./a.py", closeit=<value optimized out>, flags=0xfffd4cce5dc)
    at Python/pythonrun.c:935
#15 0x00000fff82dc61d4 in PyRun_AnyFileExFlags (fp=0x1000a4eb180, filename=
    0xfffd4ccf7ff "./a.py", closeit=<value optimized out>, flags=0xfffd4cce5dc)
    at Python/pythonrun.c:739
#16 0x00000fff82dd62f8 in Py_Main (argc=<value optimized out>, 
    argv=<value optimized out>) at Modules/main.c:572
#17 0x00000000100007d0 in main (argc=<value optimized out>, 
    argv=<value optimized out>) at Modules/python.c:23
Comment 2 Jeff Bastian 2011-01-31 19:16:43 EST
Side-note: the test program works okay for me with Fedora 12 on a 32-bit ppc (an old Mac Mini G4) and python-2.6.2-8.fc12.ppc
Comment 3 Jeff Bastian 2011-02-01 19:36:04 EST
I tried casting memarr as a c_void_p before calling libc.free and now it works okay on ppc64:

[root@ibm-js22-vios-02-lp1 tmp]# rpm -q python
python-2.6.5-3.el6.ppc64

[root@ibm-js22-vios-02-lp1 tmp]# diff a.py b.py
7c7
< libc.free(memarr)
---
> libc.free(ctypes.cast(memarr, ctypes.c_void_p))

[root@ibm-js22-vios-02-lp1 tmp]# ./a.py
Segmentation fault (core dumped)

[root@ibm-js22-vios-02-lp1 tmp]# ./b.py
[root@ibm-js22-vios-02-lp1 tmp]#


Should Python be able to gracefully handle this?  Given the unique nature of ctypes's functionality, I'm not sure a seg fault can be avoided here.
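
A minimal sketch of the workaround described above, for reference. It assumes a Linux-like system where ctypes.CDLL(None) exposes the already-loaded C library (the report loads 'libc.so.6' by name instead). Wrapping the integer address in a pointer-sized ctypes object, either with ctypes.c_void_p(...) or ctypes.cast(...), means the call passes a full-width void* rather than a guessed C int:

```python
import ctypes

# Assumption: CDLL(None) gives access to the C library already loaded into
# the process; the report itself loads 'libc.so.6' by name.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p      # same as in a.py

# malloc returns a plain Python integer address because of the restype above.
memarr = libc.malloc(ctypes.c_size_t(1024))

# Both forms wrap the integer in a pointer-sized ctypes object.
as_void_p = ctypes.c_void_p(memarr)
as_cast = ctypes.cast(memarr, ctypes.c_void_p)
same_address = (as_void_p.value == as_cast.value == memarr)

# The wrapped object is passed as a void*, not as a default C int.
libc.free(as_void_p)
```

This only sidesteps the default conversion for one call; it does not declare the prototype of free.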
Comment 4 Dave Malcolm 2011-02-01 20:14:42 EST
I was only able to reproduce this on my ppc64 RHEL6 box by upping the allocation somewhat, to 1024000 bytes.

Having done that, I was able to fix the crash on ppc64 by adding these lines:
 import ctypes
 libc = ctypes.cdll.LoadLibrary('libc.so.6')
 libc.malloc.restype = ctypes.c_void_p
+libc.malloc.argtypes = [ctypes.c_size_t]
+libc.free.argtypes = [ctypes.c_void_p]
 memarr = libc.malloc(1024000)
 libc.free(memarr)

Does a change like this fix things for the customer?

I strongly recommend manually supplying the "argtypes" attribute for functions:
  http://docs.python.org/release/2.6.6/library/ctypes.html#specifying-the-required-argument-types-function-prototypes

and ensuring that the argument types correspond to those from the prototype in the C header files, based on the table here:
  http://docs.python.org/release/2.6.6/library/ctypes.html#fundamental-data-types
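
As a rough illustration of that recommendation, the sketch below declares both prototypes from the C headers before making any call. The helper name fill_and_free is hypothetical, and ctypes.CDLL(None) is an assumption that holds on Linux-like systems (the report loads 'libc.so.6' by name):

```python
import ctypes

# Assumption: CDLL(None) exposes the C library already loaded into the
# process (Linux and similar); the report loads 'libc.so.6' by name instead.
libc = ctypes.CDLL(None)

# void *malloc(size_t size);
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p

# void free(void *ptr);
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def fill_and_free(nbytes, fill=0x5A):
    """Hypothetical helper: malloc a buffer, fill it, read one byte, free it."""
    ptr = libc.malloc(nbytes)
    if not ptr:  # a c_void_p restype maps NULL to None
        raise MemoryError("malloc(%d) failed" % nbytes)
    try:
        ctypes.memset(ptr, fill, nbytes)
        return ctypes.string_at(ptr, 1)
    finally:
        # With argtypes set, the integer address is converted to void*.
        libc.free(ptr)


first_byte = fill_and_free(1024000)
```

With both prototypes declared, the integer returned by malloc is converted back to a full-width pointer when it is handed to free.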

Unfortunately, ctypes doesn't have any type-safety mechanism beyond "argtypes", and getting the "argtypes" even slightly wrong can also crash Python processes.

Unless "argtypes" is set, ctypes does not "know" what the types of the imported C functions are - it can only make a guess, based on the Python types of the supplied arguments.
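
A small sketch of what "argtypes" buys in practice, under the assumption that ctypes.CDLL(None) exposes libc (true on Linux-like systems): once the prototype is declared, ctypes refuses an argument it cannot convert and raises ctypes.ArgumentError before the C function is ever called. The variable names here are illustrative only:

```python
import ctypes

# Assumption: CDLL(None) exposes the C library already loaded into the process.
libc = ctypes.CDLL(None)
libc.malloc.argtypes = [ctypes.c_size_t]
libc.malloc.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None

# A string is not convertible to size_t, so the call is rejected up front.
try:
    libc.malloc("1024")
    rejected = False
except ctypes.ArgumentError:
    rejected = True

# A well-typed call still works as usual.
ptr = libc.malloc(1024)
got_pointer = ptr is not None
libc.free(ptr)
```

This check only covers conversions ctypes can detect; a prototype that is declared incorrectly is still passed through to C unchecked.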

By default, ctypes treats a Python argument of type "int" as being of C type "signed int", whereas malloc expects a size_t.

From http://docs.python.org/library/ctypes.html
"Python integers and Python longs are passed as the platforms default C int type, their value is masked to fit into the C type."

I did some debugging to try to confirm this diagnosis:

The returned object is of Python type "int"
>>> type(memarr)
<type 'int'>

which means that "free" is called with a "signed int", rather than a "void *".

The ob_ival of a PyIntObject is 8 bytes.  However, the default argument handling in ctypes (within the call to "free") for a Python "int" is ffi_type_sint, and my reading of <ffi.h> is that on ppc64 this is ffi_type_sint32, so the pointer is truncated to a 4-byte signed int.  The call to "free" thus receives a corrupted 64-bit value, and glibc's attempt to update the heap makes it write to an invalid area of memory, leading to the segmentation fault.
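
The truncation can be checked against the numbers in the core dump from Comment 1. This is only a sketch of the arithmetic, assuming a platform where C int is 32 bits and pointers are 64 bits; it does not call into libc at all:

```python
import ctypes

# Value of 'memarr' shown in the globals dump of the backtrace (Comment 1).
memarr = 1099685424832              # 0x1000a5beec0

# ctypes masks an out-of-range integer to fit the C type without raising.
as_c_int = ctypes.c_int(memarr).value

# A pointer-sized type keeps every bit (on a 64-bit platform).
as_void_p = ctypes.c_void_p(memarr).value

print(hex(memarr), hex(as_c_int), hex(as_void_p))
```

The masked value is 0xa5beec0, which is the same address that frame #0 of the backtrace shows being passed to __libc_free (mem=0xa5beec0), while the original pointer was 0x1000a5beec0.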

Hope this is helpful
Comment 5 Ben 2011-02-02 18:42:29 EST
Comment from IBM:

------- Comment From caio@linux.vnet.ibm.com 2011-02-02 18:20 EDT-------
Thanks for the thorough information. We had a similar workaround for an issue posted regarding python-augeas (ID 16731, External Bug ID 00404938), but yours is definitely better.

However, the reason for this open bug is the different behavior of this simple ctypes usage when compared to other distros and arches. I've verified that on Python 2.6 (r26:66714, May  5 2010, 22:50:28) on PPC64 it works without crashing the python interpreter.

The question is this: something changed internally between the working revision and the one on RHEL6 (Python 2.6.5 (r265:79063, Jul 14 2010, 14:12:47)) that now makes this crash. Was this expected? Will this be permanent?

That is, should every upstream package that uses ctypes, supports ppc64, and relies on the implicit conversion between a Python integer and void* (in both directions) update its code to avoid this segfault?

Thank you!
