Python segfaults when running python-jinja2 test suite - see http://s390.koji.fedoraproject.org/koji/taskinfo?taskID=198496 for the logs. Version-Release number of selected component (if applicable): python-2.7-7.fc14.s390x PS: I don't expect a lot of activity here as it's s390x specific, but I should be able to provide an environment to debug this issue. The main purpose of this bug is to track the issue.
Thanks for filing this; are you able to provide a backtrace? (with debuginfo installed please!) Note that: # gdb --args python ARGS FOR PYTHON PATH_TO_SCRIPT ARGS FOR SCRIPT is a convenient way of invoking arbitrary python scripts under gdb
Created attachment 441039 [details] backtrace backtrace generated with "gdb --args python ./setup.py test"
In frame #6 it's claiming RAM for a new reference-owning object, and it happens that the run-every-700 (IIRC) allocations garbage collector is invoked. In frame #1 the garbage collector is visiting a PyTracebackObject, and it looks like one of the fields of that object is invalid (a pointer has value 0xffffffffffffff) I'd need access to the machine to be able to debug this further.
On the machine, I was able to reproduce it within gdb: (gdb) up #1 0x0000020000169372 in tb_traverse (tb=0x8087ef48, visit=0x20000174c2c <visit_decref>, arg=0x0) at /usr/src/debug/Python-2.7/Python/traceback.c:36 36 Py_VISIT(tb->tb_next); (gdb) p tb $1 = (PyTracebackObject *) 0x8087ef48 (gdb) p *tb $2 = {ob_refcnt = 2, ob_type = 0x20000220ef0, tb_next = 0xffffffffffffff, tb_frame = 0xff0000008095dc48, tb_lasti = 147, tb_lineno = 459} tb->tb_next and tb->tb_frame are both invalid tb->ob_type is correct (tp_name is "traceback"), and indeed equals &PyTraceBack_Type (gdb) p &PyTraceBack_Type $6 = (PyTypeObject *) 0x20000220ef0 I attempted to use a hardware watchpoint to see how the fields were manipulated. (gdb) watch ((PyTracebackObject *) 0x8087ef48)->tb_next and tried to restrict the watchpoint to just when the memory is a traceback object: (gdb) condition 1 ((PyTracebackObject*)0x8087ef48)->ob_type == 0x20000220ef0 (gdb) display *((PyTracebackObject*)0x8087ef48) This identified the problem: jinja2 is using ctypes. Specifically, Jinja2-2.5.2/jinja2/debug.py:_init_ugly_crap attempts to use ctypes to directly manipulate C-level fields of the tracebacks, embedding numerous assumptions about the in-memory representation of CPython's tracebacks. There are any number of ways that this can go wrong; it may be that it's getting the 32 vs 64 bit determination wrong, or perhaps something else. The specific backtrace of where the tb_next field become corrupted is here (it's trying to directly manipulate the reference count of the object, but misses and hits the tb_next field): #0 l_set (ptr=0x8087ef59, value=<value optimized out>, size=<value optimized out>) at /usr/src/debug/Python-2.7/Modules/_ctypes/cfield.c:813 #1 0x00000200068fa710 in PyCData_set (dst=<_Traceback at remote 0x808faa48>, type=<_ctypes.PyCSimpleType at remote 0x808a2f68>, setfunc=0, value=-1, index=2, size=8, ptr=0x8087ef59 "\377\377\377\377\377\377\377\377") at /usr/src/debug/Python-2.7/Modules/_ctypes/_ctypes.c:2936 #2 0x00000200068ffd7a in PyCField_set (self=<value optimized out>, inst=<value optimized out>, value=<value optimized out>) at /usr/src/debug/Python-2.7/Modules/_ctypes/cfield.c:214 #3 0x00000200000cfd62 in PyObject_GenericSetAttr (obj=<_Traceback at remote 0x808faa48>, name='ob_refcnt', value=-1) at /usr/src/debug/Python-2.7/Objects/object.c:1492 #4 0x00000200000cf65e in PyObject_SetAttr (v=<_Traceback at remote 0x808faa48>, name='ob_refcnt', value=-1) at /usr/src/debug/Python-2.7/Objects/object.c:1245 #5 0x0000020000138ea4 in PyEval_EvalFrameEx (f= Frame 0x808dd2f8, for file /root/fedora/python-jinja2/Jinja2-2.5.2/jinja2/debug.py, line 289, in tb_set_next (tb=<traceback at remote 0x80907908>, next=None, obj=<_Traceback at remote 0x808fa948>, old=<_Traceback at remote 0x808faa48>), throwflag=<value optimized out>) at /usr/src/debug/Python-2.7/Python/ceval.c:2059 #6 0x000002000013f842 in PyEval_EvalCodeEx (co=0x80888030, globals=<value optimized out>, locals=<value optimized out>, args=<value optimized out>, argcount=<value optimized out>, kws=0x80890758, kwcount=0, defs=0x0, defcount=0, closure= (<cell at remote 0x808835e8>, <cell at remote 0x80883eb8>, <cell at remote 0x808830a8>)) at /usr/src/debug/Python-2.7/Python/ceval.c:3311 #7 0x000002000013d4cc in fast_function (f= Frame 0x808905b8, for file /root/fedora/python-jinja2/Jinja2-2.5.2/jinja2/debug.py, line 36, in _set_tb_next (self=<TracebackFrameProxy(tb=<traceback at remote 0x80907908>) at remote 0x808f5400>, next=None), throwflag=<value optimized out>) at /usr/src/debug/Python-2.7/Python/ceval.c:4167 #8 call_function (f= Frame 0x808905b8, for file /root/fedora/python-jinja2/Jinja2-2.5.2/jinja2/debug.py, line 36, in _set_tb_next (self=<TracebackFrameProxy(tb=<traceback at remote 0x80907908>) at remote 0x808f5400>, next=None), throwflag=<value optimized out>) at /usr/src/debug/Python-2.7/Python/ceval.c:4092 #9 PyEval_EvalFrameEx (f= Frame 0x808905b8, for file /root/fedora/python-jinja2/Jinja2-2.5.2/jinja2/debug.py, line 36, in _set_tb_next (self=<TracebackFrameProxy(tb=<traceback at remote 0x80907908>) at remote 0x808f5400>, next=None), throwflag=<value optimized out>) at /usr/src/debug/Python-2.7/Python/ceval.c:2721
Printing _PyObject._fields_ within _init_ugly_crap gives this output: [('_ob_next', <class 'jinja2.debug.LP__PyObject'>), ('_ob_prev', <class 'jinja2.debug.LP__PyObject'>), ('ob_refcnt', <class 'ctypes.c_long'>), ('ob_type', <class 'jinja2.debug.LP__PyObject'>)] Jinja is mistakenly deciding that this is a debug build of python, whereas it's an optimized build, which doesn't have the _ob_next or _ob_prev fields. Hence when Jinja attempts to manipulate ob_refcnt, it pokes at addresses that are (sizeof(PyObject*) * 2) bytes higher up in RAM than they ought to be, corrupting the later fields.
Adding some more debug prints to jinja indicates: object.__basicsize__: 0 ctypes.sizeof(_PyObject): 16 hence they're non-equal, and the conditional "if object.__basicsize__ != ctypes.sizeof(_PyObject):" is executed, and it treats it erroneously as a --with-pydebug build of Python.
Not yet sure why object.__basicsize__ == 0, I'd expect it to be 16. If I change that conditional to "if False:", then the jinja selftest suite runs to completion.
Looks like an int vs Py_ssize_t issue. struct _typeobject declares tp_basicsize and tp_itemsize as Py_ssize_t: { ... Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */ ... } but type_members defines them as T_INT: {"__basicsize__", T_INT, offsetof(PyTypeObject,tp_basicsize),READONLY}, {"__itemsize__", T_INT, offsetof(PyTypeObject, tp_itemsize), READONLY}, Hence when accessing "object.__basicsize__", PyMember_GetOne reads it as a T_INT, which gets it as 0 (incorrect). Accessing it as Py_ssize_t reads it as 16 (correct) (gdb) p *(Py_ssize_t*)addr $9 = 16 (gdb) p *(int*)addr $10 = 0 Looks like a Python bug (albeit seen in what I think is a dubious use of ctypes by Jinja): I believe access of a type's __basicsize__ __itemsize__ fields will return 0 on 64-bit bigendian machines. Looking at upstream sources, seems to still affect typeobject.c in latest upstream 2.7 still uses T_INT; and object.h still uses Py_ssize_t for these values. Will see if I can create a minimal reproducer. In the meantime, see comment #7 for a way to patch Jinja to build
I've filed a bug about this: http://bugs.python.org/issue9688 and attached a patch there.
That is an awesome bug report guys. Major props for diving into that. Now obviously the reason why Jinja2 is doing that is to produce proper tracebacks for generated code because Python lacks an interface for that. I think a proper solution for redhat would be to use the switch that opts-in the c implementation of that function (new in 2.5.2, previously that switch was --with-speedups which is still available but gives a deprecation warning): python setup.py --with-debugsupport install I am currently trying to figure out a more reliable way to detect debug builds that does not involve basicsize. Once I have a proper solution I send you a patch. In any way I will do a 2.5.3 release that fixes that.
(In reply to comment #10) > I am currently trying to figure out a more reliable way to detect debug builds > that does not involve basicsize. Once I have a proper solution I send you a > patch. In any way I will do a 2.5.3 release that fixes that. From Python/sysmodule.c: #ifdef Py_TRACE_REFS {"getobjects", _Py_GetObjects, METH_VARARGS}, #endif So hasattr(sys, 'getobjects') will only be true on CPython builds that had Py_TRACE_REFS, which IIRC is the condition you're looking for (presence of _ob_next/_ob_prev ptrs). (Obviously the better fix is for CPython to somehow expose more of the PyTracebackObject to .py code)
This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX. (Please note: Our normal process is to give advanced warning of this occurring, but we forgot to do that. A thousand apologies.) Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version. Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping