Bug 462807 - koji dumps core in todays rawhide
Status: CLOSED DUPLICATE of bug 462671
Product: Fedora
Classification: Fedora
Component: pyOpenSSL
Version: rawhide
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Paul F. Johnson
QA Contact: Fedora Extras Quality Assurance
Blocks: F10Blocker/F10FinalBlocker
Reported: 2008-09-18 21:27 EDT by Matthias Clasen
Modified: 2008-09-23 19:31 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-09-23 19:31:42 EDT


Attachments
correct threadsafe patch (3.38 KB, patch)
2008-09-19 18:34 EDT, Michael Schwendt

Description Matthias Clasen 2008-09-18 21:27:12 EDT
[mclasen@localhost devel]$ make build
/usr/bin/koji  build  dist-f10 'cvs://cvs.fedoraproject.org/cvs/pkgs?rpms/librsvg2/devel#librsvg2-2_22_2-2_fc10'
ERROR: ctx->tstate == NULL!
Fatal Python error: PyEval_RestoreThread: NULL tstate
make: *** [koji] Aborted (core dumped)


This is with 

[mclasen@localhost devel]$ rpm -q koji python
koji-1.2.6-1.fc10.noarch
python-2.5.1-30.fc10.i386
Comment 1 Mike McLean 2008-09-19 10:32:31 EDT
This is almost certainly a python bug.
Comment 2 James Antill 2008-09-19 10:59:59 EDT
I assume we load shared libraries into python for koji, at which point I'd say the bug is much more likely to be in one of them.
Comment 3 Mike McLean 2008-09-19 11:52:15 EDT
Good point.

Matthias - I can't seem to replicate this on a rawhide box here. If this replicates for you, can you help to pin it down? Is it any koji command or just koji build? See if 'koji call echo foo' triggers it.

Run the problematic koji command line under gdb and get a backtrace.
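
One way to capture such a backtrace without an interactive session is to run gdb in batch mode. The following is only a sketch: the helper name `gdb_backtrace_command` is hypothetical, and it assumes `gdb` and `python` are both on the PATH. It builds the command line and prints it rather than running it.

```python
import shlex


def gdb_backtrace_command(script, *script_args, interpreter="python"):
    """Build a gdb batch command that runs a script and dumps all thread backtraces."""
    return [
        "gdb", "--batch",
        "-ex", "run",                  # start the program under gdb
        "-ex", "thread apply all bt",  # after it stops, print every thread's stack
        "--args", interpreter, script, *script_args,
    ]


cmd = gdb_backtrace_command("/usr/bin/koji", "call", "echo", "foo")
print(" ".join(shlex.quote(part) for part in cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` to collect the output in a file.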
Comment 4 Matthias Clasen 2008-09-19 12:55:04 EDT
'koji call echo foo' reproduces it too.
Here is a stacktrace from doing gdb --args python /usr/bin/koji call echo foo:

Thread 1 (Thread 0xb7fe46c0 (LWP 29117)):
#0  0x00110416 in __kernel_vsyscall ()
#1  0x0087c740 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0x0087e108 in abort () at abort.c:88
#3  0x02490bb0 in Py_FatalError () from /usr/lib/libpython2.5.so.1.0
#4  0x02475fb6 in PyEval_RestoreThread () from /usr/lib/libpython2.5.so.1.0
#5  0x00282860 in ?? () from /usr/lib/python2.5/site-packages/OpenSSL/SSL.so
#6  0x076894ad in ?? () from /lib/libcrypto.so.7
#7  0x07689dea in X509_verify_cert () from /lib/libcrypto.so.7
#8  0x0781ddbe in ssl_verify_cert_chain () from /lib/libssl.so.7
#9  0x07804a32 in ssl3_get_server_certificate () from /lib/libssl.so.7
#10 0x07806a85 in ssl3_connect () from /lib/libssl.so.7
#11 0x07809d56 in ssl3_write_bytes () from /lib/libssl.so.7
#12 0x07806f4a in ssl3_write () from /lib/libssl.so.7
#13 0x07819aa9 in SSL_write () from /lib/libssl.so.7
#14 0x00281a61 in SSLv23_method () from /usr/lib/python2.5/site-packages/OpenSSL/SSL.so
#15 0x0242455d in PyCFunction_Call () from /usr/lib/libpython2.5.so.1.0
#16 0x02474734 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#17 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#18 0x02474381 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#19 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#20 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#21 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#22 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#23 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#24 0x02474381 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#25 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#26 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#27 0x0240feee in ?? () from /usr/lib/libpython2.5.so.1.0
#28 0x023f00a7 in PyObject_Call () from /usr/lib/libpython2.5.so.1.0
#29 0x023f74f8 in ?? () from /usr/lib/libpython2.5.so.1.0
#30 0x023f00a7 in PyObject_Call () from /usr/lib/libpython2.5.so.1.0
#31 0x023f8116 in ?? () from /usr/lib/libpython2.5.so.1.0
#32 0x023f00a7 in PyObject_Call () from /usr/lib/libpython2.5.so.1.0
#33 0x02472db7 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#34 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#35 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#36 0x02474381 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#37 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#38 0x02474381 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#39 0x02475567 in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#40 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#41 0x0240feee in ?? () from /usr/lib/libpython2.5.so.1.0
#42 0x02439fad in ?? () from /usr/lib/libpython2.5.so.1.0
#43 0x02400520 in ?? () from /usr/lib/libpython2.5.so.1.0
#44 0x023f00a7 in PyObject_Call () from /usr/lib/libpython2.5.so.1.0
#45 0x024703fe in PyEval_EvalFrameEx () from /usr/lib/libpython2.5.so.1.0
#46 0x02475ca5 in PyEval_EvalCodeEx () from /usr/lib/libpython2.5.so.1.0
#47 0x02475f03 in PyEval_EvalCode () from /usr/lib/libpython2.5.so.1.0
#48 0x02491562 in ?? () from /usr/lib/libpython2.5.so.1.0
#49 0x02491622 in PyRun_FileExFlags () from /usr/lib/libpython2.5.so.1.0
#50 0x02492dac in PyRun_SimpleFileExFlags () from /usr/lib/libpython2.5.so.1.0
#51 0x0249351a in PyRun_AnyFileExFlags () from /usr/lib/libpython2.5.so.1.0
#52 0x0249d60f in Py_Main () from /usr/lib/libpython2.5.so.1.0
#53 0x080485d2 in main ()


Looks like something between python and openssl.

This is with

[mclasen@localhost devel]$ rpm -q python openssl
python-2.5.1-30.fc10.i386
openssl-0.9.8g-11.fc10.i686
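
Reading the trace from the bottom up: a write in SSL.so enters OpenSSL, the handshake verifies the server certificate, and the verification step calls back into SSL.so, which then aborts in PyEval_RestoreThread. The following is a rough Python model of that save/restore pattern, not pyOpenSSL code. Every name in it is hypothetical, and it assumes the failure comes from a call path that releases the interpreter without recording the thread state where the callback looks for it, which this trace alone does not prove.

```python
class FatalError(RuntimeError):
    """Models Py_FatalError aborting the process."""


class FakeContext:
    """Stand-in for the C-level context; tstate mirrors ctx->tstate."""

    def __init__(self):
        self.tstate = None


def restore_thread(tstate):
    # Models PyEval_RestoreThread, which aborts on a NULL thread state.
    if tstate is None:
        raise FatalError("PyEval_RestoreThread: NULL tstate")
    return tstate


def verify_callback(ctx):
    # Runs inside the OpenSSL call and must re-acquire the interpreter
    # using the state that was saved before the call.
    tstate = restore_thread(ctx.tstate)
    # ... Python-level certificate checks would run here ...
    ctx.tstate = tstate  # save it again before returning into OpenSSL
    return True


def ssl_write(ctx, save_state):
    # save_state=False models a call path that releases the interpreter
    # without storing the thread state on the context.
    ctx.tstate = "thread-state" if save_state else None
    try:
        return verify_callback(ctx)  # the handshake triggers verification
    finally:
        ctx.tstate = None


print(ssl_write(FakeContext(), save_state=True))
try:
    ssl_write(FakeContext(), save_state=False)
except FatalError as exc:
    print("Fatal Python error:", exc)
```

In the model, the second call fails even though only one thread is involved, which is consistent with a single-threaded client hitting the abort.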
Comment 5 Matthias Clasen 2008-09-19 12:57:05 EDT
And 

pyOpenSSL-0.7-1.fc10.i386
Comment 6 Dennis Gilmore 2008-09-19 13:34:38 EDT
I have pyOpenSSL-0.6-4.fc9.x86_64 on my rawhide box. Maybe koji needs a rebuild against the new pyOpenSSL.
Comment 7 Toshio Ernie Kuratomi 2008-09-19 16:44:59 EDT
Dennis tells me that koji client isn't generally threaded, but this pyopenssl bug looks like it may have some bearing::

https://bugs.launchpad.net/pyopenssl/+bug/262381

Apparently plague is heavily threaded so testing a build with this pyopenssl is likely to show a whole lot of breakage.  Dennis also tells me that some of the SSL check code in koji client was ported from plague so maybe it is multithreaded at that point?
Comment 8 Toshio Ernie Kuratomi 2008-09-19 17:08:47 EDT
Looking a tad bit further, it looks like the error message is being generated by a local patch that we're carrying::

  pyOpenSSL-0.7-threadsafe.patch

This looks like a modification by Paul of the original patch by Dan Williams.

Old patch:

http://cvs.fedoraproject.org/viewvc/devel/pyOpenSSL/pyOpenSSL-threadsafe.patch?revision=1.2&pathrev=pyOpenSSL-0_6-4_fc9

Replaced by these two new patches:

http://cvs.fedoraproject.org/viewvc/devel/pyOpenSSL/pyOpenSSL-threadsafe.patch?hideattic=0&revision=1.3&view=markup

http://cvs.fedoraproject.org/viewvc/devel/pyOpenSSL/pyOpenSSL-0.7-threadsafe.patch?hideattic=0&revision=1.1&view=markup

Paul, Dan, any idea what's going on here?
Comment 9 Mike McLean 2008-09-19 17:25:09 EDT
(In reply to comment #6)
> maybe koji needs a rebuild against the new pyOpenSSL.

koji is all python, so it shouldn't need a rebuild. If there was an API change, then perhaps we need to make a code adjustment, but even so, the lib should not core dump over something like that.

Maybe a python wrapper module needs to be rebuilt against the base library.
Comment 10 Paul F. Johnson 2008-09-19 17:26:03 EDT
The pyOpenSSL-threadsafe patch is the original version, but with the material referenced in the 0.7-threadsafe patch removed (the 0.7-threadsafe patch differed from the original mostly in line numbers, but failed to build here).

More or less if you put the two patches together, you'll get the original one back.
Comment 11 Mike McLean 2008-09-19 17:37:08 EDT
(In reply to comment #7)
> Dennis tells me that koji client isn't generally threaded, but this pyopenssl
> bug looks like it may have some bearing::

koji is not threaded. However, the hub code attempts to be thread-safe just in case it is run inside a threaded httpd server.

There is some threaded test code in ssl/XMLRPCServerProxy.py, but that should not be executed under normal circumstances.

SocketServer.ThreadingTCPServer is subclassed in ssl/SSLCommon.py, but we don't seem to ever instantiate that class. This lib was pulled from plague I believe.

So in summary, I don't believe there is any inherent threading in the koji cli code.

However, the python interpreter itself (or external libraries) may be using threads underneath us.
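
A quick, partial check for Python-level threads is `threading.enumerate()`. This is only a sketch with a hypothetical helper name; it lists threads the interpreter knows about, and a thread created inside a C library that never registers with Python would not appear.

```python
import threading


def python_thread_names():
    """Return the names of all threads the Python interpreter knows about."""
    return sorted(t.name for t in threading.enumerate())


# Dropped into a script just before the failing call, this shows whether
# anything besides the main thread is alive at the Python level.
print(python_thread_names())
```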
Comment 12 Mike McLean 2008-09-19 17:41:34 EDT
Matthias, out of curiosity...

Does 'koji --noauth call echo foo' sidestep the bug?

How about 'koji --noauth --server http://koji.fedoraproject.org/kojihub call echo foo' (i.e. an explicit non-https server URL)?
Comment 13 Mike McLean 2008-09-19 18:00:33 EDT
Works with pyOpenSSL-0.6-4.fc9, fails with pyOpenSSL-0.7-1.fc10

[mike@ridley ~]$ rpm -q koji python openssl pyOpenSSL
koji-1.2.6-1.fc10.noarch
python-2.5.1-30.fc10.ppc
openssl-0.9.8g-11.fc10.ppc
pyOpenSSL-0.6-4.fc9.ppc
[mike@ridley ~]$ koji call echo foo
['foo']

[root@ridley ~]# rpm -Uvh pyOpenSSL-0.7-1.fc10.ppc.rpm 

[mike@ridley ~]$ rpm -q koji python openssl pyOpenSSL
koji-1.2.6-1.fc10.noarch
python-2.5.1-30.fc10.ppc
openssl-0.9.8g-11.fc10.ppc
pyOpenSSL-0.7-1.fc10.ppc
[mike@ridley ~]$ koji call echo foo
ERROR: ctx->tstate == NULL!
Fatal Python error: PyEval_RestoreThread: NULL tstate
Aborted

Also note this verifies the bug is on ppc as well as i386.
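
The two `rpm -q` listings above can also be compared mechanically to confirm that pyOpenSSL is the only package that differs between the working and failing setups. A small sketch, with hypothetical helper names and the package strings copied from the listings:

```python
def split_nvr(package):
    """Split 'name-version-release.arch' into (name, 'version-release.arch')."""
    name, version, release = package.rsplit("-", 2)
    return name, f"{version}-{release}"


def changed_packages(before, after):
    """Return {name: (old, new)} for packages whose version string differs."""
    old = dict(split_nvr(p) for p in before)
    new = dict(split_nvr(p) for p in after)
    return {n: (old[n], new[n]) for n in old if n in new and old[n] != new[n]}


working = ["koji-1.2.6-1.fc10.noarch", "python-2.5.1-30.fc10.ppc",
           "openssl-0.9.8g-11.fc10.ppc", "pyOpenSSL-0.6-4.fc9.ppc"]
failing = ["koji-1.2.6-1.fc10.noarch", "python-2.5.1-30.fc10.ppc",
           "openssl-0.9.8g-11.fc10.ppc", "pyOpenSSL-0.7-1.fc10.ppc"]

print(changed_packages(working, failing))
# → {'pyOpenSSL': ('0.6-4.fc9.ppc', '0.7-1.fc10.ppc')}
```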
Comment 14 Mike McLean 2008-09-19 18:01:58 EDT
(oh and --noauth does sidestep the bug). Of course, you can't build with --noauth.
Comment 15 Michael Schwendt 2008-09-19 18:34:07 EDT
Created attachment 317264 [details]
correct threadsafe patch

This is the patch ported against pyOpenSSL 0.7. Works for me
with Plague and very brief testing.

BEFORE:

Using database engine mysql.
ERROR: ctx->tstate == NULL!
Fatal Python error: PyEval_RestoreThread: NULL tstate
Aborted
Comment 16 Michael Schwendt 2008-09-19 19:41:23 EDT
Also works with koji for me.
Comment 17 Dennis Gilmore 2008-09-22 15:12:33 EDT
I merged the two threadsafe patches into one and built pyOpenSSL-0.7-2.fc10. In my testing it works fine. We should make sure that it heads upstream.
Comment 18 Toshio Kuratomi 2008-09-22 18:07:00 EDT
Note: I found a bug upstream that partially matched our symptoms and reported our problem along with our updated patches.  Upstream says they've fixed this slightly differently in cvs.  But there are still some issues with threadsafety even after that/with our patches::

  https://bugs.launchpad.net/pyopenssl/+bug/262381

He also says he's considering getting an 0.8 release out that addresses at least part of the thread safety issue since people (like us :-) are running into threading issues in the wild.
Comment 19 Matthias Clasen 2008-09-22 18:25:33 EDT
Just wanted to say that koji works for me again, thanks.
Comment 20 Dennis Gilmore 2008-09-23 19:31:42 EDT

*** This bug has been marked as a duplicate of bug 462671 ***
