Bug 482420 - vdsm sometimes segfaults due to m2crypto
Summary: vdsm sometimes segfaults due to m2crypto
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Hypervisor
Classification: Retired
Component: vdsm
Version: 5.4-2.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: beta
Assignee: Dan Kenigsberg
QA Contact: Dan Kenigsberg
URL: http://mantis.tlv.redhat.com/view.php...
Whiteboard:
Depends On:
Blocks: 1026556
Reported: 2009-01-27 18:31 UTC by Dan Kenigsberg
Modified: 2018-12-15 17:42 UTC

Fixed In Version: vdsm-4.4-28945 sp130, rhevh 5.5-2.2
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1026556
Environment:
Last Closed: 2009-12-02 15:13:46 UTC
Embargoed:



Description Red Hat Bugzilla 2009-01-27 18:31:14 UTC


---- Reported by dkenigsb 2008-09-23 13:45:03 EDT ----

on various occasions, within m2crypto calls, vdsm dies on signal 11 and dumps its core.

Additional Information: one such backtrace:

#0  0x000000334942f8e6 in SSL_set_session () from /lib64/libssl.so.6
#1  0x000000334941f3f3 in ssl3_send_alert () from /lib64/libssl.so.6
#2  0x0000003349420751 in ssl3_read_bytes () from /lib64/libssl.so.6
#3  0x0000003349420e84 in ssl3_get_message () from /lib64/libssl.so.6
#4  0x000000334942143f in ssl3_get_finished () from /lib64/libssl.so.6
#5  0x0000003349419852 in ssl3_accept () from /lib64/libssl.so.6
#6  0x00002b3e41eb6f27 in ssl_accept ()
   from /usr/lib64/python2.4/site-packages/M2Crypto/__m2crypto.so
#7  0x00002b3e41ec92b7 in threading_locking_callback ()
   from /usr/lib64/python2.4/site-packages/M2Crypto/__m2crypto.so
#8  0x00000033440949da in PyEval_EvalFrame ()
   from /usr/lib64/libpython2.4.so.1.0
#9  0x0000003344094486 in PyEval_EvalFrame ()
   from /usr/lib64/libpython2.4.so.1.0
#10 0x0000003344094486 in PyEval_EvalFrame ()
   from /usr/lib64/libpython2.4.so.1.0
#11 0x0000003344094486 in PyEval_EvalFrame ()
   from /usr/lib64/libpython2.4.so.1.0
#12 0x0000003344094486 in PyEval_EvalFrame ()
   from /usr/lib64/libpython2.4.so.1.0
#13 0x0000003344094486 in PyEval_EvalFrame ()
   from /usr/lib64/libpython2.4.so.1.0
#14 0x0000003344095905 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython2.4.so.1.0
#15 0x000000334404c263 in PyClassMethod_New ()
   from /usr/lib64/libpython2.4.so.1.0
#16 0x0000003344035f90 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0
#17 0x000000334403c01f in PyClass_IsSubclass ()
   from /usr/lib64/libpython2.4.so.1.0
#18 0x0000003344035f90 in PyObject_Call () from /usr/lib64/libpython2.4.so.1.0
#19 0x000000334408f55d in PyEval_CallObjectWithKeywords ()
   from /usr/lib64/libpython2.4.so.1.0
#20 0x00000033440bb33d in initthread () from /usr/lib64/libpython2.4.so.1.0
#21 0x0000003343c061b5 in start_thread () from /lib64/libpthread.so.0
#22 0x00000033430cd36d in clone () from /lib64/libc.so.6
#23 0x0000000000000000 in ?? ()




---- Additional Comments From dkenigsb 2008-10-12 16:32:48 EDT ----

in one case, some other (non-m2crypto) call segfaulted:
Core was generated by `/usr/bin/python /opt/vdsm/vdsmd.py /opt/vdsm/vdsm.conf'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000003cc3c61ac1 in ftell () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003cc3c61ac1 in ftell () from /lib64/libc.so.6
#1  0x0000003cc4c4786d in file_tell (f=0x2aaaab8ee4e0) at Objects/fileobject.c:610
#2  0x0000003cc4c947a7 in PyEval_EvalFrame (f=0x24c4fa0) at Python/ceval.c:3547
#3  0x0000003cc4c94486 in PyEval_EvalFrame (f=0x24cde70) at Python/ceval.c:3645
#4  0x0000003cc4c94486 in PyEval_EvalFrame (f=0x24c54d0) at Python/ceval.c:3645
#5  0x0000003cc4c94486 in PyEval_EvalFrame (f=0x24b8760) at Python/ceval.c:3645
#6  0x0000003cc4c94486 in PyEval_EvalFrame (f=0x24b28c0) at Python/ceval.c:3645
#7  0x0000003cc4c94486 in PyEval_EvalFrame (f=0x24cd890) at Python/ceval.c:3645
#8  0x0000003cc4c95905 in PyEval_EvalCodeEx (co=0x2ac650c0fa40, globals=<value optimized out>, locals=<value optimized out>, args=0x1, 
    argcount=4, kws=0x2454360, kwcount=0, defs=0x2ac650c11628, defcount=1, closure=0x0) at Python/ceval.c:2736
#9  0x0000003cc4c4c1f7 in function_call (func=0x2ac650c17500, arg=0x2ac651b1fd60, kw=0x24712b0) at Objects/funcobject.c:548
#10 0x0000003cc4c35fb0 in PyObject_Call (func=0x0, arg=0x1, kw=0x1) at Objects/abstract.c:1795
#11 0x0000003cc4c3c03f in instancemethod_call (func=<value optimized out>, arg=0x2ac651b1fd60, kw=0x24712b0) at Objects/classobject.c:2447
#12 0x0000003cc4c35fb0 in PyObject_Call (func=0x0, arg=0x1, kw=0x1) at Objects/abstract.c:1795
#13 0x0000003cc4c8f55d in PyEval_CallObjectWithKeywords (func=0x2ac651b52d70, arg=0x2ac651b60550, kw=0x24712b0) at Python/ceval.c:3430
#14 0x0000003cc4c8c492 in builtin_apply (self=<value optimized out>, args=<value optimized out>) at Python/bltinmodule.c:100
#15 0x0000003cc4c949da in PyEval_EvalFrame (f=0x249ba00) at Python/ceval.c:3563
#16 0x0000003cc4c95905 in PyEval_EvalCodeEx (co=0x2ac650c0f5e0, globals=<value optimized out>, locals=<value optimized out>, 
    args=0x24aedd8, argcount=2, kws=0x24aede8, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#17 0x0000003cc4c9405f in PyEval_EvalFrame (f=0x24aec00) at Python/ceval.c:3656
#18 0x0000003cc4c94486 in PyEval_EvalFrame (f=0x23960d0) at Python/ceval.c:3645
#19 0x0000003cc4c94486 in PyEval_EvalFrame (f=0x238d000) at Python/ceval.c:3645
#20 0x0000003cc4c95905 in PyEval_EvalCodeEx (co=0x2ac64eeb4960, globals=<value optimized out>, locals=<value optimized out>, 
    args=0x2ac651b32128, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#21 0x0000003cc4c4c259 in function_call (func=0x2ac64eec0140, arg=0x2ac651b32110, kw=0x0) at Objects/funcobject.c:548
#22 0x0000003cc4c35fb0 in PyObject_Call (func=0x0, arg=0x1, kw=0x1) at Objects/abstract.c:1795
#23 0x0000003cc4c3c03f in instancemethod_call (func=<value optimized out>, arg=0x2ac651b32110, kw=0x0) at Objects/classobject.c:2447
#24 0x0000003cc4c35fb0 in PyObject_Call (func=0x0, arg=0x1, kw=0x1) at Objects/abstract.c:1795
#25 0x0000003cc4c8f55d in PyEval_CallObjectWithKeywords (func=0x2ac651b28d70, arg=0x2ac64b629050, kw=0x0) at Python/ceval.c:3430
#26 0x0000003cc4cbb43d in t_bootstrap (boot_raw=0x243c9f0) at Modules/threadmodule.c:434
#27 0x0000003cc4806307 in start_thread () from /lib64/libpthread.so.0
#28 0x0000003cc3cd1ded in clone () from /lib64/libc.so.6



--- Bug imported by bugzilla 2009-01-27 13:34 EDT ---

This bug was previously known as _bug_ 5414 at http://mantis.tlv.redhat.com/show_bug.cgi?id=5414

Actual time not defined. Setting to 0.0



Comment 4 Omri Hochman 2009-03-31 14:55:38 UTC
reproduced with ver. m2crypto-0.16-7.el5ovirt.x86_64.rpm

Comment 5 Alan Pevec 2009-03-31 15:09:03 UTC
(In reply to comment #4)
> reproduced with ver. m2crypto-0.16-7.el5ovirt.x86_64.rpm  

please attach stack trace and/or steps to reproduce

Comment 6 Omri Hochman 2009-04-01 08:27:13 UTC
sorry, please ignore the comment above; it is related to bug 489296.

Comment 7 Dan Kenigsberg 2009-04-28 14:11:25 UTC
haven't seen or heard about this with new m2crypto versions. let's see if QE can reproduce this.

Comment 8 Dan Kenigsberg 2009-05-11 08:52:47 UTC
ohochman got a segfault and this backtrace with m2crypto-0.16-6.el5.4. I hope mitr (or anyone else) can make sense of it.

#0  remove_session_lock (ctx=0x1b9dee00, c=0x0, lck=1) at ssl_sess.c:512
#1  0x00002b380324304c in ssl3_send_alert (s=0x1baf1160, level=2, desc=0) at s3_pkt.c:1266
#2  0x00002b3803243f31 in ssl3_read_bytes (s=0x1baf1160, type=0, buf=0x0, len=0, peek=0) at s3_pkt.c:468
#3  0x00002b38032411d4 in ssl3_shutdown (s=0xa) at s3_lib.c:2358
#4  0x00002b380325a902 in ssl_free (a=0x1bb753a0) at bio_ssl.c:127
#5  0x00002b38034e3921 in BIO_free (a=0x1bb753a0) at bio_lib.c:136
#6  0x00002b380605d11b in _wrap_bio_free (self=<value optimized out>, args=<value optimized out>) at SWIG/_m2crypto_wrap.c:6567
#7  0x00002b38012cf97a in PyEval_EvalFrame (f=0x1baf7d50) at Python/ceval.c:3563
#8  0x00002b38012d08a5 in PyEval_EvalCodeEx (co=0x2b38062c7880, globals=<value optimized out>, locals=<value optimized out>, 
    args=0x2aaaaab454a8, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2736
#9  0x00002b3801287279 in function_call (func=0x2b38062d2e60, arg=0x2aaaaab45490, kw=0x0) at Objects/funcobject.c:548
#10 0x00002b3801270fb0 in PyObject_Call (func=0xa, arg=0xc, kw=0x2b380325e0c1) at Objects/abstract.c:1795
#11 0x00002b380127703f in instancemethod_call (func=<value optimized out>, arg=0x2aaaaab45490, kw=0x0) at Objects/classobject.c:2447
#12 0x00002b3801270fb0 in PyObject_Call (func=0xa, arg=0xc, kw=0x2b380325e0c1) at Objects/abstract.c:1795
#13 0x00002b38012ca4fd in PyEval_CallObjectWithKeywords (func=0x2aaaaab5e460, arg=0x2b3801041050, kw=0x0) at Python/ceval.c:3430
#14 0x00002b3801279f3d in instance_dealloc (inst=0x1bb7cc20) at Objects/classobject.c:646
#15 0x00002b38012a734f in tupledealloc (op=0x2b38072ae290) at Objects/tupleobject.c:169
#16 0x00002b38012969ab in dict_dealloc (mp=0x1bbc9590) at Objects/dictobject.c:728
#17 0x00002b38012b0038 in subtype_dealloc (self=0x1bb7aad0) at Objects/typeobject.c:691
#18 0x00002b380127747d in instancemethod_dealloc (im=0x2b3807295280) at Objects/classobject.c:2237
#19 0x00002b38012f644d in t_bootstrap (boot_raw=0x1ba54500) at Modules/threadmodule.c:454
#20 0x00002b3801572367 in start_thread () from /lib64/libpthread.so.0
#21 0x00002b3801ee50ad in clone () from /lib64/libc.so.6

Comment 9 Omri Hochman 2009-05-11 09:03:55 UTC
core available under \\orion.tlv.redhat.com\public\omri\core8321\core.8321.

Comment 10 Miloslav Trmač 2009-05-11 09:34:49 UTC
The backtrace seems to be inconsistent (perhaps the debug info is incorrect because of optimization):

* If the remove_session_lock() parameters are correct, the program can't crash in the function, the function does almost nothing if c is NULL.
* If the location (ssl_sess.c:512) is correct, the program could crash because it found an entry in a hash table, and it didn't find it in an immediately following search for the same entry.

Tomas, do you have any ideas?

Omri, can you put the backtrace somewhere we can access it, or tell us how to access orion.tlv, please?

Comment 11 Tomas Mraz 2009-05-11 10:22:50 UTC
I'd suppose this is caused by some mishandled locking and thus concurrent access to openssl internal data structures from multiple threads. I don't think this is a bug in openssl.

Comment 12 Miloslav Trmač 2009-05-11 10:31:28 UTC
Does vdsm call M2Crypto.threading.init()?

Comment 13 Dan Kenigsberg 2009-05-11 17:17:58 UTC
(In reply to comment #12)
> Does vdsm call M2Crypto.threading.init()?  

quelle horreur, no.
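
For illustration only, a minimal sketch of the initialisation that comment #12 asks about, assuming a Python process that starts its own worker threads. M2Crypto.threading.init() and M2Crypto.threading.cleanup() are M2Crypto's own calls; run_ssl_workers and its worker argument are hypothetical names and not vdsm code. The ImportError fallback is there only so the sketch still runs on a machine without M2Crypto.

```python
import threading

try:
    # M2Crypto's helper module that installs OpenSSL's locking callbacks.
    from M2Crypto import threading as m2_threading
except ImportError:
    # Assumption for this sketch only: allow it to run without M2Crypto.
    m2_threading = None


def run_ssl_workers(worker, count):
    """Run `count` threads of `worker` with M2Crypto thread support enabled.

    `worker` is a hypothetical callable taking the thread index; in a real
    server it would be the code that performs SSL accepts, reads and writes.
    """
    if m2_threading is not None:
        # Must happen once, before any thread touches an SSL object.
        m2_threading.init()
    try:
        threads = [
            threading.Thread(target=worker, args=(index,))
            for index in range(count)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    finally:
        if m2_threading is not None:
            # Release the locks only after every worker has finished.
            m2_threading.cleanup()
```

The point of the ordering is that init() runs once in the main thread before any worker is started, and cleanup() runs only after all workers have been joined.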

Comment 14 Dan Kenigsberg 2009-09-21 18:42:26 UTC
haven't seen this in ages. setting to VERIFY.

