Bug 1566522

Summary: python plugin long running close() callback causes crash if nbdkit exits at the same time
Product: [Fedora] Fedora Reporter: Richard W.M. Jones <rjones>
Component: nbdkitAssignee: Richard W.M. Jones <rjones>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 28CC: bthurber, rjones
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-17 18:07:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Richard W.M. Jones 2018-04-12 13:03:22 UTC
Description of problem:

This is similar to bug 1566516 but I believe the root cause is
different.

To reproduce it, use this trivial plugin (same as the other bug).

-------------------------------
#!/usr/sbin/nbdkit python3
import time
def open(readonly):
    h = {}
    return h
def close(h):
    # take some time
    time.sleep(10)
def get_size(h):
    return 1024
def pread(h, count, offset):
    buf = bytearray(count)
    return buf
-------------------------------

$ ./test.py
$ telnet localhost 10809
^]
$ killall nbdkit

The telnet command opens a connection and immediately closes
it.  The close() method takes 10 seconds to run, so it is still
running when we kill nbdkit.

The result is the stack trace at end.

The unload prevention lock is supposed to prevent this, but does not.

Version-Release number of selected component (if applicable):

nbdkit-1.2.1-1.fc28.x86_64

Additional info:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  wait_for_thread_shutdown ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/pylifecycle.c:1681
1681	    PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
[Current thread is 1 (Thread 0x7f1b607ab400 (LWP 6007))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.27-6.fc28.x86_64 libgcc-8.0.1-0.19.fc28.x86_64 libidn2-2.0.4-3.fc28.x86_64 libselinux-2.7-11.fc28.x86_64 pcre2-10.31-3.fc28.x86_64
(gdb) t a a bt

Thread 2 (Thread 0x7f1b50b18700 (LWP 6023)):
#0  0x00007f1b5f729d27 in select () from /lib64/libc.so.6
#1  0x00007f1b5d9f43ad in pysleep (secs=<optimized out>)
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Modules/timemodule.c:1417
#2  time_sleep ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Modules/timemodule.c:235
#3  0x00007f1b5d987f51 in _PyCFunction_FastCallDict ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Objects/methodobject.c:209
#4  0x00007f1b5d999f8a in _PyCFunction_FastCallKeywords ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Objects/methodobject.c:275
#5  0x00007f1b5d9fd27d in call_function ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/ceval.c:4824
#6  0x00007f1b5da4208a in _PyEval_EvalFrameDefault ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/ceval.c:3322
#7  0x00007f1b5da28ca8 in _PyEval_EvalCodeWithName (qualname=0x0, 
    name=<optimized out>, closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, 
    kwstep=2, kwcount=<optimized out>, kwargs=<optimized out>, 
    kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, 
    locals=0x0, globals=<optimized out>, _co=<optimized out>)
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/ceval.c:4174
#8  PyEval_EvalCodeEx ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/ceval.c:4174
#9  0x00007f1b5da29c33 in function_call ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Objects/funcobject.c:604
#10 0x00007f1b5d98caa3 in PyObject_Call ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Objects/abstract.c:2261
#11 0x00007f1b5d9bae58 in PyEval_CallObjectWithKeywords ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/ceval.c:4744
#12 0x00007f1b5d9baeb7 in PyObject_CallObject (o=<optimized out>, 
    a=a@entry=({},))
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Objects/abstract.c:2187
#13 0x00007f1b5dde7b77 in py_close (handle=0x7f1b606e1750) at python.c:340
#14 0x000000000040a498 in plugin_close (b=<optimized out>, conn=0x7f1b4c000b80)
    at plugins.c:283
#15 0x0000000000406aa2 in free_connection (conn=0x7f1b4c000b80)
    at connections.c:378
#16 _handle_single_connection (sockout=<optimized out>, sockin=<optimized out>)
    at connections.c:316
#17 handle_single_connection (sockin=<optimized out>, sockout=<optimized out>)
    at connections.c:326
#18 0x000000000040a939 in start_thread (datav=0x149a610) at sockets.c:262
#19 0x00007f1b5f9fe564 in start_thread () from /lib64/libpthread.so.0
#20 0x00007f1b5f73233f in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f1b607ab400 (LWP 6007)):
#0  wait_for_thread_shutdown ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/pylifecycle.c:1681
#1  0x00007f1b5daaa427 in Py_FinalizeEx ()
    at /usr/src/debug/python3-3.6.5-1.fc28.x86_64/Python/pylifecycle.c:743
#2  0x0000000000409a70 in plugin_free (b=0x14599f0) at plugins.c:75
#3  0x0000000000404e5b in main () at main.c:600
#4  0x00007f1b5f65b1bb in __libc_start_main () from /lib64/libc.so.6
#5  0x000000000040529a in _start () at connections.c:984

Comment 1 Richard W.M. Jones 2018-04-17 18:07:26 UTC
I'm going to mark the two bugs as duplicates because although
they do have a slightly different root cause, they can be fixed
by a single patch series.

*** This bug has been marked as a duplicate of bug 1566516 ***