Bug 1816107

Summary: [abrt] python3: _PyObject_GC_UNTRACK_impl(): python3.8 killed by SIGSEGV
Product: [Fedora] Fedora Reporter: Ovidio <ovidiopirvu>
Component: python-cupsAssignee: Zdenek Dohnal <zdohnal>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: cstratak, dmalcolm, gruszczynskit, gunite, m.cyprian, mhroncok, mikko.tiihonen, mplch, nicolas.mailhot, pviktori, ramprasadg, rkuska, shcherbina.iryna, slavek.kabrda, smfaraujo, tomspur, torsava, twaugh, vstinner, yevgensm, zdohnal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
URL: https://retrace.fedoraproject.org/faf/reports/bthash/eab43af399483eadd1c6dc8dc61ca83d9e735996
Whiteboard: abrt_hash:c52aeb9c3dd51d9274f334a369b1ed3b9de81c93;VARIANT_ID=workstation;
Fixed In Version: python-cups-2.0.1-1.fc32 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-02 04:03:09 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
File: backtrace
none
File: core_backtrace
none
File: cpuinfo
none
File: dso_list
none
File: environ
none
File: exploitable
none
File: limits
none
File: maps
none
File: mountinfo
none
File: open_fds
none
File: proc_pid_status
none
lpinfo -l -v
none
/etc/cups/ppd contents none

Description Ovidio 2020-03-23 11:21:18 UTC
Version-Release number of selected component:
python3-3.8.2-2.fc32

Additional info:
reporter:       libreport-2.12.0
backtrace_rating: 4
cgroup:         0::/user.slice/user-1000.slice/user/dbus\x2d:1.2\x2dorg.fedoraproject.Config.Printing.slice/dbus-:1.2-org.fedoraproject.Config.Printing
cmdline:        /usr/bin/python3 /usr/share/system-config-printer/scp-dbus-service.py
crash_function: _PyObject_GC_UNTRACK_impl
executable:     /usr/bin/python3.8
journald_cursor: s=34761549a12340889ffd56c435bb24bb;i=9282;b=0a0cdc0459ab4cf583696385bf83e4d3;m=e3acac278;t=5a171ef936fde;x=c62ed5b5dae6e778
kernel:         5.6.0-0.rc5.git0.2.fc32.x86_64
rootdir:        /
runlevel:       N 5
type:           CCpp
uid:            1000
xsession_errors: 

Truncated backtrace:
Thread no. 1 (6 frames)
 #0 _PyObject_GC_UNTRACK_impl at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Include/internal/pycore_object.h:69
 #1 method_dealloc at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Objects/classobject.c:250
 #2 _Py_DECREF at /usr/include/python3.8/object.h:478
 #3 _Py_XDECREF at /usr/include/python3.8/object.h:541
 #4 destroy_TLS at cupsmodule.c:67
 #5 __nptl_deallocate_tsd at pthread_create.c:301

Comment 1 Ovidio 2020-03-23 11:21:25 UTC
Created attachment 1672622 [details]
File: backtrace

Comment 2 Ovidio 2020-03-23 11:21:28 UTC
Created attachment 1672623 [details]
File: core_backtrace

Comment 3 Ovidio 2020-03-23 11:21:32 UTC
Created attachment 1672624 [details]
File: cpuinfo

Comment 4 Ovidio 2020-03-23 11:21:35 UTC
Created attachment 1672625 [details]
File: dso_list

Comment 5 Ovidio 2020-03-23 11:21:38 UTC
Created attachment 1672626 [details]
File: environ

Comment 6 Ovidio 2020-03-23 11:21:40 UTC
Created attachment 1672627 [details]
File: exploitable

Comment 7 Ovidio 2020-03-23 11:21:43 UTC
Created attachment 1672628 [details]
File: limits

Comment 8 Ovidio 2020-03-23 11:21:46 UTC
Created attachment 1672629 [details]
File: maps

Comment 9 Ovidio 2020-03-23 11:21:49 UTC
Created attachment 1672630 [details]
File: mountinfo

Comment 10 Ovidio 2020-03-23 11:21:52 UTC
Created attachment 1672631 [details]
File: open_fds

Comment 11 Ovidio 2020-03-23 11:21:55 UTC
Created attachment 1672632 [details]
File: proc_pid_status

Comment 12 Zdenek Dohnal 2020-03-23 12:06:43 UTC
Hi Ovidio,

thank you for reporting the issue!

Would you mind providing the steps how to reproduce the bug?

Comment 13 Victor Stinner 2020-03-23 15:10:11 UTC
Two threads are using the Python C API at the same time. It should not happen. it seems like __nptl_deallocate_tsd() calls destroy_TLS() which uses the Python C API without holding the GIL.

Attached backtrace:

Thread 1 (Thread 0x7f8af7f31700 (LWP 44692)):
#0  0x00007f8b09115977 in _PyObject_GC_UNTRACK_impl (op=<method at remote 0x7f8afab1a2c0>, lineno=250, filename=0x7f8b091e7058 "/builddir/build/BUILD/Python-3.8.2/Objects/classobject.c") at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Include/internal/pycore_object.h:69
        gc = 0x7f8afab1a2b0
        prev = 0x4
        next = 0x7f8af8b82d30
#1  method_dealloc (im=0x7f8afab1a2c0) at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Objects/classobject.c:250
#2  0x00007f8afa3bc485 in _Py_DECREF (op=<optimized out>, lineno=541, filename=<synthetic pointer>) at /usr/include/python3.8/object.h:478
        tls = 0x7f8af0002960
#3  _Py_XDECREF (op=<optimized out>) at /usr/include/python3.8/object.h:541
        tls = 0x7f8af0002960
#4  destroy_TLS (value=0x7f8af0002960) at cupsmodule.c:67
        tls = 0x7f8af0002960
#5  0x00007f8b08fde251 in __nptl_deallocate_tsd () at pthread_create.c:301
        data = <optimized out>
        level2 = 0x7f8af7f31a10
        idx = <optimized out>
        round = 0
        cnt = 0
#6  __nptl_deallocate_tsd () at pthread_create.c:250
(...)

Thread 2 (Thread 0x7f8b08e7f740 (LWP 44688)):
#0  gc_reset_refs (refs=1, g=0x7f8af63ae3f0) at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Modules/gcmodule.c:102
        gc = 0x7f8af63ae3f0
        i = 2
        m = 0
        n = 0
        young = 0x7f8b0930a640 <_PyRuntime+416>
        old = 0x7f8b0930a640 <_PyRuntime+416>
        unreachable = {_gc_next = 94290423069312, _gc_prev = 140234908081280}
        finalizers = {_gc_next = 140234908096880, _gc_prev = 140235131333088}
        gc = <optimized out>
        t1 = 0
        stats = <optimized out>
#1  update_refs (containers=0x7f8b0930a640 <_PyRuntime+416>) at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Modules/gcmodule.c:351
        gc = 0x7f8af63ae3f0
        i = 2
        m = 0
        n = 0
        young = 0x7f8b0930a640 <_PyRuntime+416>
        old = 0x7f8b0930a640 <_PyRuntime+416>
        unreachable = {_gc_next = 94290423069312, _gc_prev = 140234908081280}
        finalizers = {_gc_next = 140234908096880, _gc_prev = 140235131333088}
        gc = <optimized out>
        t1 = 0
        stats = <optimized out>
#2  collect (generation=2, n_collected=0x0, n_uncollectable=0x0, nofail=1, state=0x7f8b0930a5f8 <_PyRuntime+344>) at /usr/src/debug/python3-3.8.2-2.fc32.x86_64/Modules/gcmodule.c:1053
        i = 2
        m = 0
        n = 0
        young = 0x7f8b0930a640 <_PyRuntime+416>
        old = 0x7f8b0930a640 <_PyRuntime+416>
        unreachable = {_gc_next = 94290423069312, _gc_prev = 140234908081280}
        finalizers = {_gc_next = 140234908096880, _gc_prev = 140235131333088}
        gc = <optimized out>
        t1 = 0
        stats = <optimized out>
(...)

Comment 14 Zdenek Dohnal 2020-04-07 05:13:40 UTC
*** Bug 1781523 has been marked as a duplicate of this bug. ***

Comment 15 Zdenek Dohnal 2020-04-07 05:13:59 UTC
*** Bug 1821324 has been marked as a duplicate of this bug. ***

Comment 16 Zdenek Dohnal 2020-04-07 05:14:02 UTC
*** Bug 1821428 has been marked as a duplicate of this bug. ***

Comment 17 Ovidio 2020-04-09 15:50:56 UTC
What is the info you need?!

Comment 18 Miro HronĨok 2020-04-09 16:10:37 UTC
(In reply to Ovidio from comment #17)
> What is the info you need?!

Would you mind providing the steps how to reproduce the bug?

Comment 19 Zdenek Dohnal 2020-04-20 15:11:14 UTC
*** Bug 1825956 has been marked as a duplicate of this bug. ***

Comment 20 Zdenek Dohnal 2020-04-20 15:11:21 UTC
*** Bug 1824001 has been marked as a duplicate of this bug. ***

Comment 21 Zdenek Dohnal 2020-04-20 15:11:39 UTC
*** Bug 1825412 has been marked as a duplicate of this bug. ***

Comment 22 Zdenek Dohnal 2020-04-20 15:11:57 UTC
*** Bug 1825612 has been marked as a duplicate of this bug. ***

Comment 23 Zdenek Dohnal 2020-04-21 05:32:30 UTC
Anyone affected by the issue:

Are you able to reproduce the issue again?

Do you use system-config-printer? 

Did you install a print queue (regardless of how) during the session when the issue happened?

Did you have any printer available on the network or USB interface during the session when the issue happened?

Would you mind sharing output of 'sudo lpinfo -l -v' when you are in the same situation as when the crash happened (f.e. printers turned on, the same network)?

Would you mind attaching your files from /etc/cups/ppd/ dir?

The crash seems to happen in scp-dbus-service (uses python-cups, where segfault originates) which is called by gnome-settings-daemon when it wants to find drivers or checking missing executables.

Maybe if I will be able to simulate how gnome-settings-daemon calls scp-dbus-service (finding drivers needs device id, which is in 'sudo lpinfo -l -v' output, missing executables functionality needs ppds of found devices), I'll get the crash too.

Comment 24 Zdenek Dohnal 2020-04-21 06:40:41 UTC
It seems thread local storage implementation in pycups will need to be rewritten to match PEP 539.

Comment 25 Mikko Tiihonen 2020-04-21 09:59:33 UTC
I used the gnome printer to add a new printer I just bought.
The HP printer was connected over network.
It found the printer twice. And I tried to edit both by clicking the find drivers option (expecting to get hplip on one of them).
I have tried to retrace the steps I took but have not been able to reproduce the crash.
I'll attach the lpinfo and ppd files.

Comment 26 Mikko Tiihonen 2020-04-21 10:00:27 UTC
Created attachment 1680475 [details]
lpinfo -l -v

Comment 27 Mikko Tiihonen 2020-04-21 10:01:20 UTC
Created attachment 1680477 [details]
/etc/cups/ppd contents

Comment 28 Zdenek Dohnal 2020-04-22 08:04:36 UTC
Mikko,

thank you for the info!

 (In reply to Mikko Tiihonen from comment #25)
> It found the printer twice. And I tried to edit both by clicking the find
> drivers option (expecting to get hplip on one of them).

FYI your printer supports driverless printing, so you don't need hplip for printing and install printer at all - you can just use temporary queue in print dialogs (the window which shows up when you want to print from evince/firefox/libreoffice - I prefer libreoffice since it uses CUPS instead of GTK, which can malfunction sometimes - see https://bugzilla.redhat.com/show_bug.cgi?id=1784449 ) or let cups-browsed work for you and create a permanent queue automatically.

You may need hplip for scanning but it will (hopefully) not be true for long - see https://github.com/alexpevzner/sane-airscan .

Comment 29 Zdenek Dohnal 2020-04-22 08:53:51 UTC
So, a progress:

I was finally able to reproduce when I ran:

dbus-send --session --print-reply --reply-timeout=120000 --type=method_call --dest='org.fedoraproject.Config.Printing' '/org/fedoraproject/Config/Printing' org.fedoraproject.Config.Printing.GetBestDrivers string:"MFG:HP;MDL:OfficeJet Pro 9020 series;CMD:PCL,JPEG,URF,PWG;" string:"HP OfficeJet Pro 9020 series" string:"dnssd://HP%20OfficeJet%20Pro%209020%20series%20%5BCC4534%5D._ipp._tcp.local/?uuid=30136a04-42a5-5698-f0ba-a0e75dd62164"

, but only for the first time - I'm not able to reproduce after that...

Comment 30 Zdenek Dohnal 2020-04-22 11:11:11 UTC
Hi Victor,

I'm not sure whether I used the correct Python API for solving the issue - I chose the first approach from https://docs.python.org/3/c-api/init.html#thread-state-and-the-global-interpreter-lock because pycups is python extension for CUPS and it is called from scp-dbus-service.py, which is written in python, so I suppose the first approach can be used.

Would you mind correcting me if I'm wrong?

Would you mind reviewing the patch:

diff --git a/cupsmodule.c b/cupsmodule.c
index 26a17f9..6cfd94b 100644
--- a/cupsmodule.c
+++ b/cupsmodule.c
@@ -64,10 +64,14 @@ static void
 destroy_TLS (void *value)
 {
   struct TLS *tls = (struct TLS *) value;
+  Py_BEGIN_ALLOW_THREADS
   Py_XDECREF (tls->cups_password_callback);
+  Py_END_ALLOW_THREADS
 
 #if HAVE_CUPS_1_4
+  Py_BEGIN_ALLOW_THREADS
   Py_XDECREF (tls->cups_password_callback_context);
+  Py_END_ALLOW_THREADS
 #endif /* HAVE_CUPS_1_4 */
 
   free (value);

Comment 31 Victor Stinner 2020-04-22 12:26:31 UTC
+  Py_BEGIN_ALLOW_THREADS
   Py_XDECREF (tls->cups_password_callback);
+  Py_END_ALLOW_THREADS

is wrong: this code releases the GIL to call Py_XDECREF() which is part of the C API. You must hold the GIL to use the C API.

You should get the GIL as early as possible, maybe in this code:

#4  destroy_TLS (value=0x7f8af0002960) at cupsmodule.c:67
        tls = 0x7f8af0002960

You can use:

    PyGILState_STATE gilstate;

    gilstate = PyGILState_Ensure();
    ... use the C API here ...
    PyGILState_Release(gilstate);

Doc: https://docs.python.org/dev/c-api/init.html?highlight=pygilstate_ensure#c.PyGILState_Ensure

Comment 32 Zdenek Dohnal 2020-04-22 12:38:58 UTC
Damn, so it was other way around, thank you for the correction!

Comment 33 Fedora Update System 2020-04-24 04:54:29 UTC
FEDORA-2020-c6b43f347f has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-c6b43f347f

Comment 34 Fedora Update System 2020-04-25 12:20:56 UTC
FEDORA-2020-c6b43f347f has been pushed to the Fedora 32 testing repository.
In short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-c6b43f347f`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-c6b43f347f

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 35 Fedora Update System 2020-05-02 04:03:09 UTC
FEDORA-2020-c6b43f347f has been pushed to the Fedora 32 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 36 Zdenek Dohnal 2020-05-04 06:42:50 UTC
*** Bug 1828389 has been marked as a duplicate of this bug. ***

Comment 37 Zdenek Dohnal 2020-05-04 06:57:27 UTC
*** Bug 1830234 has been marked as a duplicate of this bug. ***