Bug 2279088 - Segmentation fault in PyType_Ready called from Shiboken::init() since python3.12-3.12.3-2
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: python3.12
Version: 39
Hardware: aarch64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Python Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-05-04 19:23 UTC by Teoh Han Hui
Modified: 2024-06-19 12:13 UTC (History)
CC List: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-06-19 12:13:39 UTC
Type: ---
Embargoed:


Attachments
coredumpctl debug output (36.31 KB, text/plain)
2024-05-04 19:25 UTC, Teoh Han Hui
dnf5 history output (6.21 KB, text/plain)
2024-05-04 19:27 UTC, Teoh Han Hui

Description Teoh Han Hui 2024-05-04 19:23:31 UTC
Since the upgrade to python3.12-3.12.3-2, I'm no longer able to launch Syncplay at all.

Things that I have tried:

* Make sure Syncplay dependencies are upgraded:
```
pip install --upgrade -r requirements.txt
pip install --upgrade -r requirements_gui.txt
```

* Rebuild Syncplay:
```
make SINGLE_USER=1 install
```

None of the above has helped.

Reproducible: Always

Steps to Reproduce:
1.
```
curl -fsSLo syncplay-1.7.3.tar.gz https://github.com/Syncplay/syncplay/archive/refs/tags/v1.7.3.tar.gz
tar -xzf syncplay-1.7.3.tar.gz
cd syncplay-1.7.3
mkdir ~/.venv
python3 -m venv ~/.venv/syncplay
. ~/.venv/syncplay/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements_gui.txt
make SINGLE_USER=1 install
```
Actual Results:  
Segmentation fault (core dumped)

Expected Results:  
Syncplay launches.

Comment 1 Teoh Han Hui 2024-05-04 19:25:43 UTC
Created attachment 2031305 [details]
coredumpctl debug output

Comment 2 Teoh Han Hui 2024-05-04 19:27:26 UTC
Created attachment 2031306 [details]
dnf5 history output

Comment 3 Teoh Han Hui 2024-05-04 19:40:28 UTC
The steps to reproduce are missing step 2: run `syncplay`.

Comment 4 Miro Hrončok 2024-05-06 10:52:32 UTC
For the record, I cannot reproduce on x86_64.

Comment 5 Alex Pyrgiotis 2024-05-08 17:07:13 UTC
We (Dangerzone devs) are bitten by this bug as well. Our nightly builds for Fedora 39-40 started failing on May 5th (a few hours after the -2 patch was pushed to stable). You can reproduce it on x86_64 by doing the following:

1. Checkout the Dangerzone source: https://github.com/freedomofpress/dangerzone
2. Follow our build instructions on a Fedora 39/40 machine: https://github.com/freedomofpress/dangerzone/blob/main/BUILD.md#fedora
   * You can omit the image building part, since it takes a bit of time, and it's not necessary for showcasing this issue.
3. Run `PYTHONFAULTHANDLER=1 poetry run ./dev_scripts/dangerzone`. It should fail with:

   ```
   Fatal Python error: Segmentation fault
   [... stack trace ...]
   ```
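For readers unfamiliar with `PYTHONFAULTHANDLER=1` in step 3: it enables Python's standard `faulthandler` module at startup, so a fatal signal prints a Python-level traceback before the process dies. Below is a minimal sketch, assuming a POSIX system. The helper name `run_with_faulthandler` is hypothetical, and the child sends itself SIGSEGV as a stand-in for the real crash:

```python
import os
import subprocess
import sys


def run_with_faulthandler(code):
    """Run `code` in a child interpreter with PYTHONFAULTHANDLER=1 set.

    Returns (returncode, stderr) so the caller can inspect the crash report.
    """
    env = dict(os.environ, PYTHONFAULTHANDLER="1")
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


# Stand-in for the real crash (assumption): the child sends SIGSEGV to itself.
CRASH = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

if __name__ == "__main__":
    status, report = run_with_faulthandler(CRASH)
    print("return code:", status)
    print(report)
```

The captured report should start with the same "Fatal Python error: Segmentation fault" line shown above, followed by the Python stack of the crashing thread.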

One of our Python dependencies is PySide6, which I see is also used by Syncplay.

Comment 6 Miro Hrončok 2024-05-08 19:01:34 UTC
See also https://bugreports.qt.io/projects/PYSIDE/issues/PYSIDE-2747

Comment 7 Alex Pyrgiotis 2024-05-09 17:14:57 UTC
Hm, that's interesting. I understand why Friedemann (the PySide dev) cannot reproduce it on Ubuntu: it's actually reproducible only with python3-3.12.3-2 on Fedora. However, it's weird that you cannot reproduce it in your own environment.

So, here's the simplest way I can think of to reproduce this issue. I have created the following Containerfile:

```
FROM fedora:39

RUN dnf install -y python3-pip
RUN pip install shiboken6==6.7.0
RUN python3 -c 'import shiboken6'
```

You can build this Containerfile with `podman build --pull -f <file>`. On my machine, it ultimately fails with:

```
[...]
STEP 5/5: RUN python3 -c 'import shiboken6'
container exited on segmentation fault
Error: building at STEP "RUN python3 -c 'import shiboken6'": while running runtime: exit status 1
```

Can you reproduce this as well?
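Outside a container, the same import can be checked from a host-side script without taking down the calling process. This is a minimal sketch, assuming a POSIX system where `subprocess` reports death by signal as a negative return code. The helper name `describe_exit` is hypothetical:

```python
import signal
import subprocess
import sys


def describe_exit(code):
    """Run `code` in a child interpreter and describe how it ended."""
    proc = subprocess.run([sys.executable, "-c", code], capture_output=True)
    if proc.returncode < 0:
        # On POSIX, a negative return code means "killed by signal N".
        return "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        return "exit status %d" % proc.returncode
    return "ok"


if __name__ == "__main__":
    # The module name comes from the Containerfile above.
    print(describe_exit("import shiboken6"))
```

If shiboken6 is not installed at all, the child exits with an ordinary non-zero status (ImportError) rather than a signal, which keeps the two cases easy to tell apart.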

Comment 8 Teoh Han Hui 2024-05-09 17:52:10 UTC
Can the update be reverted for now in light of this regression?

Comment 9 Miro Hrončok 2024-05-10 09:03:37 UTC
I can indeed reproduce this in a container. But not on my machine :/

> Can the update be reverted for now in light of this regression?

For the sake of one pip-installed package? That sounds like an overreaction to me. You can always fetch older Python from Koji: https://koji.fedoraproject.org/koji/buildinfo?buildID=2411847

Comment 10 Miro Hrončok 2024-05-10 10:32:24 UTC
[root@04d492cba088 /]# python3.12d -m venv venv
[root@04d492cba088 /]# . venv/bin/activate
(venv) [root@04d492cba088 /]# pip install shiboken6==6.7.0
(venv) [root@04d492cba088 /]# python -c 'import shiboken6'
python: /builddir/build/BUILD/Python-3.12.3/Include/internal/pycore_object.h:83: _Py_ClearImmortal: Assertion `op->ob_refcnt == _Py_IMMORTAL_REFCNT' failed.
Aborted (core dumped)


In gdb:

python: /builddir/build/BUILD/Python-3.12.3/Include/internal/pycore_object.h:83: _Py_ClearImmortal: Assertion `op->ob_refcnt == _Py_IMMORTAL_REFCNT' failed.

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	      return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007f6790aae8a3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007f6790a5c8ee in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f6790a448ff in __GI_abort () at abort.c:79
#4  0x00007f6790a4481b in __assert_fail_base (fmt=0x7f6790bc3af8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f6791218bf8 "op->ob_refcnt == _Py_IMMORTAL_REFCNT", 
    file=file@entry=0x7f679120e338 "/builddir/build/BUILD/Python-3.12.3/Include/internal/pycore_object.h", line=line@entry=83, function=function@entry=0x7f67911333d0 <__PRETTY_FUNCTION__.66.lto_priv.3> "_Py_ClearImmortal")
    at assert.c:92
#5  0x00007f6790a54c57 in __assert_fail (assertion=0x7f6791218bf8 "op->ob_refcnt == _Py_IMMORTAL_REFCNT", file=0x7f679120e338 "/builddir/build/BUILD/Python-3.12.3/Include/internal/pycore_object.h", line=83, 
    function=0x7f67911333d0 <__PRETTY_FUNCTION__.66.lto_priv.3> "_Py_ClearImmortal") at assert.c:101
#6  0x00007f6790e93a52 in _Py_ClearImmortal (op=(<type at remote 0x7f679135bb80>,)) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Include/internal/pycore_object.h:83
#7  0x00007f6790e94f9f in clear_tp_bases (self=0x7f679135b9e0 <PyType_Type>) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Objects/typeobject.c:322
#8  0x00007f6790ea0419 in clear_static_type_objects (interp=0x7f6791443448 <_PyRuntime+92392>, type=0x7f679135b9e0 <PyType_Type>) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Objects/typeobject.c:5014
#9  0x00007f6790ea04da in _PyStaticType_Dealloc (interp=0x7f6791443448 <_PyRuntime+92392>, type=0x7f679135b9e0 <PyType_Type>) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Objects/typeobject.c:5027
#10 0x00007f6790e795ae in _PyTypes_FiniTypes (interp=0x7f6791443448 <_PyRuntime+92392>) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Objects/object.c:2185
#11 0x00007f67910112e5 in finalize_interp_types (interp=0x7f6791443448 <_PyRuntime+92392>) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Python/pylifecycle.c:1712
#12 0x00007f67910113e9 in finalize_interp_clear (tstate=0x7f67914a0d78 <_PyRuntime+475672>) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Python/pylifecycle.c:1761
#13 0x00007f6791011571 in Py_FinalizeEx () at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Python/pylifecycle.c:1971
#14 0x00007f6791056d40 in Py_RunMain () at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Modules/main.c:711
#15 0x00007f6791056e11 in pymain_main (args=0x7ffccc204380) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Modules/main.c:739
#16 0x00007f6791056ed9 in Py_BytesMain (argc=3, argv=0x7ffccc2044f8) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Modules/main.c:763
#17 0x00005584e27c717d in main (argc=3, argv=0x7ffccc2044f8) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Programs/python.c:15

Comment 11 Alex Pyrgiotis 2024-05-13 08:24:02 UTC
That's interesting. Your traceback shows that Shiboken6 (?) dropped a reference to an immortal object (like `None`, `True`, etc.). Any clue why this happens with this specific Python build, and not with any others?

Comment 12 Miro Hrončok 2024-05-13 08:43:54 UTC
> Any clue why this happens with this specific Python build, and not with any others?

No idea yet.

Comment 13 Victor Stinner 2024-05-13 12:50:51 UTC
I can reproduce the bug on Fedora 40 with Python 3.12 built from source in debug mode ("pydebug").

The problem comes from PyType_Type.tp_bases.ob_refcnt. If I put a breakpoint at Python startup (at Py_RunMain() for example), I see that Pep384_Init() of .../site-packages/shiboken6/libshiboken6.abi3.so.6.7 messes up PyType_Type.tp_bases.ob_refcnt: it decrements its value.

For Linux x86-64, shiboken6-6.7.0-cp39-abi3-manylinux_2_28_x86_64.whl is available: "Uploaded Apr 9, 2024 CPython 3.9+ manylinux: glibc 2.28+". It was built with Python 3.9, which doesn't know about immortal objects.
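The `cp39-abi3` part of that filename is what indicates a stable-ABI wheel targeting CPython 3.9 and newer. As a rough illustration of how the filename breaks down under the wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag), here is a small sketch. The helper name `parse_wheel_filename` is hypothetical, and the parsing is deliberately simplistic:

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(suffix)].split("-")
    # Five fields, or six when the optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of fields: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    print(parse_wheel_filename(
        "shiboken6-6.7.0-cp39-abi3-manylinux_2_28_x86_64.whl"
    ))
```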

I'm now trying to find the source code of Pep384_Init().

Comment 14 Victor Stinner 2024-05-13 12:54:02 UTC
> I'm now trying to find the source code of Pep384_Init().

I suppose that the issue comes from: https://code.qt.io/cgit/pyside/pyside-setup.git/tree/sources/shiboken6/libshiboken/pep384impl.cpp

check_PyTypeObject_valid():

    auto *probe_tp_bases = PyObject_GetAttr(obtype, Shiboken::PyMagicName::bases());
    ...
    Py_DECREF(probe_tp_bases);
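
At the Python level, the probed attribute corresponds to `type.__bases__`, assuming `obtype` refers to `PyType_Type` as comment 13 suggests. A small diagnostic sketch (the helper name `bases_refcount` is hypothetical) reads the reference count that `sys.getrefcount()` reports for it, so the value can be compared before and after importing a suspect extension module. The exact numbers vary between Python versions and builds, so no expected output is shown:

```python
import sys


def bases_refcount(tp=type):
    """Return the reference count reported for tp.__bases__."""
    return sys.getrefcount(tp.__bases__)


if __name__ == "__main__":
    before = bases_refcount()
    import json  # stand-in; replace with the extension module under suspicion
    after = bases_refcount()
    print("refcount of type.__bases__ before/after import:", before, after)
```

Since a crashing import never returns, this kind of check is only useful on builds where the import itself survives, such as the debug build in comment 10.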

Comment 15 Victor Stinner 2024-05-13 13:03:01 UTC
In Python 3.11, _PyStaticType_Dealloc() just calls:

    Py_CLEAR(type->tp_bases);

In Python 3.12, it calls more complicated code:

    static inline void
    clear_tp_bases(PyTypeObject *self)
    {
        if (self->tp_flags & _Py_TPFLAGS_STATIC_BUILTIN) {
            if (_Py_IsMainInterpreter(_PyInterpreterState_GET())) {
                if (self->tp_bases != NULL) {
                    if (PyTuple_GET_SIZE(self->tp_bases) == 0) {
                        Py_CLEAR(self->tp_bases);
                    }
                    else {
                        assert(_Py_IsImmortal(self->tp_bases));
                        _Py_ClearImmortal(self->tp_bases);
                    }
                }
            }
            return;
        }
        Py_CLEAR(self->tp_bases);
    }

with:

    /* _Py_ClearImmortal() should only be used during runtime finalization. */
    static inline void _Py_ClearImmortal(PyObject *op)
    {
        if (op) {
            assert(op->ob_refcnt == _Py_IMMORTAL_REFCNT);
            op->ob_refcnt = 1;
            Py_DECREF(op);
        }
    }
    #define _Py_ClearImmortal(op) \
        do { \
            _Py_ClearImmortal(_PyObject_CAST(op)); \
            op = NULL; \
        } while (0)

The difference is that Python 3.12 now checks "assert(op->ob_refcnt == _Py_IMMORTAL_REFCNT);", which becomes false once the shiboken6 module is imported.
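
To make the arithmetic concrete, here is a pure-Python model of the reference count, not CPython code. It assumes a 64-bit build where `_Py_IMMORTAL_REFCNT` is `UINT_MAX` and `_Py_IsImmortal()` tests the sign of the low 32 bits of `ob_refcnt`; those constants, and all names below, are assumptions made for illustration:

```python
# Assumption: value of _Py_IMMORTAL_REFCNT on a 64-bit CPython 3.12 build.
IMMORTAL_REFCNT = 0xFFFFFFFF


class FakeObject:
    """Stand-in for a PyObject header; only ob_refcnt is modelled."""

    def __init__(self, refcnt):
        self.ob_refcnt = refcnt


def is_immortal(op):
    # Model of _Py_IsImmortal(): the low 32 bits, read as a signed
    # integer, are negative (that is, bit 31 is set).
    return (op.ob_refcnt & 0xFFFFFFFF) >= 0x80000000


def decref_unaware(op):
    """Py_DECREF as compiled into code that predates immortal objects.

    Deallocation at zero is not modelled.
    """
    op.ob_refcnt -= 1


def decref_aware(op):
    """Py_DECREF as in Python 3.12: immortal objects are left untouched."""
    if not is_immortal(op):
        op.ob_refcnt -= 1


def clear_immortal_assertion_holds(op):
    """The condition asserted at the top of _Py_ClearImmortal()."""
    return op.ob_refcnt == IMMORTAL_REFCNT


if __name__ == "__main__":
    tp_bases = FakeObject(IMMORTAL_REFCNT)
    decref_unaware(tp_bases)
    print(hex(tp_bases.ob_refcnt), is_immortal(tp_bases),
          clear_immortal_assertion_holds(tp_bases))
```

Under these assumptions, a single unaware decrement leaves the object still looking immortal to `_Py_IsImmortal()` (so the first assert in clear_tp_bases() passes) while the equality check inside `_Py_ClearImmortal()` fails, which is consistent with the backtrace in comment 10.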

Comment 16 Victor Stinner 2024-05-13 13:45:05 UTC
Oh, in the container, I get a different bug:
---
(gdb) where
#0  PyType_Ready (type=0x0) at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Objects/typeobject.c:7536
#1  0x00007f3583872db7 in Shiboken::init() () from /usr/local/lib64/python3.12/site-packages/shiboken6/libshiboken6.abi3.so.6.7
#2  0x00007f3583874466 in Shiboken::Module::create(char const*, void*) ()
   from /usr/local/lib64/python3.12/site-packages/shiboken6/libshiboken6.abi3.so.6.7
#3  0x00007f35838a71f1 in PyInit_Shiboken () from /usr/local/lib64/python3.12/site-packages/shiboken6/Shiboken.abi3.so
#4  0x00007f35843b4806 in _PyImport_LoadDynamicModuleWithSpec (fp=<optimized out>, spec=0x7f358392c140)
    at /usr/src/debug/python3.12-3.12.3-2.fc39.x86_64/Python/importdl.c:169
---

Comment 17 Victor Stinner 2024-05-13 13:54:21 UTC
There are two issues:

* On a Python release build, there is a crash on PyType_Ready (type=0x0) at *startup*.
* On a Python debug build, an assertion fails *at exit*: I created https://github.com/python/cpython/issues/118997 to track this issue.
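
Since the two symptoms depend on the build type, it helps to state which kind of interpreter a report comes from. A minimal sketch, relying on the fact that `sys.gettotalrefcount()` exists only in debug builds of CPython (the helper name `is_debug_build` is hypothetical):

```python
import sys


def is_debug_build():
    """Return True when the running interpreter is a debug ("pydebug") build."""
    # sys.gettotalrefcount() is only defined in debug builds of CPython.
    return hasattr(sys, "gettotalrefcount")


if __name__ == "__main__":
    kind = "debug" if is_debug_build() else "release"
    print("Python %s (%s build)" % (sys.version.split()[0], kind))
```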

For the release build, I failed to reproduce the issue on Fedora 40 with:

* old: python3-3.12.2-2.fc40.x86_64
* latest: python3-3.12.3-2.fc40.x86_64

I can reproduce in a Fedora 39 container with:

* python3-3.12.3-2.fc39.x86_64

So far, the best reproducer is comment 7 using a container: https://bugzilla.redhat.com/show_bug.cgi?id=2279088#c7

Comment 18 Ben Beasley 2024-05-14 13:53:36 UTC
There is an “Ask Fedora” discussion thread that looks like it is probably about this bug: https://discussion.fedoraproject.org/t/after-updating-fedora-40-pyside6-in-python-it-does-not-work/116359

Comment 19 Alex Pyrgiotis 2024-05-20 10:48:28 UTC
Promising update: it seems that the nightly PySide6 wheels no longer fail: https://bugreports.qt.io/browse/PYSIDE-2747?focusedId=794897&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-794897

Comment 20 Ben Beasley 2024-05-25 13:47:01 UTC
(In reply to Alex Pyrgiotis from comment #19)
> Promising update, it seems that the nightly Pyside6 wheels no longer fail:
> https://bugreports.qt.io/browse/PYSIDE-2747?focusedId=794897&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-794897

The discussion thread now has a report that the 6.7.1 release of PySide6 does fix the problem: https://discussion.fedoraproject.org/t/after-updating-fedora-40-pyside6-in-python-it-does-not-work/116359/12

