Bug 2250649 - sip-build fails with Python3.13: RecursionError: maximum recursion depth exceeded in comparison
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: python3.13
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Python Maintainers
QA Contact:
URL:
Whiteboard:
Duplicates: 2247290
Depends On:
Blocks: PYTHON3.13 2248129 2247290
Reported: 2023-11-20 10:35 UTC by Karolina Surma
Modified: 2024-04-24 12:08 UTC (History)

Fixed In Version: sip6-6.8.3-1.fc41
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-24 12:08:04 UTC
Type: Bug
Embargoed:


Links
System: GitHub, ID: python/cpython issue 116647, Private: 0, Priority: None, Status: open, Last Updated: 2024-03-12 10:37:07 UTC
Summary: Python 3.13 regression: Recursive dataclasses fail to ==: RecursionError: maximum recursion depth exceeded

Description Karolina Surma 2023-11-20 10:35:56 UTC
When building python-qt5 and python-qt6, sip-build fails to build the packages with Python 3.13.0a1.

...
sip-build: An internal error occurred...
Traceback (most recent call last):
  File "/usr/bin/sip-build", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/tools/build.py", line 37, in main
    handle_exception(e)
  File "/usr/lib64/python3.13/site-packages/sipbuild/exceptions.py", line 81, in handle_exception
    raise e
  File "/usr/lib64/python3.13/site-packages/sipbuild/tools/build.py", line 34, in main
    project.build()
  File "/usr/lib64/python3.13/site-packages/sipbuild/project.py", line 245, in build
    self.builder.build()
  File "/usr/lib64/python3.13/site-packages/sipbuild/builder.py", line 48, in build
    self._generate_bindings()
  File "/usr/lib64/python3.13/site-packages/sipbuild/builder.py", line 280, in _generate_bindings
    buildable = bindings.generate()
                ^^^^^^^^^^^^^^^^^^^
  File "/builddir/build/BUILD/PyQt5-5.15.9/project.py", line 619, in generate
    buildable = super().generate()
                ^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/bindings.py", line 214, in generate
    output_pyi(spec, project, pyi_path)
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 53, in output_pyi
    _module(pf, spec, module)
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 132, in _module
    _class(pf, spec, module, klass, defined)
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 267, in _class
    _class(pf, spec, module, nested, defined, indent)
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 289, in _class
    _callable(pf, spec, module, member, klass.overloads,
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 485, in _callable
    _overload(pf, spec, module, overload, overloaded, first_overload,
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 575, in _overload
    signature = _python_signature(spec, module, py_signature, defined,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 599, in _python_signature
    as_str = _argument(spec, module, arg, defined, arg_nr=arg_nr)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 676, in _argument
    s += _type(spec, module, arg, defined, out=out)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/pyi.py", line 710, in _type
    return ArgumentFormatter(spec, arg).as_type_hint(module, out, defined)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/formatters/argument.py", line 327, in as_type_hint
    s += TypeHintManager(self.spec).as_type_hint(hint, out, context,
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.13/site-packages/sipbuild/generator/outputs/type_hints.py", line 107, in __new__
    manager = cls._spec_manager_map[spec]
              ~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/usr/lib64/python3.13/weakref.py", line 415, in __getitem__
    return self.data[ref(key)]
           ~~~~~~~~~^^^^^^^^^^
  File "<string>", line 4, in __eq__
  File "<string>", line 4, in __eq__
  File "<string>", line 4, in __eq__
  [Previous line repeated 495 more times]
RecursionError: maximum recursion depth exceeded in comparison



For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.13/fedora-rawhide-x86_64/06584613-python-qt5/

For all our attempts to build python-qt5 with Python 3.13, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.13/package/python-qt5/

Testing and mass rebuild of packages is happening in copr.
You can follow these instructions to test locally in mock if your package builds with Python 3.13:
https://copr.fedorainfracloud.org/coprs/g/python/python3.13/

Let us know here if you have any questions.

Python 3.13 is planned to be included in Fedora 41.
To make that update smoother, we're building Fedora packages with all pre-releases of Python 3.13.
A build failure prevents us from testing all dependent packages (transitive [Build]Requires),
so if this package is required a lot, it's important for us to get it fixed soon.

We'd appreciate help from the people who know this package best,
but if you don't want to work on this now, let us know so we can try to work around it on our side.

Comment 1 Scott Talbert 2024-01-02 02:24:44 UTC
I have looked into this a little bit.  I asked sip upstream and they don't have any ideas about it.  Given that this seems to be occurring in cpython code itself (and it looks like there were some changes in Python 3.13 relating to weak references), I would lean towards this being a regression in cpython itself.  It's not immediately clear how to produce a reproducer, though.  I started trying to bisect cpython, but it seems to be a bit challenging.
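For anyone trying to build a reproducer: the last frames of the traceback go from weakref.py straight into a generated __eq__. A minimal sketch of why a WeakKeyDictionary lookup ends up calling the key's own __eq__ is below. The `Spec` class and the `managers` dict are hypothetical stand-ins, not sip code, and the fixed hash is only there to keep the example small.

```python
import weakref


class Spec:
    """Hypothetical stand-in for the object used as a weak dictionary key."""

    eq_calls = 0  # counts how often __eq__ runs

    def __hash__(self):
        return 1

    def __eq__(self, other):
        Spec.eq_calls += 1
        return self is other


spec = Spec()
managers = weakref.WeakKeyDictionary()
managers[spec] = "manager"

Spec.eq_calls = 0
# Same shape as `cls._spec_manager_map[spec]` in the traceback: the lookup
# builds a fresh weak reference and compares it with the stored one, which
# in turn compares the referents with ==.
found = managers[spec]
print(found, Spec.eq_calls)
```

So even a lookup with the very same key object goes through the key's __eq__, which would explain how a comparison method that recurses on itself can blow up inside weakref.py.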

Comment 2 Aoife Moloney 2024-02-15 23:04:56 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 40 development cycle.
Changing version to 40.

Comment 3 Miro Hrončok 2024-03-11 08:24:33 UTC
Scott, have you made any progress? Should we try bisecting cpython ourselves?

Comment 4 Scott Talbert 2024-03-11 13:05:12 UTC
I have not made any progress, unfortunately.  I did start to do a bisect on cpython, but I didn't get very far.  If you or anyone else wants to give it a go, please feel free.  :-)

Comment 5 Miro Hrončok 2024-03-11 13:21:59 UTC
I don't exactly *want* to, but this is transitively blocking quite a lot of packages, so I probably will have to :(

Comment 6 Scott Talbert 2024-03-12 03:49:45 UTC
OK, I was able to bisect this.

18cfc1eea569f0ce72ad403840c0e6cc5f81e1c2 is the first bad commit
commit 18cfc1eea569f0ce72ad403840c0e6cc5f81e1c2
Author: Raymond Hettinger <rhettinger.github.com>
Date:   Tue May 30 11:35:30 2023 -0500

    Small speedup for dataclass __eq__ and __repr__ (#104904)
    
    Faster __repr__ with str.__add__ moved inside the f-string. For __eq__ compare field by field instead of building temporary tuples.
    
    Co-authored-by: Shantanu <12621235+hauntsaninja.github.com>

 Lib/dataclasses.py | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

I haven't done much further analysis yet, time for bed.
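As a rough illustration of what the commit message describes, here is a sketch of the two comparison strategies written by hand. `TupleEq` and `FieldEq` are hypothetical classes, not the code that dataclasses generates; they only mimic "build temporary tuples" versus "compare field by field" for an object that references itself.

```python
class TupleEq:
    """Packs fields into temporary tuples before comparing (old strategy)."""

    def __init__(self):
        self.child = None

    def __eq__(self, other):
        if other.__class__ is self.__class__:
            # Tuple comparison skips == for items that are the same object.
            return (self.child,) == (other.child,)
        return NotImplemented


class FieldEq:
    """Compares field by field (strategy described in the commit message)."""

    def __init__(self):
        self.child = None

    def __eq__(self, other):
        if other.__class__ is self.__class__:
            # Plain == has no identity shortcut, so this re-enters __eq__.
            return self.child == other.child
        return NotImplemented


def compares_equal_to_itself(cls):
    """Make an instance that references itself and compare it with itself."""
    node = cls()
    node.child = node
    try:
        return node == node
    except RecursionError:
        return "RecursionError"


print("tuple based:   ", compares_equal_to_itself(TupleEq))
print("field by field:", compares_equal_to_itself(FieldEq))
```

The tuple version returns True because tuple comparison treats identical items as equal without calling __eq__ on them, while the field-by-field version keeps calling itself until the recursion limit is hit.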

Comment 7 Miro Hrončok 2024-03-12 09:13:14 UTC
Thank you! That commit seems to be trivial to revert. I'll try that in our Copr today and report this to CPython upstream.

Comment 8 Miro Hrončok 2024-03-12 10:37:07 UTC
I found a simple reproducer and reported https://github.com/python/cpython/issues/116647
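A sketch of the kind of check involved, for anyone who wants to try their own interpreter. This is not necessarily the exact reproducer filed upstream; the `Node` class and the helper name are assumptions made for illustration. It deliberately prints no expected result, since the outcome depends on the Python build being tested.

```python
import sys
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical field; any field that can hold the instance itself will do.
    child: object = None


def self_comparison_result():
    """Compare a self-referencing dataclass instance with itself."""
    node = Node()
    node.child = node
    try:
        return node == node
    except RecursionError:
        return "RecursionError"


print(sys.version_info[:3], self_comparison_result())
```

An interpreter affected by the regression should report RecursionError here, and an unaffected one should report True.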

Comment 9 Scott Talbert 2024-03-12 13:11:06 UTC
(In reply to Miro Hrončok from comment #7)
> Thank you! That commit seems to be trivial to revert. I'll try that in our
> Copr today and report this to CPython upstream.

Thanks for digging in and finding a reproducer!

Unfortunately, it looks like simply reverting that commit on top of current 'main' doesn't work, as I noticed in a local build and it seems like your copr build is also stuck.  In my case, my system ran out of memory and sip-build got killed by the OOM killer.

Comment 10 Miro Hrončok 2024-03-12 13:28:21 UTC
Indeed, it seems the python-qt5 and python-pyqt6 builds in copr are stuck.

Comment 11 Miro Hrončok 2024-03-13 22:47:05 UTC
I successfully bisected the performance degradation/hang/OOM kill...


https://github.com/python/cpython/commit/217f47d6e5e56bca78b8556e910cd00890f6f84a is the first new commit
commit 217f47d6e5e56bca78b8556e910cd00890f6f84a
Author: Dong-hee Na <donghee.na>
Date:   Thu Jul 6 07:19:49 2023 +0900

    gh-96844: Improve error message of list.remove (gh-106455)

 Doc/library/doctest.rst                                             | 6 +++---
 Lib/test/test_xml_etree.py                                          | 2 +-
 .../Core and Builtins/2023-07-06-00-35-44.gh-issue-96844.kwvoS-.rst | 1 +
 Objects/listobject.c                                                | 2 +-
 4 files changed, 6 insertions(+), 5 deletions(-)
 create mode 100644 Misc/NEWS.d/next/Core and Builtins/2023-07-06-00-35-44.gh-issue-96844.kwvoS-.rst


I am now testing a build with the commit reverted for good measure.

My builds are in https://copr.fedorainfracloud.org/coprs/g/python/python3.13/builds/?dirname=python3.13:custom:bisect

---

I have no idea why this change might cause this, but perhaps the %R in the format string causes a call to PyObject_Repr which is slow somehow?

Comment 12 Miro Hrončok 2024-03-13 23:21:05 UTC
sipbuild/generator/parser/parser_manager.py has:

    def _find_class_with_iface_file(self, iface_file, tmpl_arg=False):
        """ Return a WrappedClass object for an interface file creating it if
        necessary.
        """

        # See if it already exists.
        for klass in self.spec.classes:
            if klass.iface_file is iface_file:
                if not self.parsing_template:
                    try:
                        self._template_arg_classes.remove(klass)
                    except ValueError:
                        pass

                return klass
        ...

self._template_arg_classes is a list. Perhaps the `try: .remove except ValueError:` takes too long?

----

Nevertheless, reverting that commit from 3.13.0a5 does not seem to help :/
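One way to check the %R theory on a given interpreter is to count __repr__ calls during a failed list.remove(). This is only a sketch: `Loud` and the helper are hypothetical names, and the count depends on whether the interpreter under test formats the missing item into the error message.

```python
class Loud:
    """Hypothetical object that records how often its repr is requested."""

    repr_calls = 0

    def __repr__(self):
        Loud.repr_calls += 1
        return "<Loud>"


def repr_calls_on_failed_remove():
    """Return how many times __repr__ ran while list.remove() raised."""
    Loud.repr_calls = 0
    try:
        [].remove(Loud())
    except ValueError:
        # The exception is swallowed, exactly as in the sip code above.
        pass
    return Loud.repr_calls


print(repr_calls_on_failed_remove())
```

A non-zero count means the repr is computed when the exception is raised, even though the `except ValueError: pass` handler never looks at the message.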

Comment 13 Scott Talbert 2024-03-13 23:24:25 UTC
After reverting that commit the build of python-qt5 still hangs?

Comment 14 Miro Hrončok 2024-03-13 23:29:44 UTC
> After reverting that commit the build of python-qt5 still hangs?

It got killed by OOM. But perhaps I somehow managed to test a wrong build? I will try again with a clear head tomorrow.


One more random idea -- changing the _find_class_with_iface_file function to do:

                    if klass in self._template_arg_classes:
                        self._template_arg_classes.remove(klass)

instead of try-except.

And now it seems it actually works (at least it is getting further in the build).
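For reference, the same idea as a standalone helper. This is a minimal sketch, assuming a plain list; `discard` is a hypothetical name and not part of sip.

```python
def discard(items, item):
    """Remove item from the list if present, without raising ValueError.

    Returns True when something was removed, False otherwise.
    """
    if item in items:
        items.remove(item)
        return True
    return False


classes = ["QObject", "QWidget", "QString"]
print(discard(classes, "QWidget"), classes)
print(discard(classes, "QColor"), classes)
```

The trade-off is that the list is scanned twice when the item is present, but no exception (and therefore no error message) is ever constructed when it is absent.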

Comment 15 Miro Hrončok 2024-03-13 23:37:30 UTC
(In reply to Miro Hrončok from comment #14)
> > After reverting that commit the build of python-qt5 still hangs?
> 
> It got killed by OOM. But perhaps I somehow managed to test a wrong build?
> I will try again with a clear head tomorrow.

Oh my god. I used `patch -p1 -R` in %prep, and it reverted the patch in the wrong location: the hunk matched the implementation of the index method instead.

Compare:

https://github.com/python/cpython/blob/v3.13.0a5/Objects/listobject.c#L2988-L2994
https://github.com/python/cpython/blob/v3.13.0a5/Objects/listobject.c#L3060-L3065

Will verify my second attempt before building.

Comment 17 Scott Talbert 2024-03-13 23:50:12 UTC
Fingers crossed your fixed attempt works.

Thinking about the real root cause of the OOM issue - I'm suspecting it may be another recursive dataclasses bug, probably with __repr__.  I CTRL-C'd my build while it was in the middle of eating memory and it was deep in a __repr__ call.

Comment 18 Miro Hrončok 2024-03-14 00:05:32 UTC
> I'm suspecting it may be another recursive dataclasses bug, probably with __repr__.

I suspected the same, but a simple recursive dataclass used in the previous reproducer reprs just fine :/


Good news: Avoiding the try-remove-except in sip6 makes python-qt5 build successfully.

Build with the properly reverted commit 
https://github.com/python/cpython/commit/217f47d6e5e56bca78b8556e910cd00890f6f84a seems to work as well but I won't be able to see it complete today, I'm going to sleep now.
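To illustrate the "reprs just fine" observation: a sketch with a hypothetical `Node` dataclass that references itself. The generated __repr__ guards against re-entering the same object, so a plain cycle is cut short rather than recursing forever.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical field that will point back at the instance itself.
    child: object = None


node = Node()
node.child = node

text = repr(node)
print(text)  # the nested occurrence of the same object is shown as "..."
```

So a direct cycle alone does not account for the memory growth seen in the sip build.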

Comment 19 Miro Hrončok 2024-03-14 09:10:43 UTC
If I change the problematic code to:

                    if klass in self._template_arg_classes:
                        self._template_arg_classes.remove(klass)
                    else:
                        repr(klass)

to force a repr() call even on Pythons without https://github.com/python/cpython/commit/217f47d6e5e56bca78b8556e910cd00890f6f84a, I see that even the first commit where the main cpython branch diverged from 3.12 hangs.

In fact, the hang happens even with Python 3.12.2.

So even if we assume the recursive dataclass has impossibly slow repr, *that* does not seem to be a regression in Python 3.13.


I've also tried to print(repr(klass)) to see what is going on, but there are far too many and the log is unreadable.

----

I will report this problem to CPython, but without a shorter reproducer, it's unlikely to be solved.

Could you (Scott) then present my findings to upstream sip?


Proposal: I change the code in sip-build not to try-except the remove, even if it is a temporary workaround.
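As a possible starting point for a shorter reproducer, here is a sketch of one way a dataclass repr can become enormous without any cycle: the same object reachable through several fields is printed once per path. Whether sip's data actually has this shape is an assumption and is not established in this bug; `Pair` and the helpers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    left: object
    right: object


def shared_tree(depth):
    """Build a structure where each level references the level below twice."""
    node = "x"
    for _ in range(depth):
        node = Pair(node, node)
    return node


def repr_length(depth):
    """Length of the repr of a shared structure with the given depth."""
    return len(repr(shared_tree(depth)))


for depth in (5, 10, 15):
    print(depth, repr_length(depth))
```

Only `depth` objects exist in memory, yet the repr more than doubles in length with every extra level, so a modest amount of sharing is enough to make repr() look like a hang.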

Comment 20 Miro Hrončok 2024-03-14 09:42:19 UTC
Python issue: https://github.com/python/cpython/issues/116792

Comment 21 Miro Hrončok 2024-03-14 09:46:00 UTC
sip6 downstream workaround: https://src.fedoraproject.org/rpms/sip6/pull-request/2

Comment 22 Miro Hrončok 2024-03-14 11:51:34 UTC
*** Bug 2247290 has been marked as a duplicate of this bug. ***

Comment 23 Scott Talbert 2024-03-14 13:01:26 UTC
> I will report this problem to CPython, but without a shorter reproducer,
> it's unlikely to be solved.
> 
> Could you (Scott) then present my findings to upstream sip?

Yes, I'll take this upstream to sip.

Also, I'll try to come up with a reproducer for the repr() OOM.

Went ahead and merged your PR.  Thanks for all your work on this and sorry I was slow to finally get the original bisection going.  :-)

