Bug 2357508

Summary: ocrmypdf fails to build with Python 3.14: multiprocessing.Process now starts with forkserver method instead of fork, causing pickling error
Product: [Fedora] Fedora Reporter: Karolina Surma <ksurma>
Component: ocrmypdf Assignee: Elliott Sales de Andrade <quantum.analyst>
Status: CLOSED WORKSFORME QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: rawhide CC: dreua, extras-orphan, fti-bugs, ksurma, mhroncok, python-packagers-sig, quantum.analyst
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ocrmypdf-16.11.0-1.fc44, ocrmypdf-16.11.0-1.fc43 Doc Type: ---
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2025-09-25 11:04:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2322407, 2339432, 2339435    

Description Karolina Surma 2025-04-04 15:34:24 UTC
ocrmypdf fails to build with Python 3.14.0a6.

_________________________________ test_semfree _________________________________
[gw2] linux -- Python 3.14.0 /usr/bin/python3

resources = PosixPath('/builddir/build/BUILD/ocrmypdf-16.7.0-build/ocrmypdf-16.7.0/tests/resources')
outpdf = PosixPath('/tmp/pytest-of-mockbuild/pytest-0/popen-gw2/test_semfree0/out.pdf')

    @pytest.mark.skipif(not is_linux(), reason='semfree plugin only works on Linux')
    def test_semfree(resources, outpdf):
        exitcode = run_ocrmypdf_api(
            resources / 'multipage.pdf',
            outpdf,
            '--skip-text',
            '--skip-big',
            '2',
            '--plugin',
            'ocrmypdf.extra_plugins.semfree',
            '--plugin',
            'tests/plugins/tesseract_noop.py',
        )
>       assert exitcode in (ExitCode.ok, ExitCode.pdfa_conversion_failed)
E       assert <ExitCode.other_error: 15> in (<ExitCode.ok: 0>, <ExitCode.pdfa_conversion_failed: 10>)

tests/test_semfree.py:26: AssertionError
------------------------------ Captured log call -------------------------------
WARNING  ocrmypdf._pipeline:_pipeline.py:374 page too big, skipping OCR (81.0 MPixels > 2.0 MPixels --skip-big)
WARNING  ocrmypdf._pipeline:_pipeline.py:374 page too big, skipping OCR (2.0 MPixels > 2.0 MPixels --skip-big)
WARNING  ocrmypdf._metadata:_metadata.py:63 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
ERROR    ocrmypdf._pipelines._common:_common.py:296 An exception occurred while executing the pipeline
Traceback (most recent call last):
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/_pipelines/_common.py", line 261, in cli_exception_handler
    return fn(options, plugin_manager)
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/_pipelines/ocr.py", line 181, in _run_pipeline
    optimize_messages = exec_concurrent(context, executor)
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/_pipelines/ocr.py", line 145, in exec_concurrent
    pdf, messages = postprocess(pdf, context, executor)
                    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/_pipelines/_common.py", line 460, in postprocess
    return optimize_pdf(pdf_out, context, executor)
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/_pipeline.py", line 984, in optimize_pdf
    output_pdf, messages = context.plugin_manager.hook.optimize_pdf(
                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        input_pdf=input_file,
        ^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
        linearize=should_linearize(input_file, context),
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.14/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.14/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.14/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/usr/lib/python3.14/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/builtin_plugins/optimize.py", line 145, in optimize_pdf
    result_path = optimize(input_pdf, output_pdf, context, save_settings, executor)
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/optimize.py", line 705, in optimize
    deflate_jpegs(pdf, root, options, executor)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/optimize.py", line 575, in deflate_jpegs
    executor(
    ~~~~~~~~^
        use_threads=True,  # We're sharing the pdf directly, must use threads
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<9 lines>...
        task_finished=finish,
        ^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/_concurrent.py", line 78, in __call__
    self._execute(
    ~~~~~~~~~~~~~^
        use_threads=use_threads,
        ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
        task_finished=task_finished,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/builddir/build/BUILD/ocrmypdf-16.7.0-build/BUILDROOT/usr/lib/python3.14/site-packages/ocrmypdf/extra_plugins/semfree.py", line 157, in _execute
    process.start()
    ~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.14/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ~~~~~~~~~~~^^^^^^
  File "/usr/lib64/python3.14/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/usr/lib64/python3.14/multiprocessing/context.py", line 300, in _Popen
    return Popen(process_obj)
  File "/usr/lib64/python3.14/multiprocessing/popen_forkserver.py", line 35, in __init__
    super().__init__(process_obj)
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/usr/lib64/python3.14/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
    ~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/usr/lib64/python3.14/multiprocessing/popen_forkserver.py", line 47, in _launch
    reduction.dump(process_obj, buf)
    ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.14/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^
TypeError: cannot pickle 'pikepdf._core.Pdf' object
when serializing tuple item 0
when serializing list item 0
when serializing tuple item 4
when serializing dict item '_args'
when serializing multiprocessing.context.Process state
when serializing multiprocessing.context.Process object
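
The final frames of the traceback show multiprocessing pickling the Process object, including its arguments, before handing it to the fork server. The following is a minimal sketch of that failure mode, not code from ocrmypdf. It assumes a threading.Lock as a stand-in for the unpicklable pikepdf.Pdf object, and the helper name survives_pickling is hypothetical.

```python
import pickle
import threading
from multiprocessing.reduction import ForkingPickler


def survives_pickling(args):
    """Return True if args can be serialized the way forkserver/spawn do it.

    Hypothetical helper: it runs the same ForkingPickler that appears in the
    traceback, but on the argument tuple alone instead of a whole Process.
    """
    try:
        ForkingPickler.dumps(args)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


# Plain data can be sent to a forkserver or spawn child.
picklable_args = ("out.pdf", 2)

# A lock cannot be pickled; it stands in for pikepdf.Pdf here (assumption).
unpicklable_args = (threading.Lock(),)

if __name__ == "__main__":
    print(survives_pickling(picklable_args))
    print(survives_pickling(unpicklable_args))
```

With the fork start method this serialization step does not happen, which is why the same arguments worked before Python 3.14 changed the default.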

https://docs.python.org/3.14/whatsnew/3.14.html

The default start method changed from fork to forkserver on platforms other than macOS and Windows where it was already spawn.

If the threading incompatible fork method is required, you must explicitly request it via a context from multiprocessing.get_context() (preferred) or change the default via multiprocessing.set_start_method().

See forkserver restrictions for information and differences with the fork method and how this change may affect existing code with mutable global shared variables and/or shared objects that can not be automatically pickled.
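
As an illustration of the documented advice above, here is a minimal sketch that requests the fork start method through an explicit context. It is not necessarily how upstream ocrmypdf addressed the problem. The function names are hypothetical, and a threading.Lock again stands in for the unpicklable pikepdf.Pdf object.

```python
import multiprocessing
import threading


def fork_context():
    """Return a context that uses fork, regardless of the platform default."""
    if "fork" not in multiprocessing.get_all_start_methods():
        raise RuntimeError("the fork start method is not available here")
    return multiprocessing.get_context("fork")


def _worker(lock):
    # Under fork the child receives a copy of the parent's memory,
    # so the lock argument arrives without ever being pickled.
    with lock:
        pass


def run_worker_with_unpicklable_arg():
    """Start a child with an unpicklable argument and return its exit code."""
    ctx = fork_context()
    lock = threading.Lock()  # stand-in for pikepdf.Pdf (assumption)
    process = ctx.Process(target=_worker, args=(lock,))
    process.start()
    process.join()
    return process.exitcode


if __name__ == "__main__":
    print(run_worker_with_unpicklable_arg())
```

Using a context keeps the choice local to the code that needs it, whereas multiprocessing.set_start_method() changes the default for the whole program. Note that the forked child works on its own copy of the object, so changes made in the child are not visible to the parent.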

For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.14/fedora-rawhide-x86_64/08848362-ocrmypdf/

For all our attempts to build ocrmypdf with Python 3.14, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.14/package/ocrmypdf/

Testing and mass rebuild of packages is happening in copr.
You can follow these instructions to test locally in mock whether your package builds with Python 3.14:
https://copr.fedorainfracloud.org/coprs/g/python/python3.14/

Let us know here if you have any questions.

Python 3.14 is planned to be included in Fedora 43.
To make that update smoother, we're building Fedora packages with all pre-releases of Python 3.14.
A build failure prevents us from testing all dependent packages (transitive [Build]Requires),
so if this package is required a lot, it's important for us to get it fixed soon.

We'd appreciate help from the people who know this package best,
but if you don't want to work on this now, let us know so we can try to work around it on our side.

Comment 1 Karolina Surma 2025-06-11 15:54:38 UTC
*** Bug 2371757 has been marked as a duplicate of this bug. ***

Comment 2 Fedora Fails To Install 2025-06-20 19:52:12 UTC
Hello,

Please note that this comment was generated automatically by https://pagure.io/releng/blob/main/f/scripts/ftbfs-fti/follow-policy.py
If you feel that this output has mistakes, please open an issue at https://pagure.io/releng/

This package fails to install and maintainers are advised to take one of the following actions:

 - Fix this bug and close this bugzilla once the update makes it to the repository.
   (The same script that posted this comment will eventually close this bugzilla
   when the fixed package reaches the repository, so you don't have to worry about it.)

or

 - Move this bug to ASSIGNED if you plan on fixing this, but simply haven't done so yet.

or

 - Orphan the package if you no longer plan to maintain it.


If you do not take one of these actions, the process at https://docs.fedoraproject.org/en-US/fesco/Fails_to_build_from_source_Fails_to_install/#_package_removal_for_long_standing_ftbfs_and_fti_bugs will continue.
This package may be orphaned in 7+ weeks.
This is the first reminder (step 3) from the policy.

Don't hesitate to ask for help on https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/ if you are unsure how to fix this bug.

Comment 3 Fedora Fails To Install 2025-07-15 08:35:48 UTC
Hello,

Please note that this comment was generated automatically by https://pagure.io/releng/blob/main/f/scripts/ftbfs-fti/follow-policy.py
If you feel that this output has mistakes, please open an issue at https://pagure.io/releng/

This package fails to install and maintainers are advised to take one of the following actions:

 - Fix this bug and close this bugzilla once the update makes it to the repository.
   (The same script that posted this comment will eventually close this bugzilla
   when the fixed package reaches the repository, so you don't have to worry about it.)

or

 - Move this bug to ASSIGNED if you plan on fixing this, but simply haven't done so yet.

or

 - Orphan the package if you no longer plan to maintain it.


If you do not take one of these actions, the process at https://docs.fedoraproject.org/en-US/fesco/Fails_to_build_from_source_Fails_to_install/#_package_removal_for_long_standing_ftbfs_and_fti_bugs will continue.
This package may be orphaned in 4+ weeks.
This is the second reminder (step 4) from the policy.

Don't hesitate to ask for help on https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/ if you are unsure how to fix this bug.

Comment 4 Fedora Fails To Install 2025-08-12 12:18:51 UTC
This package has been orphaned.

You can pick it up at https://src.fedoraproject.org/rpms/ocrmypdf by clicking the "Take" button. If nobody picks it up, it will be retired and removed from the distribution.

Comment 5 Fedora Admin user for bugzilla script actions 2025-08-12 13:25:21 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 6 David Auer 2025-08-12 18:53:31 UTC
I think the latest version of ocrmypdf contains a workaround for this issue. Has anyone tried building that?

Furthermore, this package does not appear orphaned on src.fedoraproject.org. Either there has been a bug, or it has been re-taken after orphaning. Feel free to add me as co-maintainer (dreua).

Comment 7 Fedora Fails To Install 2025-08-12 20:27:42 UTC
Hello,

Please note that this comment was generated automatically by https://pagure.io/releng/blob/main/f/scripts/ftbfs-fti/follow-policy.py
If you feel that this output has mistakes, please open an issue at https://pagure.io/releng/

This package fails to install and maintainers are advised to take one of the following actions:

 - Fix this bug and close this bugzilla once the update makes it to the repository.
   (The same script that posted this comment will eventually close this bugzilla
   when the fixed package reaches the repository, so you don't have to worry about it.)

or

 - Move this bug to ASSIGNED if you plan on fixing this, but simply haven't done so yet.

or

 - Orphan the package if you no longer plan to maintain it.


If you do not take one of these actions, the process at https://docs.fedoraproject.org/en-US/fesco/Fails_to_build_from_source_Fails_to_install/#_package_removal_for_long_standing_ftbfs_and_fti_bugs will continue.
This package may be orphaned in 7+ weeks.
This is the first reminder (step 3) from the policy.

Don't hesitate to ask for help on https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/ if you are unsure how to fix this bug.

Comment 8 Fedora Admin user for bugzilla script actions 2025-08-13 01:19:16 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

Comment 9 Elliott Sales de Andrade 2025-08-13 05:07:49 UTC
(In reply to David Auer from comment #6)
> I think the latest version of ocrmypdf contains a workaround for this issue,
> has anyone tried building that?
> 
> Furthermore this package does not appear orphaned on src.fedoraproject.org.
> Either there has been a bug or it has been re-taken after orphaning? Feel
> free to add me as co-maintainer (dreua).

Thanks for the hint. I will backport the changes to fix the build. Unfortunately, the latest version requires streamlit for the webservice, which isn't packaged, so I can't update to it yet (or I may have to drop the webservice).

Comment 10 Elliott Sales de Andrade 2025-08-13 07:29:44 UTC
Actually, 16.10.4 also doesn't pass; I've opened an issue upstream about it.

Comment 11 Fedora Fails To Install 2025-09-16 09:58:37 UTC
Hello,

Please note that this comment was generated automatically by https://pagure.io/releng/blob/main/f/scripts/ftbfs-fti/follow-policy.py
If you feel that this output has mistakes, please open an issue at https://pagure.io/releng/

All subpackages of a package against which this bug was filed are now installable or removed from Fedora 44.

Thanks for taking care of it!

Comment 12 Fedora Fails To Install 2025-09-25 11:04:43 UTC
Hello,

Please note that this comment was generated automatically by https://pagure.io/releng/blob/main/f/scripts/ftbfs-fti/follow-policy.py
If you feel that this output has mistakes, please open an issue at https://pagure.io/releng/

All subpackages of a package against which this bug was filed are now installable or removed from Fedora 43.

Thanks for taking care of it!