Bug 2107921

Summary: python3-protobuf not built with support for C++ protobuf Python extension
Product: [Fedora] Fedora
Reporter: Endre Bjørsvik <endrebjorsvik>
Component: protobuf
Assignee: Orion Poplawski <orion>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 35
CC: adrian, code, igor.raits, mizdebsk, orion, sander, shamardin
Target Milestone: ---
Keywords: Improvement, Performance
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: protobuf-3.19.4-6.fc36
Last Closed: 2022-08-18 02:03:58 UTC
Type: Bug

Description Endre Bjørsvik 2022-07-17 15:53:01 UTC
Description of problem:
The google.protobuf Python module was written with optional support for a Python extension that wraps the C++ implementation of protobuf. When used from Python, the C++ implementation improves serialization/deserialization performance by roughly 12-26x, according to my very unscientific tests on a development laptop with large Protobuf messages (e.g. 1M repeated values in a message). According to the protobuf git repo[1], the optional support is built using:

   python3 setup.py build --cpp_implementation

This generates compiled shared objects (*.so) in google.protobuf.pyext and google.protobuf.internal. The C++ implementation is selected automatically if the binaries are present; otherwise the module falls back to the slow Python implementation. The desired implementation can be forced using the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable (takes either `cpp` or `python`).

The python3-protobuf package on F35 and F36 is built using a simple %py3_build, meaning:

   python3 setup.py build

This means that the C++ implementation wrapper is not built, and everything runs with the slow Python-based protobuf implementation.
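
To see which implementation an installed package actually selects, something like the following sketch may help. It relies on `google.protobuf.internal.api_implementation`, an internal module whose interface could change between releases, and the helper name `active_protobuf_backend` is made up for this example:

```python
def active_protobuf_backend():
    """Return the backend name protobuf selected, or None if protobuf is missing.

    Expected values are 'python' or 'cpp'; newer upstream releases may
    also report 'upb'.
    """
    try:
        # Internal module: not a stable public API, used here only for inspection.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


print(active_protobuf_backend())
```

On a package built without `--cpp_implementation`, this would be expected to print `python`.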

[1]: https://github.com/protocolbuffers/protobuf/blob/main/python/README.md#installation


Version-Release number of selected component (if applicable):
- F35: 3.14.0-6.fc35
- F36: 3.19.4-2.fc36


How reproducible:
Always


Steps to Reproduce:
1. Install python3-protobuf
2. Run `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp python3` to open a Python interpreter.
3. Import a module that may use the C++ implementation: `from google.protobuf import descriptor`


Actual results:
Python throws the exception:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/site-packages/google/protobuf/descriptor.py", line 48, in <module>
    from google.protobuf.pyext import _message
ImportError: cannot import name '_message' from 'google.protobuf.pyext' (/usr/lib/python3.10/site-packages/google/protobuf/pyext/__init__.py)
```
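
The same check can be run non-interactively. The sketch below starts a fresh interpreter with the environment variable set and attempts the import that fails in the traceback above; the helper name `probe_cpp_extension` is hypothetical, and the probe assumes the extension lives at `google.protobuf.pyext._message`, as the traceback indicates:

```python
import os
import subprocess
import sys


def probe_cpp_extension():
    """Try to import the C++ extension in a clean child interpreter.

    Returns (ok, detail): ok is True if the import succeeded, and detail
    is the last line of the child's stderr (empty on success).
    """
    env = dict(os.environ, PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION="cpp")
    result = subprocess.run(
        [sys.executable, "-c", "from google.protobuf.pyext import _message"],
        env=env,
        capture_output=True,
        text=True,
    )
    lines = result.stderr.strip().splitlines()
    detail = lines[-1] if lines else ""
    return result.returncode == 0, detail


ok, detail = probe_cpp_extension()
print("C++ extension importable:", ok)
if not ok:
    print(detail)
```

Running the import in a child process keeps the forced setting from affecting any protobuf modules already loaded in the calling interpreter.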


Expected results:
A successful import where google.protobuf wraps the C++ implementation of protobuf.


Additional info:

Comment 1 Orion Poplawski 2022-07-18 04:25:16 UTC
I've started to take a look at this and have managed to build it.  However, it does introduce the wrinkle that it is tied much more closely to Python internals and as such does not build with Python 3.11 (https://github.com/protobuf-c/protobuf-c/issues/515).  I'm working on making the build conditional so we can disable it when new Python versions break it.  I'll try to get some PRs filed shortly.

Comment 2 Orion Poplawski 2022-07-18 13:09:30 UTC
Filed PR: https://src.fedoraproject.org/rpms/protobuf/pull-request/12

Comment 3 Fedora Update System 2022-08-16 19:21:39 UTC
FEDORA-2022-6cafc93074 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-6cafc93074

Comment 4 Fedora Update System 2022-08-17 02:03:18 UTC
FEDORA-2022-6cafc93074 has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-6cafc93074`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-6cafc93074

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 5 Fedora Update System 2022-08-18 02:03:58 UTC
FEDORA-2022-6cafc93074 has been pushed to the Fedora 36 stable repository.
If the problem still persists, please make a note of it in this bug report.

Comment 6 Endre Bjørsvik 2022-08-18 07:44:38 UTC
Great work! The new package works flawlessly in my application on F36. The speedup on 1 million data points in one Protobuf message sent from Python to Python through gRPC is approximately 14x (from 3.3 seconds to 0.23 seconds). :-)
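
For anyone wanting to repeat a rough comparison of this kind without a gRPC setup, a minimal timing sketch follows. It is not the reporter's benchmark: it assumes the well-known `ListValue` type from `google.protobuf.struct_pb2` as a stand-in for a message with many repeated values, and the helper names `time_roundtrip` and `build_large_message` are invented for illustration.

```python
import time


def time_roundtrip(message, repeats=3):
    """Return the best-of-N time in seconds to serialize and re-parse a message."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        data = message.SerializeToString()
        parsed = type(message)()
        parsed.ParseFromString(data)
        best = min(best, time.perf_counter() - start)
    return best


def build_large_message(n):
    """Build a ListValue holding n numeric values (requires python3-protobuf)."""
    # Imported lazily so time_roundtrip stays usable without protobuf installed.
    from google.protobuf import struct_pb2

    msg = struct_pb2.ListValue()
    for i in range(n):
        msg.values.add().number_value = float(i)
    return msg
```

Calling `time_roundtrip(build_large_message(1_000_000))` once with `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` and once with `cpp` should give a ballpark ratio; the absolute numbers will depend on the machine and the message shape.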