Bug 2097476 - Review Request: python-pyarrow - Python library for Apache Arrow
Summary: Review Request: python-pyarrow - Python library for Apache Arrow
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Nobody's working on this, feel free to take it
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 2070345
Reported: 2022-06-15 18:33 UTC by Major Hayden 🤠
Modified: 2022-08-10 17:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 17:44:38 UTC
Type: ---
Embargoed:



Description Major Hayden 🤠 2022-06-15 18:33:22 UTC
Spec URL: https://download.copr.fedorainfracloud.org/results/mhayden/pyarrow/fedora-rawhide-x86_64/04538205-python-pyarrow/python-pyarrow.spec
SRPM URL: https://download.copr.fedorainfracloud.org/results/mhayden/pyarrow/fedora-rawhide-x86_64/04538205-python-pyarrow/python-pyarrow-8.0.0-1.fc37.src.rpm
Description: Python library for Apache Arrow
Fedora Account System Username: mhayden

I've gotta admit, this is one of the most difficult python packages I've had to do so far. 🥵 I will gladly take any advice on how to improve this one.

COPR builds on multiple architectures: https://copr.fedorainfracloud.org/coprs/mhayden/pyarrow/build/4538205/

This also unblocks the update from BZ 2070345 since python-google-cloud-bigquery requires pyarrow now.

Comment 2 Ben Beasley 2022-08-04 21:31:54 UTC
I was starting to review this, and I wrote this:

- The PyPI source archive lacks the license text because that is at the top
  level of the git repository, but the Apache license requires the text to be
  distributed.

  Consider filing an upstream issue about the missing license file. You could
  add the license files as additional sources:

    Source1: https://github.com/apache/arrow/raw/apache-arrow-%{version}/LICENSE.txt
    Source2: https://github.com/apache/arrow/raw/apache-arrow-%{version}/NOTICE.txt

  Or, you could use the full GitHub archive as the source,

    Source0: https://github.com/apache/arrow/archive/apache-arrow-%{version}/arrow-apache-arrow-%{version}.tar.gz

  and then do something like this in %prep:

    # Remove non-Python sources:
    find . -mindepth 1 -maxdepth 1 -type d ! -name python -print -exec rm -rf '{}' +

…but then I considered that I had just recommended using the same source archive as libarrow[1]. Perhaps it would be better to add python3-pyarrow as an additional subpackage in libarrow instead, rather than managing it as a separate package.

[1] https://src.fedoraproject.org/rpms/libarrow
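
The pruning step suggested above for %prep can be tried outside of rpmbuild first. What follows is only a rough sketch: the scratch directory and the cpp/ and r/ directory names are made-up stand-ins for an unpacked source archive, not a listing of the real tarball. It shows that -type d limits the removal to top-level directories, so top-level files such as LICENSE.txt and NOTICE.txt survive alongside python/.

```shell
# Hypothetical scratch tree standing in for an unpacked source archive.
workdir="$(mktemp -d)"
mkdir -p "$workdir/cpp" "$workdir/r" "$workdir/python/pyarrow"
touch "$workdir/LICENSE.txt" "$workdir/NOTICE.txt"

cd "$workdir"

# Remove every top-level directory except python/. Regular files at the
# top level are not matched because of -type d, so they are kept.
find . -mindepth 1 -maxdepth 1 -type d ! -name python -print -exec rm -rf '{}' +

# Show what is left at the top level.
ls -1
```

In a spec file the same find invocation would run from the unpacked source directory, so the mktemp and cd lines here are scaffolding for the experiment only.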

Comment 3 Major Hayden 🤠 2022-08-10 17:44:38 UTC
Thanks, Ben. Luckily the libarrow maintainer was able to take some of what I proposed and add it to the main libarrow pkg. 🎉
