Bug 2255715 - Review Request: python-torchtext - Data loaders and abstractions for language processing, powered by PyTorch
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Tomas Tomecek
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/pytorch/text
Whiteboard:
Depends On: 2271900
Blocks: ML-SIG
Reported: 2023-12-23 16:19 UTC by Tom Rix
Modified: 2024-04-15 19:31 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-04-15 19:31:04 UTC
Type: ---
Embargoed:
ttomecek: fedora-review+


Attachments
The .spec file difference from Copr build 6863271 to 7223432 (2.37 KB, patch)
2024-03-27 23:47 UTC, Fedora Review Service

Description Tom Rix 2023-12-23 16:19:31 UTC
Spec URL: https://trix.fedorapeople.org/python-torchtext.spec
SRPM URL: https://trix.fedorapeople.org/python-torchtext-0.16.0-1.fc40.src.rpm

torchtext is a companion Python package to torch, providing torch
with the text-processing functionality needed for AI.

Reproducible: Always

Comment 1 Onuralp Sezer 2024-01-04 14:29:19 UTC
[fedora-review-service-build]

Comment 2 Fedora Review Service 2024-01-04 14:42:25 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/6863271
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2255715-python-torchtext/fedora-rawhide-x86_64/06863271-python-torchtext/fedora-review/review.txt

Please take a look if any issues were found.


---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 3 Tomas Tomecek 2024-03-27 15:48:36 UTC
Onuralp, will you have the time to finish this review?

Comment 4 Tomas Tomecek 2024-03-27 15:58:32 UTC
Unfortunately I'm unable to rebuild the package locally because python3-torch-devel can't be found:
```
ERROR: Command failed: 
 # /usr/bin/systemd-nspawn -q -M 17759e3425854142aae809bfae5c7148 -D /var/lib/mock/fedora-rawhide-x86_64-bootstrap/root -a --capability=cap_ipc_lock --bind=/tmp/mock-resolv.g9ekckwu:/etc/resolv.conf --console=pipe --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=HOME=/var/lib/mock/fedora-rawhide-x86_64/root/installation-homedir --setenv=HOSTNAME=mock --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin '--setenv=PROMPT_COMMAND=printf "\033]0;<mock-chroot>\007"' '--setenv=PS1=<mock-chroot> \s-\v\$ ' --setenv=LANG=C.UTF-8 --setenv=LC_MESSAGES=C.UTF-8 --resolv-conf=off /usr/bin/dnf5 builddep --installroot /var/lib/mock/fedora-rawhide-x86_64/root/ --releasever 41 /var/lib/mock/fedora-rawhide-x86_64/root//builddir/build/SRPMS/python-torchtext-0.16.0-1.fc41.src.rpm --setopt=deltarpm=False --setopt=allow_vendor_change=yes --allowerasing --setopt=tsflags=nocontexts
Updating and loading repositories:
 fedora                                 100% |  21.4 KiB/s |  12.5 KiB |  00m01s
 fedora                                 100% |   4.8 MiB/s |   6.9 MiB |  00m01s
Repositories loaded.
Failed to resolve the transaction:
No match for argument: python3-torch-devel
```

python3-torch installs just fine, but python3-torch-devel is missing:
```
$ podman run --rm -ti fedora:41 bash

[root@03e7b73d2e4d /]# dnf install python3-torch-devel
Fedora rawhide openh264 (From Cisco) - x86_64                                292  B/s | 123  B     00:00
Fedora - Rawhide - Developmental packages for the next Fedora release        4.0 MB/s |  20 MB     00:05
No match for argument: python3-torch-devel
Error: Unable to find a match: python3-torch-devel

[root@03e7b73d2e4d /]# dnf install python3-torch
Last metadata expiration check: 0:00:10 ago on Wed Mar 27 15:55:31 2024.
Dependencies resolved.
============================================================
 Package                                                    
============================================================
Installing:
 python3-torch                                              
```

Comment 5 Tom Rix 2024-03-27 16:32:25 UTC
OK. I will update to the latest version and repost.
Thanks!

Comment 6 Tom Rix 2024-03-27 23:28:24 UTC
Spec URL: https://trix.fedorapeople.org/python-torchtext.spec
SRPM URL: https://trix.fedorapeople.org/python-torchtext-0.17.1-1.fc41.src.rpm

Here's the update. It can optionally use the python-expecttest package.

Comment 7 Fedora Review Service 2024-03-27 23:47:17 UTC
Created attachment 2023942
The .spec file difference from Copr build 6863271 to 7223432

Comment 8 Fedora Review Service 2024-03-27 23:47:20 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7223432
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2255715-python-torchtext/fedora-rawhide-x86_64/07223432-python-torchtext/fedora-review/review.txt

Please take a look if any issues were found.



Comment 9 Tomas Tomecek 2024-03-28 11:07:01 UTC
Taking this review so Tom can continue his work.


Package Review
==============

Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated
[ ] = Manual review needed



===== MUST items =====

C/C++:
[x]: Provides: bundled(gnulib) in place as required.
     Note: Sources not installed
[x]: Package does not contain kernel modules.
[x]: Development (unversioned) .so files in -devel subpackage, if present.
     Note: Unversioned so-files in private %_libdir subdirectory (see
     attachment). Verify they are not in ld path.

     Thank you for patching cmake to version .so files.

     I had to read the guidelines for this: https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages

     All is good: unversioned .so files are common in Python's sitelib.
[x]: If your application is a C or C++ application you must list a
     BuildRequires against gcc, gcc-c++ or clang.
[x]: Header files in -devel subpackage, if present.
[x]: Package does not contain any libtool archives (.la)
[x]: Package contains no static executables.
[x]: Rpath absent or only used for internal libs.

Generic:
[x]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[x]: License field in the package spec file matches the actual license.
     Note: Checking patched sources after %prep for licenses. Licenses
     found: "Unknown or generated", "BSD 3-Clause License", "MIT License",
     "Apache License 2.0". 320 files have unknown license. Detailed output
     of licensecheck in
     /home/tt/t/torchtext/2255715-python-torchtext/licensecheck.txt
[x]: License file installed when any subpackage combination is installed.
[x]: If the package is under multiple licenses, the licensing breakdown
     must be documented in the spec.
[x]: Package requires other packages for directories it uses.
     Note: No known owner of /usr/lib64/python3.12/site-packages,
     /usr/lib64/python3.12
[x]: %build honors applicable compiler flags or justifies otherwise.
[x]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[x]: Sources contain only permissible code or content.
[x]: Package contains desktop file if it is a GUI application.
[x]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory
     names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[x]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[x]: Requires correct, justified where necessary.
[x]: Spec file is legible and written in American English.
[x]: Package contains systemd file(s) if in need.
[x]: Useful -debuginfo package or justification otherwise.
[x]: Package is not known to require an ExcludeArch tag.
[x]: Package complies to the Packaging Guidelines
[x]: Package successfully compiles and builds into binary rpms on at least
     one supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
     Note: No rpmlint messages.
[x]: If (and only if) the source package includes the text of the
     license(s) in its own file, then that file, containing the text of the
     license(s) for the package is included in %license.
[x]: The License field must be a valid SPDX expression.
[x]: Package must own all directories that it creates.
[x]: Package does not own files or directories owned by other packages.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Dist tag is present.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package must not depend on deprecated() packages.
[x]: Package use %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists.
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as
     provided in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
     Note: Documentation size is 7045 bytes in 1 files.
[x]: Packages must not store files under /srv, /opt or /usr/local

Python:
[x]: Binary eggs must be removed in %prep
     Note: Cannot find any build in BUILD directory (--prebuilt option?)
[x]: Python eggs must not download any dependencies during the build
     process.
[x]: A package which is used by another package via an egg interface should
     provide egg info.
[x]: Package meets the Packaging Guidelines::Python
[x]: Package contains BR: python2-devel or python3-devel
[x]: Packages MUST NOT have dependencies (either build-time or runtime) on
     packages named with the unversioned python- prefix unless no properly
     versioned package exists. Dependencies on Python packages instead MUST
     use names beginning with python2- or python3- as appropriate.
[x]: Python packages must not contain %{pythonX_site(lib|arch)}/* in %files

===== SHOULD items =====

Generic:
[x]: If the source package does not include license text(s) as a separate
     file from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[x]: Fully versioned dependency in subpackages if applicable.
     Note: No Requires: %{name}%{?_isa} = %{version}-%{release} in
     python3-torchtext
[x]: Package functions as described.
[x]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[x]: Patches link to upstream bugs/comments/lists or are otherwise
     justified.
[x]: Sources are verified with gpgverify first in %prep if upstream
     publishes signatures.
     Note: gpgverify is not used.
[?]: %check is present and all tests pass.

     I haven't tried to run the test suite with the packaged expecttest.

[x]: Packages should try to preserve timestamps of original installed
     files.
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Sources can be downloaded from URI in Source: tag
[x]: SourceX is a working URL.
[x]: Package should compile and build into binary rpms on all supported
     architectures.
[x]: Spec use %global instead of %define unless justified.

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Large data in /usr/share should live in a noarch subpackage if package
     is arched.
[x]: Spec file according to URL is the same as in SRPM.


Rpmlint
-------
Checking: python3-torchtext-0.17.1-1.fc41.x86_64.rpm
          python-torchtext-debugsource-0.17.1-1.fc41.x86_64.rpm
          python-torchtext-0.17.1-1.fc41.src.rpm
========================== rpmlint session starts =========================
rpmlint: 2.5.0
configuration:
    /usr/lib/python3.12/site-packages/rpmlint/configdefaults.toml
    /etc/xdg/rpmlint/fedora-legacy-licenses.toml
    /etc/xdg/rpmlint/fedora-spdx-licenses.toml
    /etc/xdg/rpmlint/fedora.toml
    /etc/xdg/rpmlint/scoring.toml
    /etc/xdg/rpmlint/users-groups.toml
    /etc/xdg/rpmlint/warn-on-functions.toml
rpmlintrc: [PosixPath('/tmp/tmpm2lqbz18')]
checks: 32, packages: 3

========================== 3 packages and 0 specfiles checked; 0 errors, 0 warnings, 21 filtered, 0 badness; has taken 0.8 s =====================




Rpmlint (installed packages)
----------------------------
============================ rpmlint session starts ============================
rpmlint: 2.5.0
configuration:
    /usr/lib/python3.12/site-packages/rpmlint/configdefaults.toml
    /etc/xdg/rpmlint/fedora-legacy-licenses.toml
    /etc/xdg/rpmlint/fedora-spdx-licenses.toml
    /etc/xdg/rpmlint/fedora.toml
    /etc/xdg/rpmlint/scoring.toml
    /etc/xdg/rpmlint/users-groups.toml
    /etc/xdg/rpmlint/warn-on-functions.toml
checks: 32, packages: 2

python3-torchtext.x86_64: W: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so libtorchtext.so.0.16
python3-torchtext.x86_64: W: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so libtorch_python.so
python3-torchtext.x86_64: W: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so libc10.so
python3-torchtext.x86_64: W: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so libtorch_cpu.so
python3-torchtext.x86_64: E: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/lib/libtorchtext.so libc10.so
python3-torchtext.x86_64: E: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/lib/libtorchtext.so libtorch_cpu.so
python3-torchtext.x86_64: E: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/lib/libtorchtext.so.0.16 libc10.so
python3-torchtext.x86_64: E: unused-direct-shlib-dependency /usr/lib64/python3.12/site-packages/torchtext/lib/libtorchtext.so.0.16 libtorch_cpu.so
python3-torchtext.x86_64: W: undefined-non-weak-symbol /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so _ZTIN9torchtext5RegexE	(/usr/lib64/python3.12/site-packages/torchtext/_torchtext.so)
python3-torchtext.x86_64: W: undefined-non-weak-symbol /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so _ZTIN9torchtext14GPT2BPEEncoderE	(/usr/lib64/python3.12/site-packages/torchtext/_torchtext.so)

...hundreds more

These are explained in the %build section of the spec file; they are false positives.

 2 packages and 0 specfiles checked; 148 errors, 213 warnings, 17 filtered, 148 badness; has taken 0.6 s 
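
With this many messages, a per-check tally is easier to scan than the raw list. The sketch below shows one way to produce it in Python, assuming each message follows the `package.arch: SEVERITY: check-name details` layout seen in the lines above. The `tally_rpmlint` helper and its regular expression are illustrative only and are not part of rpmlint.

```python
import re
from collections import Counter

# Assumed message layout: "<package>.<arch>: <E|W|I>: <check-name> <details>"
MESSAGE_RE = re.compile(r"^(?P<package>\S+): (?P<severity>[EWI]): (?P<check>\S+)")


def tally_rpmlint(lines):
    """Count rpmlint messages per (severity, check name); skip other lines."""
    counts = Counter()
    for line in lines:
        match = MESSAGE_RE.match(line)
        if match:
            counts[(match["severity"], match["check"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python tally.py < rpmlint-output.txt
    for (severity, check), count in sorted(tally_rpmlint(sys.stdin).items()):
        print(f"{severity} {check}: {count}")
```

Grouping by check name makes it quick to confirm that the bulk of the output belongs to the same one or two checks rather than to many unrelated problems.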



Unversioned so-files
--------------------
python3-torchtext: /usr/lib64/python3.12/site-packages/torchtext/_torchtext.so
python3-torchtext: /usr/lib64/python3.12/site-packages/torchtext/lib/libtorchtext.so

This is common practice for Python projects.

Source checksums
----------------
https://github.com/pytorch/text/archive/refs/tags/v0.17.1.tar.gz#/text-0.17.1.tar.gz :
  CHECKSUM(SHA256) this package     : 1b21c1efb13072465bc11dbb7b80e8bdc3aca3cee9234242f57f0503f3db47f5
  CHECKSUM(SHA256) upstream package : 1b21c1efb13072465bc11dbb7b80e8bdc3aca3cee9234242f57f0503f3db47f5
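
The two digests above are identical, which is the condition this check looks for. As a rough illustration of the same comparison done by hand (not fedora-review's actual code), the Python sketch below hashes a local file and compares the result to an expected digest. The function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True when the file's digest equals expected_hex (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat regardless of tarball size. A typical call would pass the path of the downloaded source tarball together with the digest printed for the upstream package.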


Requires
--------
python3-torchtext (rpmlib, GLIBC filtered):
    libc.so.6()(64bit)
    libc10.so()(64bit)
    libdouble-conversion.so.3()(64bit)
    libgcc_s.so.1()(64bit)
    libgcc_s.so.1(GCC_3.0)(64bit)
    libgcc_s.so.1(GCC_3.3.1)(64bit)
    libre2.so.9()(64bit)
    libsentencepiece.so.0()(64bit)
    libsentencepiece_train.so.0()(64bit)
    libstdc++.so.6()(64bit)
    libstdc++.so.6(CXXABI_1.3)(64bit)
    libstdc++.so.6(CXXABI_1.3.11)(64bit)
    libstdc++.so.6(CXXABI_1.3.13)(64bit)
    libstdc++.so.6(CXXABI_1.3.15)(64bit)
    libstdc++.so.6(CXXABI_1.3.2)(64bit)
    libstdc++.so.6(CXXABI_1.3.3)(64bit)
    libstdc++.so.6(CXXABI_1.3.5)(64bit)
    libstdc++.so.6(CXXABI_1.3.9)(64bit)
    libtorch_cpu.so()(64bit)
    libtorch_python.so()(64bit)
    libtorchtext.so.0.16()(64bit)
    libutf8proc.so.2()(64bit)
    python(abi)
    python3.12dist(numpy)
    python3.12dist(requests)
    python3.12dist(torch)
    python3.12dist(torchdata)
    python3.12dist(tqdm)
    rtld(GNU_HASH)

python-torchtext-debugsource (rpmlib, GLIBC filtered):



Provides
--------
python3-torchtext:
    libtorchtext.so.0.16()(64bit)
    python-torchtext
    python3-torchtext
    python3-torchtext(x86-64)
    python3.12-torchtext
    python3.12dist(torchtext)
    python3dist(torchtext)

python-torchtext-debugsource:
    python-torchtext-debugsource
    python-torchtext-debugsource(x86-64)



Generated by fedora-review 0.10.0 (e79b66b) last change: 2023-07-24
Command line :/usr/bin/fedora-review -b 2255715
Buildroot used: fedora-rawhide-x86_64
Active plugins: Shell-api, Python, C/C++, Generic
Disabled plugins: Haskell, Java, R, fonts, PHP, SugarActivity, Ocaml, Perl
Disabled flags: EXARCH, EPEL6, EPEL7, DISTTAG, BATCH


Approved. Very nice work, Tom!

Comment 10 Fedora Admin user for bugzilla script actions 2024-03-28 13:44:00 UTC
The Pagure repository was created at https://src.fedoraproject.org/rpms/python-torchtext

