Bug 1984419 - Review Request: python-charset-normalizer - The Real First Universal Charset Detector
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Miro Hrončok
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1981856
Reported: 2021-07-21 12:01 UTC by Lumír Balhar
Modified: 2021-07-23 14:31 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2021-07-23 14:31:08 UTC
Type: ---
Embargoed:
mhroncok: fedora-review+



Description Lumír Balhar 2021-07-21 12:01:18 UTC
Spec URL: https://lbalhar.fedorapeople.org/python-charset-normalizer.spec
SRPM URL: https://lbalhar.fedorapeople.org/python-charset-normalizer-2.0.3-1.fc34.src.rpm
Description: A library that helps you read text from an unknown charset encoding.
Motivated by chardet, I'm trying to resolve the issue by taking
a new approach. All IANA character set names for which the Python core
library provides codecs are supported.
Fedora Account System Username: lbalhar

This is a new dependency of python-requests. I'm building this new package, python-requests and all dependent packages in https://copr.fedorainfracloud.org/coprs/lbalhar/requests/builds/

Comment 1 Miro Hrončok 2021-07-21 13:42:02 UTC
Spec sanity:

> BuildRequires:  python3dist(pytest-cov)

Please don't use pytest-cov in %check unless it is hideously complicated to avoid.

See https://docs.fedoraproject.org/en-US/packaging-guidelines/Python/#_linters

`rm setup.cfg` in %prep should do for now; a more complex sed would be more future-proof.

We would need to remove it from RHEL 10 anyway, so better do it straight away.
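
A minimal sketch of the sed approach, written as shell commands of the kind that could go in %prep. It assumes the coverage flags sit on an `addopts` line in setup.cfg; the sample file contents below are made up for illustration and may not match the actual upstream file.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical setup.cfg for illustration only; the real upstream file may differ.
cat > "$workdir/setup.cfg" <<'EOF'
[tool:pytest]
addopts = --cov=charset_normalizer --cov-report=term-missing -rxXs
EOF

# Drop every --cov... option, keeping the section and the other pytest options.
sed -i -E 's/ ?--cov[^ ]*//g' "$workdir/setup.cfg"

cat "$workdir/setup.cfg"
```

Unlike `rm setup.cfg`, this keeps any unrelated settings in the file, at the cost of depending on how upstream spells the options.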



> Motivated by chardet, I'm trying to resolve the issue by taking a new approach.

I would not use "I" in package descriptions, it sounds weird.





Looks good otherwise, straight and simple specfile, thanks.

Comment 2 Lumír Balhar 2021-07-22 05:48:02 UTC
(In reply to Miro Hrončok from comment #1)
> `rm setup.cfg` in %prep should do for now, a more complex sed is more future
> proof.

Removing the file seemed too drastic to me, and I was afraid that an empty section would cause problems, but it seems to work fine. Fixed.

> I would not use "I" in package descriptions, it sounds weird.

Fixed.

Comment 3 Miro Hrončok 2021-07-22 08:21:14 UTC
Ack. Result of fedora-review (based on the previous version, but should not differ now).

Package Review
==============

Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated


python3-charset-normalizer.noarch: W: wrong-file-end-of-line-encoding /usr/share/doc/python3-charset-normalizer/README.md

Please convert the file to Linux line endings, and preferably also suggest this upstream.
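
One way to do the conversion, sketched as shell commands that could run in %prep: strip the trailing carriage return from each line with sed. The temporary file below stands in for README.md and is an assumption for illustration; dos2unix would be an alternative if it is available in the buildroot.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Stand-in for the upstream README.md with CRLF line endings (illustrative content).
printf 'Charset Normalizer\r\nSecond line\r\n' > "$workdir/README.md"

# Remove the carriage return at the end of every line (CRLF -> LF).
sed -i 's/\r$//' "$workdir/README.md"

# Count the lines that still contain a carriage return.
cr=$(printf '\r')
remaining=$(grep -c "$cr" "$workdir/README.md" || true)
echo "lines with CR left: $remaining"
```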



===== MUST items =====

Generic:
[x]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[x]: License field in the package spec file matches the actual license.
     Note: Checking patched sources after %prep for licenses. Licenses
     found: "Unknown or generated", "Expat License", "*No copyright* Expat
     License".
[x]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[x]: Sources contain only permissible code or content.
[-]: Package contains desktop file if it is a GUI application.
[-]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory
     names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[-]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[x]: Requires correct, justified where necessary.
[x]: Spec file is legible and written in American English.
[-]: Package contains systemd file(s) if in need.
[x]: Package is not known to require an ExcludeArch tag.
[-]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
     Note: Documentation size is 20480 bytes in 1 files.
[x]: Package complies to the Packaging Guidelines
[x]: Package successfully compiles and builds into binary rpms on at least
     one supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
     Note: There are rpmlint messages (see attachment).
[x]: If (and only if) the source package includes the text of the
     license(s) in its own file, then that file, containing the text of the
     license(s) for the package is included in %license.
[x]: Package requires other packages for directories it uses.
[x]: Package must own all directories that it creates.
[x]: Package does not own files or directories owned by other packages.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Dist tag is present.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package must not depend on deprecated() packages.
[x]: Package use %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists.
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as
     provided in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Packages must not store files under /srv, /opt or /usr/local

Python:
[x]: Python eggs must not download any dependencies during the build
     process.
[x]: A package which is used by another package via an egg interface should
     provide egg info.
[?]: Package meets the Packaging Guidelines::Python -- will followup
[x]: Package contains BR: python2-devel or python3-devel
[x]: Packages MUST NOT have dependencies (either build-time or runtime) on
     packages named with the unversioned python- prefix unless no properly
     versioned package exists. Dependencies on Python packages instead MUST
     use names beginning with python2- or python3- as appropriate.
[x]: Python packages must not contain %{pythonX_site(lib|arch)}/* in %files
[x]: Binary eggs must be removed in %prep

===== SHOULD items =====

Generic:
[-]: If the source package does not include license text(s) as a separate
     file from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[?]: Package functions as described.
[?]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[-]: Sources are verified with gpgverify first in %prep if upstream
     publishes signatures.
[-]: Description and summary sections in the package spec file contains
     translations for supported Non-English languages, if available.
[?]: Package should compile and build into binary rpms on all supported
     architectures.
[x]: %check is present and all tests pass.
[?]: Packages should try to preserve timestamps of original installed
     files.
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Sources can be downloaded from URI in Source: tag
[x]: SourceX is a working URL.
[x]: Spec use %global instead of %define unless justified.

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Spec file according to URL is the same as in SRPM.


Rpmlint
-------
Checking: python3-charset-normalizer-2.0.3-1.fc35.noarch.rpm
          python-charset-normalizer-2.0.3-1.fc35.src.rpm
python3-charset-normalizer.noarch: W: spelling-error %description -l en_US chardet -> charted, charade, chard
python3-charset-normalizer.noarch: W: wrong-file-end-of-line-encoding /usr/share/doc/python3-charset-normalizer/README.md
python3-charset-normalizer.noarch: W: no-manual-page-for-binary normalizer
python-charset-normalizer.src: W: spelling-error %description -l en_US chardet -> charted, charade, chard
2 packages and 0 specfiles checked; 0 errors, 4 warnings.



Source checksums
----------------
https://github.com/ousret/charset_normalizer/archive/refs/tags/2.0.3.tar.gz :
  CHECKSUM(SHA256) this package     : 05b8a936fda245b2b68f20cd65a0b15ed7b70d7c268dfade15d2294c63efdd51
  CHECKSUM(SHA256) upstream package : 05b8a936fda245b2b68f20cd65a0b15ed7b70d7c268dfade15d2294c63efdd51


Requires
--------
python3-charset-normalizer (rpmlib, GLIBC filtered):
    /usr/bin/python3
    python(abi)



Provides
--------
python3-charset-normalizer:
    python-charset-normalizer
    python3-charset-normalizer
    python3.10-charset-normalizer
    python3.10dist(charset-normalizer)
    python3dist(charset-normalizer)



Generated by fedora-review 0.7.0 (fed5495) last change: 2019-03-17
Command line :try-fedora-review -b 1984419 -m fedora-rawhide-x86_64 --mock-options=--enablerepo=local
Buildroot used: fedora-rawhide-x86_64
Active plugins: Generic, Python, Shell-api
Disabled plugins: C/C++, R, Haskell, Java, Ruby, fonts, SugarActivity, PHP, Ocaml, Perl
Disabled flags: EPEL6, EPEL7, DISTTAG, BATCH, EXARCH

Comment 4 Miro Hrončok 2021-07-22 08:34:07 UTC
New Python packaging guidelines:

Open questions:

[?] Packages that primarily provide applications, services or any kind of executables SHOULD be named according to the general Fedora naming guidelines (e.g. ansible).

 I guess this is not primarily an application despite having one, right? Would users try to dnf install normalizer?



The rest is fine:

[x] Every package that uses Python (at runtime and/or build time) and/or installs Python modules MUST explicitly include BuildRequires: python3-devel in its .spec file, even if Python is not actually invoked during build time.

[-] If the package uses an alternate Python interpreter instead of python3 (e.g. pypy, jython, python2.7), it MAY instead require the corresponding *-devel package.

[x] The following macros MUST be used where applicable...

[x] Fedora packages MUST NOT depend on other versions of the CPython interpreter than the current python3.

[-] The character + in names of built (i.e. non-SRPM) packages that include .dist-info or .egg-info directories is reserved for Extras and MUST NOT be used for any other purpose.

[x] A built (i.e. non-SRPM) package for a Python library MUST be named with the prefix python3-. A source package containing primarily a Python library MUST be named with the prefix python-.

[x] The Fedora package’s name SHOULD contain the Canonical project name. If possible, the project name SHOULD be the same as the name of the main importable module, with underscores (_) replaced by dashes (-).

[-] If the importable module name and the project name do not match, users frequently end up confused. In this case, packagers SHOULD ensure that upstream is aware of the problem and (especially for new packages where renaming is feasible) strive to get the package renamed. The Python SIG is available for assistance.

[x] Packages MUST include the source file (*.py) AND the bytecode cache (*.pyc) for each pure-Python importable module. The source files MUST be included in the same package as the bytecode cache.

[-] Scripts that are not importable (typically ones in %{_bindir} or %{_libexecdir}) SHOULD NOT be byte-compiled.

[x] Each Python package MUST include Package Distribution Metadata conforming to PyPA specifications (specifically, Recording installed distributions).

[x] The metadata SHOULD be included in the same subpackage as the main importable module, if there is one.

[x] Packages MUST NOT own shared directories owned by Python itself, such as the top-level __pycache__ directories (%{python3_sitelib}/__pycache__, %{python3_sitearch}/__pycache__).

[x] Packagers SHOULD NOT simply glob everything under a shared directory.

[x] Every Python package in Fedora SHOULD also be available on the Python Package Index (PyPI).

[x] The command pip install PROJECTNAME MUST install the same package (possibly in a different version), install nothing, or fail with a reasonable error message.

[x] For any module intended to be used in Python 3 with import MODNAME, the package that includes it SHOULD provide python3-MODNAME, with underscores (_) replaced by dashes (-).

[x] For any FOO, a package that provides python3-FOO SHOULD use %py_provides or an automatic generator to also provide python-FOO and python3.X-FOO, where X is the minor version of the interpreter.

[x] Every Python package MUST provide python3dist(DISTNAME) and python3.Xdist(DISTNAME), where X is the minor version of the interpreter and DISTNAME is the Canonical project name corresponding to the Dist-info metadata. For example, python3-django would provide python3dist(django) and python3.9dist(django).

[x] As mentioned above, each Python package MUST explicitly BuildRequire python3-devel.

[x] Packages MUST NOT have dependencies (either build-time or runtime) with the unversioned prefix python- if the corresponding python3- dependency can be used instead.

[x] Packages SHOULD NOT have explicit dependencies (either build-time or runtime) with a minor-version prefix such as python3.8- or python3.8dist(. Such dependencies SHOULD instead be automatically generated or a macro should be used to get the version.

[x] Packages SHOULD NOT have an explicit runtime dependency on python3.

[x] Packages MUST use the automatic Python run-time dependency generator.

[x] Packages SHOULD use the opt-in build-dependency generator if possible.

[x] The packager MUST inspect the generated requires for correctness. All dependencies MUST be resolvable within the targeted Fedora version.

[-] Any necessary changes MUST be done by patches or modifying the source (e.g. with sed), rather than disabling the generator. The resulting change SHOULD be offered to upstream. As an exception, filtering MAY be used for temporary workarounds and bootstrapping.

[x] Dependencies covered by the generators SHOULD NOT be repeated in the .spec file. (For example, if the generator finds a requests dependency, then Requires: python3-requests is redundant.)

[x] Python packages SHOULD have Provides for all extras the upstream project specifies, except...

 The [unicode_backport] extra is not possible: python-unicodedata2 is not in Fedora.

[-] A package that provides a Python extra MUST provide python3dist(DISTNAME[EXTRA]) and python3.Xdist(DISTNAME[EXTRA]), where X is the minor version of the interpreter, DISTNAME is the Canonical project name, and EXTRA is the name of a single extra. For example, python3.9dist(requests[security]). These requirements SHOULD be generated using the automatic dependency generator.

[-] A package that provides a Python extra MUST require the extra’s main package with exact NEVR.

[-] A subpackage that primarily provides one Python extra SHOULD be named by appending + and the extra name to the main package name. For example, python3-requests+security.

[-] If an existing extra is removed from an upstream project, the Fedora maintainer SHOULD try to convince upstream to re-introduce it (with an empty list of dependencies). If that fails, the extra SHOULD be Obsoleted from either the main package or another extras subpackage.

[x] Shebang lines to invoke Python MUST use %{python3} as the interpreter.

  The shebang is #! /usr/bin/python3 -s

[x] Shebang lines to invoke Python SHOULD be #!%{python3} -%{py3_shebang_flags} and they MAY include extra flags.

  The shebang is #! /usr/bin/python3 -s

[-] If the default flags from the %{py3_shebang_flags} macro are not desirable, packages SHOULD explicitly redefine the macro to remove them.

[-] Every executable TOOL for which the current version of Python matters SHOULD also be invokable by python3 -m TOOL.
    If the software doesn’t provide this functionality, packagers SHOULD ask the upstream to add it.

[-] Tightening the general Fedora policy, packages MUST NOT use files pre-generated by Cython. These MUST be deleted in %prep and regenerated during the build.

[x] If a test suite exists upstream, it SHOULD be run in the %check section. If that is not possible with reasonable effort, at least a basic smoke test (such as importing the packaged module) MUST be run in %check.

[-] You MAY exclude specific failing tests. You MUST NOT disable the entire testsuite or ignore its result to solve a build failure.

[-] As an exception, you MAY disable tests with an appropriate %if conditional (e.g. bcond) when bootstrapping.

[x] In %check, packages SHOULD NOT run “linters”: code style checkers, test coverage checkers and other tools that check code quality rather than functionality.

[-] However, packages SHOULD NOT use an archive that omits test suites, licenses and/or documentation present in other source archives.

Comment 5 Miro Hrončok 2021-07-22 08:35:16 UTC
tl;dr:


1) Fix end of lines of README.
2) Consider providing "normalizer".



When you do (1), the package is APPROVED.

Comment 6 Miro Hrončok 2021-07-22 08:37:26 UTC
One extra thing I realized after the review:

The package currently has no runtime requires, but it might have some in the future. Since you run tests in %check, I recommend using %pyproject_buildrequires with -r (for runtime). That way, missing runtime requirements won't fail the tests in weird ways in the future, and you'll also get a clearer build failure if a new dependency is not available in Fedora yet.

Comment 7 Lumír Balhar 2021-07-22 10:45:53 UTC
>  I guess this is not primarily an application despite having one, right?
> Would users try to dnf install normalizer?

Your guess is correct. Because this package is a replacement for chardet, which also contains a somewhat hidden application, I've decided to package it in the same way.

> 1) Fix end of lines of README.

Done in spec and reported upstream: https://github.com/Ousret/charset_normalizer/issues/66

> The package currently has no runtime requires. But it might have some in the
> future. And since you run tests in %check, I recommend using
> %pyproject_buildrequires with -r (for runtime). That way, missing runtime
> requirements  won't fail the tests in weird ways in the future and you'll
> also have a better build failure in case the new dependency is not available
> in Fedora yet.

Fixed.

Thanks for the review!

Comment 8 Gwyn Ciesla 2021-07-22 13:54:14 UTC
(fedscm-admin):  The Pagure repository was created at https://src.fedoraproject.org/rpms/python-charset-normalizer

Comment 9 Lumír Balhar 2021-07-23 14:31:08 UTC
Koschei and release monitoring are activated; python-sig is a co-maintainer.

https://bodhi.fedoraproject.org/updates/FEDORA-2021-f2524d1afc

