Bug 2028895 - Review Request: rust-tree-sitter - Rust bindings to the Tree-sitter parsing library
Summary: Review Request: rust-tree-sitter - Rust bindings to the Tree-sitter parsing l...
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Fabio Valentini
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 2028896 2028897 2028899 2028900
TreeView+ depends on / blocked
 
Reported: 2021-12-03 17:10 UTC by Aleksei Bavshin
Modified: 2022-03-10 05:40 UTC (History)
2 users (show)

Fixed In Version: rust-tree-sitter-0.20.2-1.fc37
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 05:40:35 UTC
Type: ---
Embargoed:
decathorpe: fedora-review+


Attachments (Terms of Use)

Description Aleksei Bavshin 2021-12-03 17:10:47 UTC
Spec URL: https://alebastr.fedorapeople.org/review/tree-sitter-cli/rust-tree-sitter.spec
SRPM URL: https://download.copr.fedorainfracloud.org/results/alebastr/rust-tree-sitter-cli/fedora-rawhide-x86_64/02999952-rust-tree-sitter/rust-tree-sitter-0.20.1-2.fc36.src.rpm
Copr build: https://copr.fedorainfracloud.org/coprs/alebastr/rust-tree-sitter-cli/build/2999952/
Description:
Rust bindings to the Tree-sitter parsing library.

Fedora Account System Username: alebastr
Note: generated with `rust2rpm -a`; added upstream license file as required by the MIT license

Comment 1 Fabio Valentini 2021-12-05 11:32:41 UTC
This crate seems to contain a bundled copy of tree-sitter C library? You will need to declare that in the .spec file somehow (and make sure the license is also acceptable).

Additionally, the Rust bindings are generated sources (with bindgen), which is not permissible in Fedora.

Generated sources must be regenerated at build time, but there is neither a build.rs script that supports doing that, nor can we be sure that the generated sources haven't been "tampered" with post-generation.

At any rate, using pre-generated sources from bindgen is a bad idea in general, because they encode architecture-specific details from the machine that bindgen was run on (i.e. running bindgen on x86_64 machine often produces bindings that are wrong or broken on 32-bit architectures, like i686 or arm).

You'll need to ask upstream for a way to regenerate those sources at build time (probably a `build.rs` script with Cargo feature flags to turn regenerating bindings on or off, or `/usr/bin/bindgen` invocation). You can look at other -sys crates how they handle this case (to make sure bindgen is run not only during the create build itself, but also when building the crate in dependent packages, regardless of chosen feature set).

Comment 2 Fabio Valentini 2021-12-05 11:45:42 UTC
Oh, additionally, we need to make sure the C code is actually compiled with Fedora CFLAGS and LDFLAGS. Not sure how to pass those through to the cc crate (it might read CFLAGS etc. from the environment correctly; if that's the case, then %set_build_flags in %build should do it, but that won't be applied in builds of dependent packages, so some other mechanism will need to be used.)

Comment 3 Aleksei Bavshin 2021-12-05 11:48:21 UTC
(In reply to Fabio Valentini from comment #1)
> This crate seems to contain a bundled copy of tree-sitter C library? You
> will need to declare that in the .spec file somehow (and make sure the
> license is also acceptable).

The license is not an issue, everything is coming from the same repository under a common MIT license.
Unbundling the library doesn't seem to be possible, as one of the provided features depends on compiling the library with a different set of macros.  I'll add `bundled(libtree-sitter) = %{version}`.
Btw, do you know if rust-cc is applying our recommended CFLAGS? Do I need to add anything extra for that?

> 
> Additionally, the Rust bindings are generated sources (with bindgen), which
> is not permissible in Fedora.
> 
> Generated sources must be regenerated at build time, but there is neither a
> build.rs script that supports doing that, nor can we be sure that the
> generated sources haven't been "tampered" with post-generation.
> 
> At any rate, using pre-generated sources from bindgen is a bad idea in
> general, because they encode architecture-specific details from the machine
> that bindgen was run on (i.e. running bindgen on x86_64 machine often
> produces bindings that are wrong or broken on 32-bit architectures, like
> i686 or arm).
> 
> You'll need to ask upstream for a way to regenerate those sources at build
> time (probably a `build.rs` script with Cargo feature flags to turn
> regenerating bindings on or off, or `/usr/bin/bindgen` invocation). You can
> look at other -sys crates how they handle this case (to make sure bindgen is
> run not only during the create build itself, but also when building the
> crate in dependent packages, regardless of chosen feature set).

Uh... no, those were pre-generated 3 years ago. I'm not sure we can still call it pre-generated after 3 years of manual edits :(
Which also means there's no way to add bindgen to the upstream build process.

Comment 4 Aleksei Bavshin 2021-12-08 06:25:50 UTC
Spec URL: https://alebastr.fedorapeople.org/review/tree-sitter-cli/rust-tree-sitter.spec
SRPM URL: https://download.copr.fedorainfracloud.org/results/alebastr/rust-tree-sitter-cli/fedora-rawhide-x86_64/03012110-rust-tree-sitter/rust-tree-sitter-0.20.1-3.fc36.src.rpm

Copr Build: https://copr.fedorainfracloud.org/coprs/alebastr/rust-tree-sitter-cli/build/3012110/
Exploded view with patches: https://copr-dist-git.fedorainfracloud.org/cgit/alebastr/rust-tree-sitter-cli/rust-tree-sitter.git/tree/?id=459c0decd4555b1153bf1f791a720cf20d8af48a

Changes:
- Added the `generate-bindings` feature and a patch to make it default (upstream PR is sent[1], waiting for a feedback).
  Invoking /usr/bin/bindgen doesn't really work, as it will create the bindings for a build machine arch, and our rust library packages are all noarch.
- Added %%set_build_flags to verify that the C code compiles with recommended flags.
- Added bundled(tree-sitter) to the `devel` subpackage.

I run upstream unit tests on all our supported arches (except of s390x; LLVM keeps crashing in the copr chroot), and everything passed.
***

> Generated sources must be regenerated at build time.

Actually, no. The Guidelines only *suggest* that[1]. We have a stricter policy for Python, but not for Rust.

[1]: https://github.com/tree-sitter/tree-sitter/pull/1524
[2]: https://docs.fedoraproject.org/en-US/packaging-guidelines/what-can-be-packaged/#_pregenerated_code

Comment 5 Fabio Valentini 2021-12-12 11:31:32 UTC
Thanks, the PR for upstream looks good.

The PR for the rust macros to always set CFLAGS etc. has also been merged, I think we can expect a new release with that functionality soon.

Comment 6 Fabio Valentini 2022-02-22 15:54:29 UTC
Can you update the package to the latest version?
With rust2rpm 21, everything should be ready otherwise, I think ...

Comment 7 Aleksei Bavshin 2022-02-24 06:16:34 UTC
Spec URL: https://alebastr.fedorapeople.org/review/tree-sitter-cli/rust-tree-sitter.spec
SRPM URL: https://download.copr.fedorainfracloud.org/results/alebastr/rust-tree-sitter-cli/fedora-rawhide-x86_64/03540803-rust-tree-sitter/rust-tree-sitter-0.20.2-1.fc37.src.rpm

0.20.2 is not the latest version, but 0.20.3+ generates parser code incompatible with tree-sitter-0.20.1 we have in rawhide and fails the test suite. I'll have to coordinate further updates with the C library package.

Comment 8 Fabio Valentini 2022-02-24 14:38:55 UTC
Looks good to me, with one exception:
I just noticed the included tree-sitter C library code bundles parts of icu,
so you'll need to add "Provides: bundled(libicu) = version" to the -devel package, as well.

The ICU license file is already included, but you'll need to fix the license tag to incorporate the license of the bundled ICU sources, which seems to be a version of the "Unicode" license: https://fedoraproject.org/wiki/Licensing/Unicode (so it should probably be "License: MIT and Unicode").

Comment 9 Fabio Valentini 2022-02-24 15:40:18 UTC
Package was generated with rust2rpm, simplifying the review.

- package builds and installs without errors on rawhide
- test suite is run and all unit tests pass (there are zero tests)
- latest (possible) version of the crate is packaged
- license matches upstream specification (MIT + Unicode) and is acceptable for Fedora
- license files are included with %license in %files
- package complies with Rust Packaging Guidelines

Package APPROVED.

===

Recommended post-import rust-sig tasks:

- add @rust-sig with "commit" access as package co-maintainer

- set bugzilla assignee overrides to @rust-sig (optional)

- set up package on release-monitoring.org:
  project: $crate
  homepage: https://crates.io/crates/$crate
  backend: crates.io
  version scheme: semantic
  version filter: alpha;beta;rc;pre
  distro: Fedora
  Package: rust-$crate

- track package in koschei for all built branches

Comment 10 Gwyn Ciesla 2022-02-24 20:45:48 UTC
(fedscm-admin):  The Pagure repository was created at https://src.fedoraproject.org/rpms/rust-tree-sitter


Note You need to log in before you can comment on or make changes to this bug.