Bug 2280802
| Field | Value |
|---|---|
| Summary | Review Request: rust-scraper - HTML parsing and querying with CSS selectors |
| Product | [Fedora] Fedora |
| Reporter | Gustavo Costa <xfgusta> |
| Component | Package Review |
| Assignee | Fabio Valentini <decathorpe> |
| Status | CLOSED ERRATA |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| Severity | medium |
| Priority | medium |
| Version | rawhide |
| CC | decathorpe, package-review |
| Keywords | AutomationTriaged |
| Flags | decathorpe: fedora-review+ |
| Hardware | All |
| OS | Linux |
| URL | https://crates.io/crates/scraper |
| Last Closed | 2025-04-13 20:30:56 UTC |
| Bug Depends On | 2280801 |

Description
Gustavo Costa
2024-05-16 10:25:33 UTC
Copr build: https://copr.fedorainfracloud.org/coprs/build/7449615 (failed)
Build log: https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/07449615-rust-scraper/builder-live.log.gz

Please make sure the package builds successfully at least for Fedora Rawhide.
- If the build failed for unrelated reasons (e.g. temporary network unavailability), please ignore it.
- If the build failed because of missing BuildRequires, please make sure they are listed in the "Depends On" field

---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new Spec and SRPM URLs or [fedora-review-service-build] string.

Build failed because of missing crate(ego-tree). "Depends on" already added.

[fedora-review-service-build]

Copr build: https://copr.fedorainfracloud.org/coprs/build/7449643 (failed)
Build log: https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/07449643-rust-scraper/builder-live.log.gz

Still failing for the same reason:

> Problem: nothing provides requested (crate(ego-tree/default) >= 0.6.2 with crate(ego-tree/default) < 0.7.0~)

Is it a bug on fedora-review-service?
Anyway, here's a successful build on Copr: https://copr.fedorainfracloud.org/coprs/xfgusta/asciinema/build/7448106/

> Is it a bug on fedora-review-service?

Yes.

> - If the build failed because of missing BuildRequires, please make sure they
> are listed in the "Depends On" field

This is a lie - the functionality for taking into account "Depends on" metadata is not implemented yet.

I just noticed that upstream provides a man page for the 'scraper' binary. I've updated the spec and srpm files with this addition.

Are you actually interested in shipping the "scraper" binary or do you only need the Rust crate / library interface? If it's just the latter, things would be easier if you configured rust2rpm to not build / ship the binary. You could use this config snippet in rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Note that if you want to ship the executable too, you will need to rename it. Another package already provides /usr/bin/scraper (perl-Web-Scraper). rust2rpm.toml config files also provide a setting for renaming binaries when needed; please refer to the man page for more information.

Are you still interested in getting a review for this package?

This package now no longer builds on Rawhide because the ego-tree crate was updated to v0.9.0, while this package needs v0.6.*.

Hi Fabio, sorry for the long delay in replying. I was feeling drained and took a break from my Fedora contributions. I only tried to keep some packages updated during this time, but I should have replied here as well.

> Are you actually interested in shipping the "scraper" binary or do you only need the Rust crate / library interface?

Only the library, for now. I'll remove the binary from the installation as you suggested.

> Are you still interested in getting a review for this package?

Yes. I'll need scraper as a dependency for the upcoming asciinema 3.0, which is written in Rust. However, asciinema development has slowed down, so I didn’t rush to get scraper reviewed.

> This package now no longer builds on Rawhide because the ego-tree crate was updated to v0.9.0

I definitely should have checked this before updating. It wasn't building because of html5ever 0.29.0 too; scraper uses 0.27.0. Some code changes will be needed. When I get home, I'll look into what can be done to fix it.

> It wasn't building because of html5ever 0.29.0 too; scraper uses 0.27.0.

I've ported some other packages for html5ever / xml5ever changes, the required code changes to support the latest version were usually very small, so I hope that will be the case here too.
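
For illustration, here is a minimal Python sketch of why the version bumps discussed above break the build. It assumes Cargo's default behaviour of treating a bare requirement such as `0.6.2` as a caret requirement, which is what produces the `>= 0.6.2` / `< 0.7.0~` range in the error message. The helper names `caret_bounds` and `satisfies` are hypothetical, and the sketch ignores pre-release versions and the RPM `~` suffix.

```python
def caret_bounds(requirement):
    """Return (lower, upper) version tuples for a bare Cargo requirement.

    Hypothetical helper: models only plain numeric versions, no pre-releases.
    The upper bound bumps the leftmost non-zero component that was given.
    """
    given = [int(part) for part in requirement.split(".")]
    lower = tuple(given + [0] * (3 - len(given)))
    # Leftmost non-zero component; if every given component is zero,
    # fall back to the last component that was written.
    idx = next((i for i, value in enumerate(given) if value != 0), len(given) - 1)
    upper = [0, 0, 0]
    upper[:idx] = given[:idx]
    upper[idx] = given[idx] + 1
    return lower, tuple(upper)


def satisfies(version, requirement):
    """Check whether a plain x.y.z version falls inside the caret range."""
    candidate = tuple(int(part) for part in version.split("."))
    lower, upper = caret_bounds(requirement)
    return lower <= candidate < upper


if __name__ == "__main__":
    # (packaged version, requirement declared by scraper) pairs from this thread
    for version, requirement in [("0.9.0", "0.6.2"), ("0.29.0", "0.27"), ("0.26.0", "0.27")]:
        print(f"{version} satisfies {requirement}: {satisfies(version, requirement)}")
```

Under these assumptions none of the three pairs match, which is consistent with the package needing its ego-tree and html5ever requirements patched before it can build against what is available in Rawhide.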

Hi Fabio, I was able to build scraper using the html5ever 0.26 compat package. Upstream updated html5ever to 0.29 along with other changes, and I couldn't figure out how to patch it. My goal with this package is to build asciinema 3.0, which I successfully did, so everything seems to be working fine.

I updated the package to bump ego-tree from 0.6 to 0.9 as well as to remove the scraper binary from installation.

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.0-1.fc42.src.rpm
Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126656366

rust2rpm.toml:
```
[package]
cargo-install-bin = false
```

Copr build: https://copr.fedorainfracloud.org/coprs/build/8371532 (succeeded)
Review template: https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08371532-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.

Thanks for the update! Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0. And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.

> Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
> And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.

I can update it to 0.20.0; anything higher than this would require cssparser 0.34.0 (current version on rawhide is 0.31).

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.20.0-1.fc42.src.rpm
Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126734877

Created attachment 2062132
The .spec file difference from Copr build 8371532 to 8379479

Copr build: https://copr.fedorainfracloud.org/coprs/build/8379479 (succeeded)
Review template: https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08379479-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.

(In reply to Gustavo Costa from comment #15)
> > Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
> > And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.
>
> I can update it to 0.20.0, anything higher than this would require cssparser
> 0.34.0 (current version on rawhide is 0.31)

I didn't mean whether you *can*, I meant whether asciinema would work with scraper newer than 0.19. According to crates.io, it still depends on 0.19.x:
https://crates.io/crates/asciinema/3.0.0-rc.3/dependencies

Please update this ticket if you're still interested in this package. Notably, asciinema 3.0.0-rc.3 is still the latest release, and still pulls in scraper 0.19.x, so it would likely be better to back out of the 0.20 update for now.

Hi, I'm still interested in the package. Here's rust-scraper-0.19.1:

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.1-1.fc43.src.rpm
Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=131142568

rust2rpm.toml:
```
[package]
cargo-install-bin = false
```

Although 0.20.0 also works with the latest asciinema, I agree it's better to follow upstream and stick with 0.19.x. I can update it later if needed. Thanks for the review.

Created attachment 2083532
The .spec file difference from Copr build 8379479 to 8863567

Copr build: https://copr.fedorainfracloud.org/coprs/build/8863567 (succeeded)
Review template: https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08863567-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.

Thank you, this looks good to me now. Sorry that this has taken so long.

You can permanently store some modifications in rust2rpm.toml:

```
[package]
cargo-toml-patch-comments = [
    "Bump ego-tree dependency from 0.6.2 to 0.9.0",
    "Relax html5ever dependency from 0.27 to 0.26",
]
cargo-install-bin = false
```

I recommend that you commit this into dist-git upon importing. :)

===

Package was generated with rust2rpm, simplifying the review.

✅ package contains only permissible content
✅ package builds and installs without errors on rawhide
✅ test suite is run and all unit tests pass
🫤 latest version of the crate is packaged (0.19.1 is not the latest, but it is what is currently needed)
✅ license matches upstream specification and is acceptable for Fedora
✅ license file is included with %license in %files
✅ package complies with Rust Packaging Guidelines

Package APPROVED.

===

Recommended post-import rust-sig tasks:

- set up package on release-monitoring.org:
  project: $crate
  homepage: https://crates.io/crates/$crate
  backend: crates.io
  version scheme: semantic
  version filter (*NOT* pre-release filter): alpha;beta;rc;pre
  distro: Fedora
  Package: rust-$crate

- set bugzilla assignee overrides to @rust-sig (optional)

The Pagure repository was created at https://src.fedoraproject.org/rpms/rust-scraper

FEDORA-2025-a52badbf77 (rust-scraper-0.19.1-1.fc43) has been submitted as an update to Fedora 43.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-a52badbf77

FEDORA-2025-a52badbf77 (rust-scraper-0.19.1-1.fc43) has been pushed to the Fedora 43 stable repository.
If problem still persists, please make note of it in this bug report.