Bug 2280802 - Review Request: rust-scraper - HTML parsing and querying with CSS selectors
Summary: Review Request: rust-scraper - HTML parsing and querying with CSS selectors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: Package Review
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Fabio Valentini
QA Contact: Fedora Extras Quality Assurance
URL: https://crates.io/crates/scraper
Whiteboard:
Depends On: 2280801
Blocks:
Reported: 2024-05-16 10:25 UTC by Gustavo Costa
Modified: 2025-04-13 20:30 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-04-13 20:30:56 UTC
Type: ---
Embargoed:
decathorpe: fedora-review+


Attachments
The .spec file difference from Copr build 8371532 to 8379479 (885 bytes, patch)
2024-12-12 00:25 UTC, Fedora Review Service
The .spec file difference from Copr build 8379479 to 8863567 (584 bytes, patch)
2025-04-05 16:22 UTC, Fedora Review Service

Description Gustavo Costa 2024-05-16 10:25:33 UTC
Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.0-1.fc41.src.rpm

Description:
HTML parsing and querying with CSS selectors

Fedora Account System Username: xfgusta

Comment 1 Fedora Review Service 2024-05-16 11:16:40 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7449615
(failed)

Build log:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/07449615-rust-scraper/builder-live.log.gz

Please make sure the package builds successfully at least for Fedora Rawhide.

- If the build failed for unrelated reasons (e.g. temporary network
  unavailability), please ignore it.
- If the build failed because of missing BuildRequires, please make sure they
  are listed in the "Depends On" field


---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 2 Gustavo Costa 2024-05-16 11:33:31 UTC
Build failed because of missing crate(ego-tree). "Depends on" already added.

[fedora-review-service-build]

Comment 3 Fedora Review Service 2024-05-16 11:36:14 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7449643
(failed)

Build log:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/07449643-rust-scraper/builder-live.log.gz

Please make sure the package builds successfully at least for Fedora Rawhide.

- If the build failed for unrelated reasons (e.g. temporary network
  unavailability), please ignore it.
- If the build failed because of missing BuildRequires, please make sure they
  are listed in the "Depends On" field



Comment 4 Gustavo Costa 2024-05-16 11:58:25 UTC
Still failing for the same reason:

> Problem: nothing provides requested (crate(ego-tree/default) >= 0.6.2 with crate(ego-tree/default) < 0.7.0~)

Is it a bug on fedora-review-service? Anyway, here's a successful build on Copr: https://copr.fedorainfracloud.org/coprs/xfgusta/asciinema/build/7448106/
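
For context on the range in the message above: a Cargo requirement written as `0.6.2` is a caret requirement, which for a `0.x` version only allows patch-level updates, hence the `>= 0.6.2` / `< 0.7.0~` pair. A minimal sketch of that mapping, assuming plain `major.minor.patch` versions. The function names are hypothetical and the output format is only modelled on the quoted message, not taken from rust2rpm itself:

```python
def caret_upper_bound(version: str) -> str:
    """Return the exclusive upper bound of Cargo's caret requirement ^version."""
    major, minor, patch = (int(part) for part in version.split("."))
    if major > 0:
        # ^1.2.3 allows everything below the next major version.
        return f"{major + 1}.0.0"
    if minor > 0:
        # ^0.6.2 allows everything below the next minor version.
        return f"0.{minor + 1}.0"
    # ^0.0.3 allows only that exact patch version.
    return f"0.0.{patch + 1}"


def rpm_requirement(crate: str, version: str, feature: str = "default") -> str:
    """Render the bounds in the style of the dependency shown in the build log."""
    capability = f"crate({crate}/{feature})"
    upper = caret_upper_bound(version)
    # The trailing "~" makes the bound sort before the release itself,
    # which keeps pre-releases of the next version out of the range.
    return f"({capability} >= {version} with {capability} < {upper}~)"


print(rpm_requirement("ego-tree", "0.6.2"))
```

Calling it with `ego-tree` and `0.6.2` reproduces the requirement string from the failed build.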

Comment 5 Fabio Valentini 2024-05-16 12:06:20 UTC
> Is it a bug on fedora-review-service?

Yes.

> - If the build failed because of missing BuildRequires, please make sure they
>  are listed in the "Depends On" field

This is a lie - the functionality for taking "Depends on" metadata into account is not implemented yet.

Comment 6 Gustavo Costa 2024-05-17 22:56:16 UTC
I just noticed that upstream provides a man page for the 'scraper' binary. I've 
updated the spec and srpm files with this addition.

Comment 7 Fabio Valentini 2024-06-13 13:38:44 UTC
Are you actually interested in shipping the "scraper" binary or do you only need the Rust crate / library interface?

If it's just the latter, things would be easier if you configured rust2rpm to not build / ship the binary.

You could use this config snippet in rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Note that if you want to ship the executable too, you will need to rename it.
Another package already provides /usr/bin/scraper (perl-Web-Scraper).

rust2rpm.toml config files also provide a setting for renaming binaries when needed, please refer to the man page for more information.

Comment 8 Fabio Valentini 2024-10-13 20:16:06 UTC
Are you still interested in getting a review for this package?

Comment 9 Fabio Valentini 2024-10-15 12:37:29 UTC
This package now no longer builds on Rawhide because the ego-tree crate was updated to v0.9.0, while this package needs v0.6.*.
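
To illustrate why the update breaks the build: ego-tree 0.9.0 falls outside the `>= 0.6.2`, `< 0.7.0` range that the package's requirement resolves to. A rough sketch of the comparison, assuming plain numeric `major.minor.patch` versions with no pre-release tags. The helper names are made up for this example:

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn "0.9.0" into (0, 9, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)


# Range taken from the "nothing provides" message earlier in this ticket.
for candidate in ("0.6.2", "0.6.3", "0.9.0"):
    print(candidate, in_range(candidate, "0.6.2", "0.7.0"))
```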

Comment 10 Gustavo Costa 2024-10-15 13:41:27 UTC
Hi Fabio, sorry for the long delay in replying. I was feeling drained and took a break from my Fedora contributions. I only tried to keep some packages updated during this time, but I should have replied here as well.

> Are you actually interested in shipping the "scraper" binary or do you only need the Rust crate / library interface?

Only the library, for now. I'll remove the binary from the installation as you suggested.

> Are you still interested in getting a review for this package?

Yes. I'll need scraper as a dependency for the upcoming asciinema 3.0, which is written in Rust. However, asciinema development has slowed down, so I didn’t rush to get scraper reviewed.

> This package now no longer builds on Rawhide because the ego-tree crate was updated to v0.9.0

I definitely should have checked this before updating. It wasn't building because of html5ever 0.29.0 too; scraper uses 0.27.0. Some code changes will be needed. When I get home, I'll look into what can be done to fix it.

Comment 11 Fabio Valentini 2024-10-16 19:49:09 UTC
> It wasn't building because of html5ever 0.29.0 too; scraper uses 0.27.0.

I've ported some other packages for html5ever / xml5ever changes, the required code changes to support the latest version were usually very small, so I hope that will be the case here too.

Comment 12 Gustavo Costa 2024-12-09 23:59:28 UTC
Hi Fabio,

I was able to build scraper using the html5ever 0.26 compat package. Upstream updated html5ever to 0.29 along with other changes, and I couldn't figure out how to patch it. My goal with this package is to build asciinema 3.0, which I successfully did, so everything seems to be working fine.

I also updated the package to bump ego-tree from 0.6 to 0.9 and to stop installing the scraper binary.

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.0-1.fc42.src.rpm

Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126656366

rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Comment 13 Fedora Review Service 2024-12-10 05:19:03 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/8371532
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08371532-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.



Comment 14 Fabio Valentini 2024-12-10 13:13:09 UTC
Thanks for the update!

Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.

Comment 15 Gustavo Costa 2024-12-12 00:16:12 UTC
> Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
> And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.

I can update it to 0.20.0, anything higher than this would require cssparser 0.34.0 (current version on rawhide is 0.31)

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.20.0-1.fc42.src.rpm

Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126734877

Comment 16 Fedora Review Service 2024-12-12 00:25:35 UTC
Created attachment 2062132 [details]
The .spec file difference from Copr build 8371532 to 8379479

Comment 17 Fedora Review Service 2024-12-12 00:25:37 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/8379479
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08379479-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.



Comment 18 Fabio Valentini 2024-12-27 16:07:41 UTC
(In reply to Gustavo Costa from comment #15)
> > Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
> > And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.
> 
> I can update it to 0.20.0, anything higher than this would require cssparser
> 0.34.0 (current version on rawhide is 0.31)

I didn't mean whether you *can*, I meant whether asciinema would work with scraper newer than 0.19.

According to crates.io, it still depends on 0.19.x:
https://crates.io/crates/asciinema/3.0.0-rc.3/dependencies

Comment 19 Fabio Valentini 2025-03-29 22:56:00 UTC
Please update this ticket if you're still interested in this package.

Notably, asciinema 3.0.0-rc.3 is still the latest release, and still pulls in scraper 0.19.x, so it would likely be better to back out of the 0.20 update for now.

Comment 20 Gustavo Costa 2025-04-05 16:13:18 UTC
Hi,

I'm still interested in the package. Here's rust-scraper-0.19.1:

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.1-1.fc43.src.rpm

Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=131142568

rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Although 0.20.0 also works with the latest asciinema, I agree it's better to follow upstream and stick with 0.19.x. I can update it later if needed.

Thanks for the review.

Comment 21 Fedora Review Service 2025-04-05 16:22:37 UTC
Created attachment 2083532 [details]
The .spec file difference from Copr build 8379479 to 8863567

Comment 22 Fedora Review Service 2025-04-05 16:22:39 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/8863567
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08863567-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.



Comment 23 Fabio Valentini 2025-04-09 11:26:32 UTC
Thank you, this looks good to me now. Sorry that this has taken so long.

You can permanently store some modifications in rust2rpm.toml:

```
[package]
cargo-toml-patch-comments = [
    "Bump ego-tree dependency from 0.6.2 to 0.9.0",
    "Relax html5ever dependency from 0.27 to 0.26",
]
cargo-install-bin = false
```

I recommend that you commit this into dist-git upon importing. :)

===

Package was generated with rust2rpm, simplifying the review.

✅ package contains only permissible content
✅ package builds and installs without errors on rawhide
✅ test suite is run and all unit tests pass
🫤 latest version of the crate is packaged
   (0.19.1 is not the latest, but it is what is currently needed)
✅ license matches upstream specification and is acceptable for Fedora
✅ license file is included with %license in %files
✅ package complies with Rust Packaging Guidelines

Package APPROVED.

===

Recommended post-import rust-sig tasks:

- set up package on release-monitoring.org:
  project: $crate
  homepage: https://crates.io/crates/$crate
  backend: crates.io
  version scheme: semantic
  version filter (*NOT* pre-release filter): alpha;beta;rc;pre
  distro: Fedora
  Package: rust-$crate

- set bugzilla assignee overrides to @rust-sig (optional)
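
The version filter in the release-monitoring settings above is meant to keep pre-releases from being reported as new upstream versions. A small sketch of how such a filter could behave, assuming it is a `;`-separated list of substrings and that any version containing one of them is dropped. That matching rule is an assumption about release-monitoring.org, not a copy of its code, and the function name is hypothetical:

```python
def filter_versions(versions: list[str], version_filter: str) -> list[str]:
    """Drop every version that contains one of the filter substrings."""
    needles = [item for item in version_filter.split(";") if item]
    return [v for v in versions if not any(needle in v for needle in needles)]


# The pre-release entries are hypothetical, added only to show the effect.
upstream = ["0.19.1", "0.20.0", "0.22.0", "1.0.0-beta.2", "1.0.0-rc.1"]
print(filter_versions(upstream, "alpha;beta;rc;pre"))
```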

Comment 24 Fedora Admin user for bugzilla script actions 2025-04-13 19:44:57 UTC
The Pagure repository was created at https://src.fedoraproject.org/rpms/rust-scraper

Comment 25 Fedora Update System 2025-04-13 20:26:11 UTC
FEDORA-2025-a52badbf77 (rust-scraper-0.19.1-1.fc43) has been submitted as an update to Fedora 43.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-a52badbf77

Comment 26 Fedora Update System 2025-04-13 20:30:56 UTC
FEDORA-2025-a52badbf77 (rust-scraper-0.19.1-1.fc43) has been pushed to the Fedora 43 stable repository.
If the problem still persists, please make a note of it in this bug report.

