Bug 2280802

Summary: Review Request: rust-scraper - HTML parsing and querying with CSS selectors
Product: [Fedora] Fedora
Reporter: Gustavo Costa <xfgusta>
Component: Package Review
Assignee: Fabio Valentini <decathorpe>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: decathorpe, package-review
Target Milestone: ---
Keywords: AutomationTriaged
Target Release: ---
Flags: decathorpe: fedora-review+
Hardware: All
OS: Linux
URL: https://crates.io/crates/scraper
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2025-04-13 20:30:56 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2280801
Bug Blocks:
Attachments:
Description Flags
The .spec file difference from Copr build 8371532 to 8379479 none
The .spec file difference from Copr build 8379479 to 8863567 none

Description Gustavo Costa 2024-05-16 10:25:33 UTC
Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.0-1.fc41.src.rpm

Description:
HTML parsing and querying with CSS selectors

Fedora Account System Username: xfgusta

Comment 1 Fedora Review Service 2024-05-16 11:16:40 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7449615
(failed)

Build log:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/07449615-rust-scraper/builder-live.log.gz

Please make sure the package builds successfully at least for Fedora Rawhide.

- If the build failed for unrelated reasons (e.g. temporary network
  unavailability), please ignore it.
- If the build failed because of missing BuildRequires, please make sure they
  are listed in the "Depends On" field


---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 2 Gustavo Costa 2024-05-16 11:33:31 UTC
Build failed because of missing crate(ego-tree). "Depends on" already added.

[fedora-review-service-build]

Comment 3 Fedora Review Service 2024-05-16 11:36:14 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7449643
(failed)

Build log:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/07449643-rust-scraper/builder-live.log.gz

Comment 4 Gustavo Costa 2024-05-16 11:58:25 UTC
Still failing for the same reason:

> Problem: nothing provides requested (crate(ego-tree/default) >= 0.6.2 with crate(ego-tree/default) < 0.7.0~)

Is it a bug on fedora-review-service? Anyway, here's a successful build on Copr: https://copr.fedorainfracloud.org/coprs/xfgusta/asciinema/build/7448106/

Comment 5 Fabio Valentini 2024-05-16 12:06:20 UTC
> Is it a bug on fedora-review-service?

Yes.

> - If the build failed because of missing BuildRequires, please make sure they
>  are listed in the "Depends On" field

This is a lie - the functionality for taking "Depends on" metadata into account is not implemented yet.

Comment 6 Gustavo Costa 2024-05-17 22:56:16 UTC
I just noticed that upstream provides a man page for the 'scraper' binary. I've updated the spec and SRPM files with this addition.

Comment 7 Fabio Valentini 2024-06-13 13:38:44 UTC
Are you actually interested in shipping the "scraper" binary or do you only need the Rust crate / library interface?

If it's just the latter, things would be easier if you configured rust2rpm to not build / ship the binary.

You could use this config snippet in rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Note that if you want to ship the executable too, you will need to rename it.
Another package already provides /usr/bin/scraper (perl-Web-Scraper).

rust2rpm.toml config files also provide a setting for renaming binaries when needed, please refer to the man page for more information.

Comment 8 Fabio Valentini 2024-10-13 20:16:06 UTC
Are you still interested in getting a review for this package?

Comment 9 Fabio Valentini 2024-10-15 12:37:29 UTC
This package now no longer builds on Rawhide because the ego-tree crate was updated to v0.9.0, while this package needs v0.6.*.
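
For context, a minimal sketch of why 0.9.0 falls outside what scraper 0.19.0 asks for, assuming Cargo's default caret semantics for a requirement written as "0.6.2" (which matches the ">= 0.6.2 ... < 0.7.0~" range in the build error quoted in comment 4). The `Version` type and the helper names are hypothetical and only for illustration; this is not how cargo or rust2rpm implement it, and pre-release tags are ignored:

```rust
// Hypothetical illustration types; not taken from cargo or rust2rpm.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

impl Version {
    /// Parses a plain "MAJOR.MINOR.PATCH" string (no pre-release or build tags).
    fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        let patch = parts.next()?.parse().ok()?;
        if parts.next().is_some() {
            return None;
        }
        Some(Version { major, minor, patch })
    }
}

/// Exclusive upper bound implied by a caret requirement:
/// the leftmost non-zero component is the one that must not change.
fn caret_upper_bound(req: Version) -> Version {
    if req.major > 0 {
        Version { major: req.major + 1, minor: 0, patch: 0 }
    } else if req.minor > 0 {
        Version { major: 0, minor: req.minor + 1, patch: 0 }
    } else {
        Version { major: 0, minor: 0, patch: req.patch + 1 }
    }
}

/// True when `candidate` lies in the half-open range [req, upper bound).
fn satisfies_caret(req: Version, candidate: Version) -> bool {
    candidate >= req && candidate < caret_upper_bound(req)
}

fn main() {
    let req = Version::parse("0.6.2").expect("valid version");
    for candidate in ["0.6.2", "0.6.3", "0.7.0", "0.9.0"].iter() {
        let v = Version::parse(candidate).expect("valid version");
        println!("{} satisfies ^0.6.2: {}", candidate, satisfies_caret(req, v));
    }
}
```

Under these assumptions, any 0.6.x release at or above 0.6.2 is accepted, while 0.7.0 and 0.9.0 are rejected, which is why the dependency has to be bumped in Cargo.toml rather than resolved automatically.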

Comment 10 Gustavo Costa 2024-10-15 13:41:27 UTC
Hi Fabio, sorry for the long delay in replying. I was feeling drained and took a break from my Fedora contributions. I only tried to keep some packages updated during this time, but I should have replied here as well.

> Are you actually interested in shipping the "scraper" binary or do you only need the Rust crate / library interface?

Only the library, for now. I'll remove the binary from the installation as you suggested.

> Are you still interested in getting a review for this package?

Yes. I'll need scraper as a dependency for the upcoming asciinema 3.0, which is written in Rust. However, asciinema development has slowed down, so I didn’t rush to get scraper reviewed.

> This package now no longer builds on Rawhide because the ego-tree crate was updated to v0.9.0

I definitely should have checked this before updating. It wasn't building because of html5ever 0.29.0 too; scraper uses 0.27.0. Some code changes will be needed. When I get home, I'll look into what can be done to fix it.

Comment 11 Fabio Valentini 2024-10-16 19:49:09 UTC
> It wasn't building because of html5ever 0.29.0 too; scraper uses 0.27.0.

I've ported some other packages for html5ever / xml5ever changes; the required code changes to support the latest version were usually very small, so I hope that will be the case here too.

Comment 12 Gustavo Costa 2024-12-09 23:59:28 UTC
Hi Fabio,

I was able to build scraper using the html5ever 0.26 compat package. Upstream updated html5ever to 0.29 along with other changes, and I couldn't figure out how to patch it. My goal with this package is to build asciinema 3.0, which I successfully did, so everything seems to be working fine.

I updated the package to bump ego-tree from 0.6 to 0.9 and to remove the scraper binary from installation.

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.0-1.fc42.src.rpm

Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126656366

rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Comment 13 Fedora Review Service 2024-12-10 05:19:03 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/8371532
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08371532-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.

Comment 14 Fabio Valentini 2024-12-10 13:13:09 UTC
Thanks for the update!

Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.

Comment 15 Gustavo Costa 2024-12-12 00:16:12 UTC
> Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
> And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.

I can update it to 0.20.0, anything higher than this would require cssparser 0.34.0 (current version on rawhide is 0.31)

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.20.0-1.fc42.src.rpm

Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126734877

Comment 16 Fedora Review Service 2024-12-12 00:25:35 UTC
Created attachment 2062132 [details]
The .spec file difference from Copr build 8371532 to 8379479

Comment 17 Fedora Review Service 2024-12-12 00:25:37 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/8379479
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08379479-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.

Comment 18 Fabio Valentini 2024-12-27 16:07:41 UTC
(In reply to Gustavo Costa from comment #15)
> > Can you verify that you still need scraper 0.19.*? The latest release is now 0.22.0.
> > And if you still need 0.19.*, then please update from 0.19.0 to 0.19.1.
> 
> I can update it to 0.20.0, anything higher than this would require cssparser
> 0.34.0 (current version on rawhide is 0.31)

I didn't mean whether you *can*, I meant whether asciinema would work with scraper newer than 0.19.

According to crates.io, it still depends on 0.19.x:
https://crates.io/crates/asciinema/3.0.0-rc.3/dependencies

Comment 19 Fabio Valentini 2025-03-29 22:56:00 UTC
Please update this ticket if you're still interested in this package.

Notably, asciinema 3.0.0-rc.3 is still the latest release, and still pulls in scraper 0.19.x, so it would likely be better to back out of the 0.20 update for now.

Comment 20 Gustavo Costa 2025-04-05 16:13:18 UTC
Hi,

I'm still interested in the package. Here's rust-scraper-0.19.1:

Spec URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper.spec
SRPM URL: https://xfgusta.fedorapeople.org/pkgs/rust-scraper-0.19.1-1.fc43.src.rpm

Scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=131142568

rust2rpm.toml:

```
[package]
cargo-install-bin = false
```

Although 0.20.0 also works with the latest asciinema, I agree it's better to follow upstream and stick with 0.19.x. I can update it later if needed.

Thanks for the review.

Comment 21 Fedora Review Service 2025-04-05 16:22:37 UTC
Created attachment 2083532 [details]
The .spec file difference from Copr build 8379479 to 8863567

Comment 22 Fedora Review Service 2025-04-05 16:22:39 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/8863567
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2280802-rust-scraper/fedora-rawhide-x86_64/08863567-rust-scraper/fedora-review/review.txt

Please take a look if any issues were found.

Comment 23 Fabio Valentini 2025-04-09 11:26:32 UTC
Thank you, this looks good to me now. Sorry that this has taken so long.

You can permanently store some modifications in rust2rpm.toml:

```
[package]
cargo-toml-patch-comments = [
    "Bump ego-tree dependency from 0.6.2 to 0.9.0",
    "Relax html5ever dependency from 0.27 to 0.26",
]
cargo-install-bin = false
```

I recommend that you commit this into dist-git upon importing. :)

===

Package was generated with rust2rpm, simplifying the review.

✅ package contains only permissible content
✅ package builds and installs without errors on rawhide
✅ test suite is run and all unit tests pass
🫤 latest version of the crate is packaged
   (0.19.1 is not the latest, but it is what is currently needed)
✅ license matches upstream specification and is acceptable for Fedora
✅ license file is included with %license in %files
✅ package complies with Rust Packaging Guidelines

Package APPROVED.

===

Recommended post-import rust-sig tasks:

- set up package on release-monitoring.org:
  project: $crate
  homepage: https://crates.io/crates/$crate
  backend: crates.io
  version scheme: semantic
  version filter (*NOT* pre-release filter): alpha;beta;rc;pre
  distro: Fedora
  Package: rust-$crate

- set bugzilla assignee overrides to @rust-sig (optional)
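
The version filter in the list above could be read roughly as in this sketch, assuming the filter is a semicolon-separated list of substrings and that any upstream version containing one of them is skipped. That matching rule is an assumption about release-monitoring.org, and `is_filtered` is a hypothetical helper, not an Anitya API:

```rust
/// Returns true when `version` contains any of the non-empty,
/// semicolon-separated entries in `filter` (assumed substring matching).
fn is_filtered(version: &str, filter: &str) -> bool {
    filter
        .split(';')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .any(|entry| version.contains(entry))
}

fn main() {
    // Filter string as given in the recommended settings.
    let filter = "alpha;beta;rc;pre";
    // Sample version strings mentioned in this ticket.
    let versions = ["0.19.1", "0.20.0", "3.0.0-rc.3"];
    for v in versions.iter() {
        println!("{} filtered: {}", v, is_filtered(v, filter));
    }
}
```

With such a filter, release candidates like 3.0.0-rc.3 would not trigger new-version notifications, while regular releases like 0.19.1 would.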

Comment 24 Fedora Admin user for bugzilla script actions 2025-04-13 19:44:57 UTC
The Pagure repository was created at https://src.fedoraproject.org/rpms/rust-scraper

Comment 25 Fedora Update System 2025-04-13 20:26:11 UTC
FEDORA-2025-a52badbf77 (rust-scraper-0.19.1-1.fc43) has been submitted as an update to Fedora 43.
https://bodhi.fedoraproject.org/updates/FEDORA-2025-a52badbf77

Comment 26 Fedora Update System 2025-04-13 20:30:56 UTC
FEDORA-2025-a52badbf77 (rust-scraper-0.19.1-1.fc43) has been pushed to the Fedora 43 stable repository.
If the problem still persists, please make a note of it in this bug report.