Bug 2132982 - Don't use file lists for package resolution
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf5
Version: 38
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: rpm-software-management
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-10-07 12:06 UTC by Vít Ondruch
Modified: 2023-05-17 14:01 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2023-05-17 14:01:58 UTC
Type: Bug
Embargoed:



Description Vít Ondruch 2022-10-07 12:06:11 UTC
Description of problem:
AFAIK, DNF uses file lists for dependency resolution. This costs bandwidth to download the data as well as memory to process it. But I wonder whether there is any practical reason for this, because AFAICT, for all practical purposes in Fedora, the file lists embedded in primary metadata should be enough.

If there is not enough data, we should work to extend the primary metadata with additional files instead. E.g. if there really are some file dependencies outside of /usr/{{,s}bin,lib{,64}} and /etc, then we should add the data during repository processing.

If somebody really wants to do `dnf install /some/random/file`, this case is exceptional enough to download the file list explicitly.


Actual results:
DNF downloads file list data and uses it for dependency resolution.


Expected results:
DNF does not download the file lists and resolves packages using only the file lists embedded in primary metadata.


Additional info:
This is essentially an attempt to reopen bug 968006.

Comment 1 Miro Hrončok 2022-10-07 12:21:18 UTC
FTR:

$ repoquery --repo=rawhide -a --requires | grep '^/' | grep -v '^/usr/bin/' | grep -v '^/usr/sbin/'  | grep -v '^/etc/'
/bin/awk
/bin/bash
/bin/gawk
/bin/mail
/bin/mailx
/bin/mount
/bin/ps
/bin/sed
/bin/sh
/bin/systemctl
/sbin/chkconfig
/sbin/fixfiles
/sbin/fsck
/sbin/install-info
/sbin/ip
/sbin/ldconfig
/sbin/modprobe
/sbin/mount.nfs
/sbin/mount.nfs4
/sbin/nologin
/sbin/restorecon
/sbin/rpc.statd
/sbin/service
/sbin/shutdown
/sbin/zfs
/sbin/zpool
/usr/lib/cmake
/usr/lib/kbd
/usr/lib/ocf/resource.d
/usr/lib64/cmake
/usr/lib64/libnssckbi.so
/usr/lib64/mpich/bin/mpibash_mpich
/usr/lib64/mpich/bin/scorep-config
/usr/lib64/openmpi/bin/mpibash_openmpi
/usr/lib64/openmpi/bin/scorep-config
/usr/libexec/gcr-ssh-askpass
/usr/libexec/virtiofsd
/usr/share/X11/rgb.txt
/usr/share/dict/words
/usr/share/fonts/google-droid-sans-fonts/DroidSans.ttf
/usr/share/fonts/google-droid-sans-fonts/DroidSansFallbackFull.ttf
/usr/share/lightsquid/common.pl
/usr/share/sqlninja/backscan.pl
/usr/share/sqlninja/bruteforce.pl
/usr/share/sqlninja/dirshell.pl
/usr/share/sqlninja/dns.pl
/usr/share/sqlninja/escalation.pl
/usr/share/sqlninja/fingerprint.pl
/usr/share/sqlninja/getdata.pl
/usr/share/sqlninja/icmp.pl
/usr/share/sqlninja/metasploit.pl
/usr/share/sqlninja/resurrectxp.pl
/usr/share/sqlninja/revshell.pl
/usr/share/sqlninja/session.pl
/usr/share/sqlninja/sqlcmd.pl
/usr/share/sqlninja/test.pl
/usr/share/sqlninja/upload.pl
/usr/share/sqlninja/utils.pl

We would need to eradicate those first.
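
For reference, the filter in the pipeline above can be sketched in Python. This is only a minimal sketch, not part of repoquery: the function name `outside_primary_prefixes` and the sample input are illustrative assumptions, and the prefix list simply mirrors the three `grep -v` patterns.

```python
# Prefixes excluded by the three `grep -v` patterns in the pipeline above.
EXCLUDED_PREFIXES = ("/usr/bin/", "/usr/sbin/", "/etc/")


def outside_primary_prefixes(requires):
    """Return sorted, de-duplicated file requirements outside the excluded prefixes.

    `requires` is an iterable of requirement strings, one per entry, in the
    form printed by `repoquery --requires`. The function name is hypothetical.
    """
    return sorted(
        {
            req
            for req in requires
            # Keep only file requirements (they start with "/") that are
            # not under one of the excluded prefixes.
            if req.startswith("/") and not req.startswith(EXCLUDED_PREFIXES)
        }
    )


# Illustrative input only; a real run would read the repoquery output.
sample = [
    "/bin/sh",
    "/usr/bin/python3",
    "bash",
    "/etc/passwd",
    "/usr/share/dict/words",
    "/bin/sh",
]
print(outside_primary_prefixes(sample))
```

Sorting and de-duplicating is a convenience added here; the shell pipeline itself only filters.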

Comment 2 Vít Ondruch 2022-10-07 12:33:52 UTC
(In reply to Miro Hrončok from comment #1)
> FTR:
> 
> $ repoquery --repo=rawhide -a --requires | grep '^/' | grep -v '^/usr/bin/'
> | grep -v '^/usr/sbin/'  | grep -v '^/etc/'

Thx for the query!

> We would need to eradicate those first.

1) This is really just a few packages, so fixing them is a piece of cake IMO.
2) My second proposal would be against createrepo_c, to actually add these into the metadata, but I thought I'd start here :)

Comment 3 Vít Ondruch 2022-10-07 12:37:21 UTC
(In reply to Vít Ondruch from comment #2)
> (In reply to Miro Hrončok from comment #1)
> > FTR:
> > 
> > $ repoquery --repo=rawhide -a --requires | grep '^/' | grep -v '^/usr/bin/'
> > | grep -v '^/usr/sbin/'  | grep -v '^/etc/'
> 
> Thx for the query!

That does not include SRPMs, though. I just did this across .spec files:

~~~
$ grep -P -R 'Requires: /usr/(?!s?bin)'
gnome-keyring.spec:Requires: /usr/libexec/gcr-ssh-askpass
gnu-efi.spec:BuildRequires: /usr/include/gnu/stubs-32.h
krb5.spec:Requires: /usr/share/dict/words
perl-Image-Info.spec:- Requires: rgb, not Requires: /usr/share/X11/rgb.txt
systemtap.spec:Requires: /usr/lib/libc.so
syslinux.spec:BuildRequires: /usr/include/gnu/stubs-32.h
x11vnc.spec:- Add BuildRequires: /usr/include/X11/extensions/XShm.h
xen.spec:BuildRequires: /usr/include/gnu/stubs-32.h
~~~

But that does not cover macros such as `%{_bindir}` which I would use personally.
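
The negative lookahead in that grep can be reproduced with Python's `re` module. The following is a minimal sketch with made-up sample lines (only the first two come from the output above); `non_bin_usr_requires` is a hypothetical helper name. It also shows the macro gap: a `%{_bindir}` requirement never matches the literal `/usr/` prefix.

```python
import re

# Same pattern as the grep above: "Requires: /usr/" not followed by "bin" or "sbin".
PATTERN = re.compile(r"Requires: /usr/(?!s?bin)")


def non_bin_usr_requires(lines):
    """Return the lines that match the pattern, in input order."""
    return [line for line in lines if PATTERN.search(line)]


sample = [
    "Requires: /usr/libexec/gcr-ssh-askpass",
    "BuildRequires: /usr/include/gnu/stubs-32.h",
    "Requires: /usr/bin/python3",      # excluded by the lookahead
    "Requires: /usr/sbin/ldconfig",    # excluded by the lookahead
    "Requires: %{_bindir}/python3",    # macro: no literal /usr/ to match
]
print(non_bin_usr_requires(sample))
```

Because the pattern is not anchored to the start of the line, it also matches `BuildRequires:` lines and changelog entries that merely mention a requirement, as the grep output above shows.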

Comment 4 Panu Matilainen 2022-10-07 13:09:43 UTC
Files in /bin and /sbin are included in primary.xml already. As is all manner of stuff from /usr/share, /var and something from /usr/libexec too, although not all. Back in the original createrepo days this was a clear-cut (short) list of things but I dunno what's up with it now.

Comment 5 Panu Matilainen 2022-10-07 13:25:06 UTC
Okay, this matches my recollection of a short list:
https://github.com/rpm-software-management/createrepo_c/blob/af14e164a3e4ab9dfaef1212e852b9ecebc326a2/src/misc.h#L110

The gotcha there is the bin/ rule which ends up pulling all manner of unintended stuff, such as:

<file>/var/www/moodle/web/admin/tool/recyclebin/pix/trash.svg</file>
<file>/var/lib/openas2/bin/start-openas2.sh</file>
<file>/var/spool/hylafax/bin/dict/en</file>
...

...which appears to be exactly what the original createrepo did. But the plot thickens, see https://github.com/rpm-software-management/createrepo_c/blob/af14e164a3e4ab9dfaef1212e852b9ecebc326a2/src/misc.c#L184

> This optimal piece of code cannot be used because of yum...

That's from 2012. Maybe it's time to re-evaluate?
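
To illustrate the gotcha, here is a minimal Python sketch. It assumes the linked rule amounts to "path starts with /etc/, or is /usr/lib/sendmail, or contains bin/ anywhere"; that reading is an assumption, not a copy of the createrepo_c code. The stricter variant is purely hypothetical and only there for contrast.

```python
def loose_is_primary(path):
    """Assumed reading of the linked rule: 'bin/' may appear anywhere in the path."""
    return (
        path.startswith("/etc/")
        or path == "/usr/lib/sendmail"
        or "bin/" in path
    )


# Hypothetical stricter rule: only well-known directories count.
STRICT_PREFIXES = ("/etc/", "/bin/", "/sbin/", "/usr/bin/", "/usr/sbin/")


def strict_is_primary(path):
    """Hypothetical alternative that matches directory prefixes only."""
    return path == "/usr/lib/sendmail" or path.startswith(STRICT_PREFIXES)


paths = [
    "/usr/bin/bash",
    "/var/www/moodle/web/admin/tool/recyclebin/pix/trash.svg",
    "/var/lib/openas2/bin/start-openas2.sh",
    "/usr/share/dict/words",
]
for path in paths:
    # Print each path with the verdict of both rules.
    print(path, loose_is_primary(path), strict_is_primary(path))
```

Under the loose rule the moodle path is pulled in only because `recyclebin/` happens to contain `bin/`; a prefix-based rule would drop it, at the cost of also dropping anything outside the listed directories.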

Comment 6 Matthew Miller 2022-10-18 13:34:12 UTC
I'm (very!) in favor of this.

I do think some people will miss `dnf install /some/random/file`. But that seems like something a dedicated plugin could do? Or just be its own special-case for install (and remove?).

Comment 7 Daniel Mach 2022-10-19 07:58:37 UTC
(In reply to Matthew Miller from comment #6)
> I'm (very!) in favor of this.

I've implemented this in dnf5 already:
https://github.com/rpm-software-management/libdnf/pull/1556

> 
> I do think some people will miss `dnf install /some/random/file`. But that
> seems like something a dedicated plugin could do? Or just be its own
> special-case for install (and remove?).

Dependency resolution and selecting input packages are not the same code.
My dnf5 patch loads file lists only when an input contains a file path.
I suppose there are some edge cases to cover, but it is a good start.

This also impacts bug#1907030; see my comment bug#1907030#c19 for how much memory consumption is reduced.
It also decreases average dnf5 load times, because it doesn't load unused repodata.

You're welcome :)
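
The idea of loading file lists only on demand can be sketched as follows. This is not the libdnf code: `needs_filelists` and `metadata_to_load` are hypothetical names, and treating "starts with /" as the test for a file path is an assumption; the real check may cover more cases (the edge cases mentioned above).

```python
def needs_filelists(specs):
    """Return True if any command-line spec looks like an absolute file path."""
    return any(spec.startswith("/") for spec in specs)


def metadata_to_load(specs):
    """Return the metadata types to load for the given specs.

    Primary metadata is always loaded; file lists are added only when
    at least one spec is a file path.
    """
    types = ["primary"]
    if needs_filelists(specs):
        types.append("filelists")
    return types


print(metadata_to_load(["bash"]))
print(metadata_to_load(["vim", "/some/random/file"]))
```

With this shape, an ordinary `install bash` never touches the file lists, while `install /some/random/file` still works at the cost of the extra download.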

Comment 8 Ben Cotton 2023-02-07 14:57:03 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 38 development cycle.
Changing version to 38.

Comment 9 Jaroslav Mracek 2023-05-17 14:01:58 UTC
The feature is implemented in DNF5, and DNF5 is already in Fedora 38. DNF5 will replace DNF in Fedora 39.

We do not plan to deliver it for DNF, so I am closing this with a change of component.

