Bug 2132383

Summary: reposync produces wrong metadata when using both -n and --download-metadata options
Product: Red Hat Enterprise Linux 8 Reporter: Ondrej <ondrej.valousek>
Component: dnf-plugins-core Assignee: Jaroslav Mracek <jmracek>
Status: CLOSED ERRATA QA Contact: Eva Mrakova <emrakova>
Severity: low Docs Contact:
Priority: low    
Version: 8.5 CC: james.antill, mbanas, mblaha, nsella, rmetrich
Target Milestone: rc Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: dnf-plugins-core-4.0.21-21.el8 Doc Type: No Doc Update
Last Closed: 2023-11-14 15:49:47 UTC Type: Bug

Description Ondrej 2022-10-05 13:23:19 UTC
Description of problem:
When reposync is run with both -n and --download-metadata, it downloads only the newest package versions but the full metadata, which still lists every remote package version. This is very confusing.

Version-Release number of selected component (if applicable):
yum-utils-1.1.31-54.el7_8.noarch


How reproducible:
always

Steps to Reproduce:
1. reposync -n --download-metadata ...
2. dnf list --show-duplicates

Actual results:
dnf lists packages that were never downloaded

Expected results:
dnf lists only the packages that were actually downloaded

Additional info:
I'd expect reposync to either complain about using both options at the same time or fix the downloaded metadata so that it only contains the newest packages.
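
One way to see the mismatch is to compare the package locations advertised by the downloaded metadata with the files actually on disk. The following is a minimal sketch, not part of reposync or dnf: the helper name `list_missing` is hypothetical, and it assumes the locations arrive one per line, for example from `dnf repoquery --location` run against the local copy.

```shell
# list_missing: read package locations (one per line) on stdin and print
# the ones that do not exist on disk. A leading "file://" is stripped.
# Hypothetical helper for illustration only.
list_missing() {
    while IFS= read -r location; do
        path=${location#file://}
        # skip empty lines
        [ -n "$path" ] || continue
        # print only paths that are absent from the filesystem
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Possible usage against a synced repo (repo id and path are examples):
#   dnf repoquery --disablerepo='*' --repofrompath mntos,file:///mnt/os \
#       --location | list_missing
```

Any path it prints is a package the metadata offers but reposync -n did not download.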

Comment 1 Marek Blaha 2022-10-11 06:32:30 UTC
The thing is that dnf does not alter downloaded repository metadata in any way. So we cannot create a subset of metadata with only the newest packages.
But I agree that the current behavior is confusing. There are two solutions I can think of:
- make the `-n` and `--download-metadata` options mutually exclusive and error out when they are used together (I'd prefer this). The user must then use createrepo_c to make a repo from the downloaded packages.
- just print a warning for the user that the repo is not directly usable (but reposync prints a lot of output and the warning might get lost in it)
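
To illustrate the first option, here is a rough sketch of a wrapper that rejects the combination before calling reposync. This is not how reposync behaves; the function names `has_conflicting_options` and `guarded_reposync` are hypothetical, and the check only recognizes the flags written separately (it would miss combined short flags such as `-nq`).

```shell
# has_conflicting_options: succeed (exit 0) when the argument list contains
# both a newest-only flag and --download-metadata.
# Hypothetical wrapper logic; reposync itself does not do this.
has_conflicting_options() {
    newest=0
    metadata=0
    for arg in "$@"; do
        case "$arg" in
            -n|--newest-only) newest=1 ;;
            --download-metadata) metadata=1 ;;
        esac
    done
    [ "$newest" -eq 1 ] && [ "$metadata" -eq 1 ]
}

# guarded_reposync: refuse the combination, otherwise run reposync unchanged.
guarded_reposync() {
    if has_conflicting_options "$@"; then
        echo "error: -n and --download-metadata cannot be combined;" \
             "run createrepo_c on the downloaded packages instead" >&2
        return 1
    fi
    reposync "$@"
}
```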

Comment 2 Renaud Métrich 2022-10-11 06:56:19 UTC
The behavior is confusing, but from my experiments I don't see any issue with having too much information in the metadata; it doesn't seem to affect package installation at all.

Comment 3 Renaud Métrich 2022-10-11 07:10:39 UTC
Hi Marek,

The real issue is more with `reposync -n --download-metadata` not "fixing" the metadata to reflect what is currently in the repository.

For example, let's assume "glib2" is currently installed on the system at version 2.56.4-156.el8 and the latest available is "glib2-2.56.4-158.el8".
After doing a "reposync ...", installing the "gimp-devel" package will fail:
~~~
# dnf install --disablerepo="*" --repofrompath mntos,file:///mnt/os --repofrompath mntapp,file:///mnt/app-os gimp-devel
...
Installing:
 gimp-devel                    x86_64          2:2.8.22-15.module+el8+2760+3d7d61b2             mntapp          940 k
Installing dependencies:
...
 glib2-devel                   x86_64          2.56.4-156.el8                                   mntos           424 k
...

Is this ok [y/N]: y
Downloading Packages:
Error opening /mnt/os/Packages/g/glib2-devel-2.56.4-156.el8.x86_64.rpm: No such file or directory
~~~

This is because "glib2-devel-2.56.4-156.el8" is not present in the repository (since it only has latest).
Upon resolving the transaction, because "glib2-2.56.4-156.el8" is already installed, dnf will try to install corresponding "glib2-devel-2.56.4-156.el8" which doesn't exist in the repository.

NOW, if you update the metadata, the issue disappears because dnf is smart enough to update "glib2" to solve the dependencies:
~~~
# createrepo --update /mnt/os
Directory walk started
Directory walk done - 1789 packages
...

# rm -fr /var/cache/dnf

# dnf install --disablerepo="*" --repofrompath mntos,file:///mnt/os --repofrompath mntapp,file:///mnt/app-os gimp-devel
...
Installing:
 gimp-devel                    x86_64          2:2.8.22-15.module+el8+2760+3d7d61b2             mntapp          940 k
Upgrading:
 glib2                         x86_64          2.56.4-158.el8                                   mntos           2.5 M
Installing dependencies:
...

Is this ok [y/N]: y
Downloading Packages:
Running transaction check
Transaction check succeeded.
...
~~~

This is where improvement can happen: upon using "-n --download-metadata", reposync should either run "createrepo --update" automatically or print a message telling the user to do so.
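
Until then, the two steps can be chained by hand. Below is a minimal sketch of that workaround, under a few assumptions: the function name and the REPOSYNC and CREATEREPO variables are hypothetical (the variables only allow substituting the commands), createrepo_c is used as the createrepo implementation, and reposync is assumed to use its default layout of one subdirectory per repo id under the download path.

```shell
# sync_newest_with_metadata: mirror only the newest packages of one repo,
# then refresh the metadata so it lists only what is on disk.
#   $1 = repo id, $2 = download path
# Function name and the REPOSYNC/CREATEREPO variables are hypothetical.
sync_newest_with_metadata() {
    repoid=$1
    dest=$2
    ${REPOSYNC:-reposync} -n --download-metadata \
        --repoid="$repoid" --download-path="$dest" || return 1
    # assumes the default layout: <download path>/<repo id>
    ${CREATEREPO:-createrepo_c} --update "$dest/$repoid"
}

# Possible usage (repo id and path are examples):
#   sync_newest_with_metadata rhel-8-for-x86_64-baseos-rpms /mnt
```

Remember to clear the dnf cache on clients afterwards, as in the transcript above, so the refreshed metadata is picked up.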

Renaud.

Comment 4 Ondrej 2022-10-11 07:41:17 UTC
Thanks Renaud,
Yes I'd vote for spawning an extra 'createrepo --update' process when using both '-n --download-metadata' options, rather than bailing out completely here.
It would be much more end-user friendly here.
My 5 cents.

Comment 5 Marek Blaha 2022-10-11 08:36:32 UTC
Thanks guys for adding the third option. I'm just wondering if it wouldn't be better to have a separate explicit option for spawning createrepo on the downloaded content (e.g. `--createrepo`). I'm worried about changing the reposync behavior in the middle of the RHEL 8 life cycle. Adding a new option would be safer from this point of view.

Comment 6 Ondrej 2022-10-12 08:18:24 UTC
Well, I think that might make sense, but then it is questionable whether it still makes sense to keep the --download-metadata option.
I am using --download-metadata (rather than running createrepo) because I do not want to lose information about groups/modules etc. (not sure if createrepo can fully recreate a repository without losing any such information).
Maybe it would make sense to introduce a new option "--createrepo" that spawns "createrepo --update" when --download-metadata is also used, and otherwise spawns plain "createrepo" to create fresh metadata. Does that make sense?

Comment 10 Jaroslav Mracek 2023-03-27 08:29:09 UTC
I created a PR with a documentation update - https://github.com/rpm-software-management/dnf-plugins-core/pull/483

Comment 16 errata-xmlrpc 2023-11-14 15:49:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (dnf-plugins-core bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7125