Bug 1931904 - reposync re-downloads packages with multiple hardlinks
Summary: reposync re-downloads packages with multiple hardlinks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: librepo
Version: 8.3
Hardware: Unspecified
OS: Unspecified
medium
unspecified
Target Milestone: rc
: ---
Assignee: Marek Blaha
QA Contact: Jan Blazek
URL:
Whiteboard:
Duplicates: 1929274
Depends On: 1951407
Blocks:
Reported: 2021-02-23 14:28 UTC by Josef Kubin
Modified: 2024-06-14 00:27 UTC (History)

Fixed In Version: librepo-1.14.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-09 19:45:07 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2021:4429 | 0 | None | None | None | 2021-11-09 19:45:18 UTC

Description Josef Kubin 2021-02-23 14:28:09 UTC
Description of problem:

reposync re-downloads files that have hard links outside the repository directory.

For example, the exact same RPM file, perf-debuginfo-2.6.32-754.36.1.el6.x86_64.rpm, exists in both rhel-6-server-els-optional-debug-rpms and rhel-6-server-els-debug-rpms.
If the two copies are de-duplicated with any utility (rdfind, cp, hardlink, etc.), both repositories see the same file with multiple hard links.
At that point, if the repository metadata expects different attributes for the file (modification date, sha256sum, etc.) and, because of a bug in dnf reposync, the existing file is judged not to match one of those criteria, the file is re-downloaded on every run.
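
The de-duplicated state described above can be reproduced on a small scale. The following Python sketch is an illustration only and is not part of reposync or librepo; the directory names repo-a and repo-b and the file name example.rpm are hypothetical placeholders. It creates one file, hard-links it into a second directory, and checks that both names refer to the same inode. Because the modification time and extended attributes are stored on the inode, they are shared by every hard link to the file.

```python
import os
import tempfile


def make_hardlinked_copy(base_dir, name="example.rpm"):
    """Create a file in one directory and hard-link it into another.

    The directory and file names are hypothetical stand-ins for two
    repositories that ship the same package.
    """
    repo_a = os.path.join(base_dir, "repo-a")
    repo_b = os.path.join(base_dir, "repo-b")
    os.makedirs(repo_a)
    os.makedirs(repo_b)

    first = os.path.join(repo_a, name)
    second = os.path.join(repo_b, name)
    with open(first, "wb") as fh:
        fh.write(b"placeholder package payload")
    os.link(first, second)  # second name for the same inode
    return first, second


def same_inode(path_a, path_b):
    """Return True if both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as tmp:
    first, second = make_hardlinked_copy(tmp)
    print("same inode:", same_inode(first, second))
    print("link count:", os.stat(first).st_nlink)
    # mtime is per inode, so both names always report the same value.
    print("same mtime:", os.stat(first).st_mtime == os.stat(second).st_mtime)
```

On a filesystem that supports hard links, both paths resolve to the same inode and the link count is 2, which is the situation the two debug repositories end up in after de-duplication.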

This bug did not occur with yum reposync up to and including EL7.

Extended file attributes of the package mentioned above, which is downloaded again on each run:
~~~
# getfattr --dump Packages/p/perf-debuginfo-2.6.32-754.36.1.el6.x86_64.rpm
# file: Packages/p/perf-debuginfo-2.6.32-754.36.1.el6.x86_64.rpm
user.Zif.MdChecksum[1610935587]="769f719235d607f4406aa33ced7f27c76f5db4c0"
user.Zif.MdChecksum[1613801719]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613985221]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613990826]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613990960]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613992052]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
~~~
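
To make the dump above easier to read, the attribute lines can be tallied with a short Python sketch. It only parses the text quoted in this report and does not read attributes from disk. The report does not say what the bracketed number encodes, so the code keeps it as an opaque integer key; the function name parse_cached_checksums is a hypothetical helper introduced for illustration.

```python
import re
from collections import Counter

# Attribute lines copied verbatim from the getfattr output in this report.
DUMP = """\
user.Zif.MdChecksum[1610935587]="769f719235d607f4406aa33ced7f27c76f5db4c0"
user.Zif.MdChecksum[1613801719]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613985221]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613990826]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613990960]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
user.Zif.MdChecksum[1613992052]="cf7ebf419e892ae6f0df7c8fcb3fe538844d9553f0a1b523d8addd8faa5c0111"
"""

ENTRY = re.compile(r'^user\.Zif\.MdChecksum\[(\d+)\]="([0-9a-f]+)"$')


def parse_cached_checksums(dump):
    """Return (key, digest) pairs for each user.Zif.MdChecksum line."""
    entries = []
    for line in dump.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


entries = parse_cached_checksums(DUMP)
# A 40-character hex digest has the length of SHA-1, a 64-character one of SHA-256.
digest_lengths = Counter(len(digest) for _, digest in entries)
distinct_digests = {digest for _, digest in entries}

print("entries:", len(entries))
print("digest lengths:", dict(digest_lengths))
print("distinct digests:", len(distinct_digests))
```

For the lines above this counts six entries holding only two distinct digests: one 40-character value and five identical 64-character values stored under different keys. That pattern is consistent with a new attribute being written on repeated runs, although the dump alone does not prove the cause.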

Mount attributes:
~~~
# mount
/dev/mapper/vg_data-lv_var_mrepo on /var/mrepo type ext4 (rw,nodev,relatime,seclabel)
~~~

Version-Release number of selected component (if applicable):
librepo-1.12.0-2.el8.x86_64
libdnf-0.48.0-5.el8.x86_64
dnf-4.2.23-4.el8.noarch
dnf-plugins-core-4.0.17-5.el8.noarch

How reproducible:
In the customer's environment.

Actual results:

Each time `reposync` is run on the same repository, it downloads the hard-linked packages again.

Expected results:

Packages are downloaded only once, regardless of whether they are hard-linked.
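
To see which packages in a synced tree could be affected, one option is to list the .rpm files whose link count is greater than one. The Python sketch below is a hypothetical helper, not a dnf or librepo feature; the function name and the demo file names are placeholders. It only identifies hard-linked files and does not confirm that any of them is actually re-downloaded.

```python
import os
import stat
import tempfile


def find_multiply_linked_rpms(root):
    """Return sorted (path, link_count) pairs for .rpm files with more than one hard link."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".rpm"):
                continue
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # do not follow symlinks
            if stat.S_ISREG(info.st_mode) and info.st_nlink > 1:
                hits.append((path, info.st_nlink))
    return sorted(hits)


# Small demo tree: one package linked from outside the repo, one stand-alone.
with tempfile.TemporaryDirectory() as tmp:
    repo = os.path.join(tmp, "repo")
    other = os.path.join(tmp, "other")
    os.makedirs(repo)
    os.makedirs(other)
    for name in ("linked.rpm", "single.rpm"):
        with open(os.path.join(repo, name), "wb") as fh:
            fh.write(b"placeholder")
    os.link(os.path.join(repo, "linked.rpm"), os.path.join(other, "linked.rpm"))

    for path, count in find_multiply_linked_rpms(repo):
        print(os.path.relpath(path, repo), count)
```

In the demo only linked.rpm is reported, because its second name lives outside the scanned directory, which mirrors the "hard links outside the repo directory" condition from the description.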

Comment 1 Marek Blaha 2021-03-01 10:18:57 UTC
I've created a patch that fixes the problem: https://github.com/rpm-software-management/librepo/pull/232

Comment 2 Marek Blaha 2021-04-12 11:17:25 UTC
PR with tests: https://github.com/rpm-software-management/ci-dnf-stack/pull/972

Comment 12 errata-xmlrpc 2021-11-09 19:45:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (librepo bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4429

Comment 13 Jaroslav Mracek 2022-03-15 11:34:35 UTC
*** Bug 1929274 has been marked as a duplicate of this bug. ***

