Bug 1279001 - [RFE] Missing dnf --downloaddir option
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: dnf
Version: rawhide
Hardware: Unspecified Unspecified
Priority: low Severity: unspecified
Assigned To: Jaroslav Mracek
QA Contact: Fedora Extras Quality Assurance
Keywords: FutureFeature, Reopened, Triaged
Reported: 2015-11-06 19:33 EST by Marek Marczykowski
Modified: 2017-10-04 18:26 EDT
Fixed In Version: dnf-2.6.2-1.fc26 dnf-2.7.2-1.fc27 dnf-2.7.2-1.fc26
Doc Type: Bug Fix
Last Closed: 2017-10-04 10:24:20 EDT
Type: Bug


Description Marek Marczykowski 2015-11-06 19:33:22 EST
Description of problem:
dnf is missing a --downloaddir option, which is useful together with --downloadonly.

`dnf download --destdir` is not always enough, because it requires an explicit package list, so it doesn't cover cases such as downloading "all updates". I also have the impression that dependency resolution works slightly differently in that case (at least it did with yum vs. yumdownloader).

In general, this is about downloading packages/updates for another (offline) system. With yum, it was enough to copy the rpmdb and yum configuration (especially yum.repos.d) to some networked system and run:
yum --installroot=/some/path --config=/some/path/etc/yum.conf --downloadonly --downloaddir=/some/other/path update

Then copy the packages from /some/other/path to the offline system and install them (or create a local file:// repository to deal with some corner cases). With DNF it is no longer that simple: retrieving the packages is tricky.
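
For readers on the dnf versions listed under "Fixed In Version", the yum invocation above can probably be mirrored with --downloadonly and --destdir. The following is only a sketch: `build_download_command` is a hypothetical helper, the paths are the placeholders from this report, and /etc/dnf/dnf.conf is assumed as the config location inside the copied root.

```python
from typing import List


def build_download_command(installroot: str, config: str, destdir: str,
                           action: str = "update") -> List[str]:
    """Build a dnf argv that resolves against a copied root but only downloads.

    Hypothetical helper; assumes a dnf release that has --destdir.
    """
    return [
        "dnf",
        f"--installroot={installroot}",  # root holding the copied rpmdb
        f"--config={config}",            # configuration copied from the offline system
        "--downloadonly",                # resolve and download, do not install
        f"--destdir={destdir}",          # where the downloaded RPMs should end up
        action,
    ]


if __name__ == "__main__":
    # Print the command line instead of running it, so it can be reviewed first.
    print(" ".join(build_download_command(
        "/some/path", "/some/path/etc/dnf/dnf.conf", "/some/other/path")))
```

The list form can be handed to `subprocess.run` unchanged once the paths are real; without `-y`, dnf will still ask for confirmation.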

BTW, the repository option 'copy_local' was also useful here (mostly for testing): even if the repository was local to the downloading machine, all packages were stored in the same place.

Version-Release number of selected component (if applicable):
dnf-0.6.4-7.fc21.noarch

How reproducible:


Steps to Reproduce:
1. dnf --downloadonly --downloaddir=/tmp install gimp

Actual results:
No such command: --downloadonly. Please use /bin/dnf --help
It could be a DNF plugin command.

Expected results:
Downloads packages into /tmp.

Additional info:
Comment 1 Honza Silhan 2015-11-09 08:36:13 EST
We will consider this.
Comment 2 Fedora Admin XMLRPC Client 2016-07-08 05:25:46 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 3 Jaroslav Mracek 2017-05-31 10:57:59 EDT
The option (--destdir) will be implemented by pull request https://github.com/rpm-software-management/dnf/pull/810
Comment 4 Fedora Update System 2017-07-24 10:50:21 EDT
libdnf-0.9.3-1.fc26 dnf-plugins-core-2.1.3-1.fc26 dnf-2.6.2-1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-6f4c06b2d7
Comment 5 Fedora Update System 2017-07-25 00:26:25 EDT
dnf-2.6.2-1.fc26, dnf-plugins-core-2.1.3-1.fc26, libdnf-0.9.3-1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-6f4c06b2d7
Comment 6 Fedora Update System 2017-07-25 12:55:14 EDT
dnf-2.6.2-1.fc26, dnf-plugins-core-2.1.3-1.fc26, libdnf-0.9.3-1.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.
Comment 7 Ali Akcaagac 2017-08-08 06:16:01 EDT
First of all... Thanks for implementing this feature... You've pretty much taken a big burden off our backs, because we can now consider switching to dnf for our infrastructure.

After some tests with --downloaddir, we figured out that the implementation inside dnf is not behaving like the implementation inside yum.

Problem case:

When downloading packages with *yum-deprecated* and --downloaddir, yum checks inside the --downloaddir whether the package has already been downloaded. If so, the download of that package is skipped.

dnf doesn't do that. dnf relies on the cache dir inside the --installroot (e.g. if --installroot is not used it defaults to /var/*). The comparison of what's already downloaded is done with the cache directory only.

In our process we use --downloaddir (within yum-deprecated) to make yum-deprecated aware of what's downloaded already (inside the downloaddir) and have it skip re-downloading said package again. We also "yum-deprecated clean all" afterwards, so the entire cache is cleaned from downloaded packages and metadata (preparing it for the next day).

If we do *dnf clean all* and then *dnf install group libreoffice --downloaddir=/opt/test --releasever=26 --downloadonly -y*, dnf re-downloads all packages and copies them to the downloaddir... without checking inside the downloaddir whether a package has already been downloaded (and so skipping the download).

We'd like to download only "new" packages and skip those already present in the downloaddir... not re-download 800-1000 MB each day.

The cache has to be cleaned, because we otherwise end up having a dumpster of new and old packages residing in the cache... and now imagine this if we do this for fedora 26, fedora 27, fedora 28, rhel, centos, rpmfusion, copr across multiple releases on one machine, that does these tasks every day...

So please *if possible* before downloading new packages, also take the downloaddir into the comparison... to check whether the package was already downloaded before... skipping it, if already there and downloading if missing (or new).
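
The requested behaviour (skip anything whose file name is already in the downloaddir) can be sketched as follows. This is not dnf or yum code: `packages_to_download` is a hypothetical name, and the comparison is by file name only, with no checksum or signature check.

```python
import os
from typing import Iterable, List


def packages_to_download(wanted: Iterable[str], downloaddir: str) -> List[str]:
    """Return the wanted RPM file names that are not yet in downloaddir.

    Hypothetical helper: files are matched by name only, so a file with the
    same name but different content would be treated as already downloaded.
    """
    if os.path.isdir(downloaddir):
        present = set(os.listdir(downloaddir))
    else:
        present = set()  # nothing downloaded yet
    return [name for name in wanted if os.path.basename(name) not in present]
```

With a flat downloaddir that is kept between runs, only the packages returned by this function would need to be fetched, even after the cache has been cleaned.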
Comment 8 Jaroslav Mracek 2017-08-08 15:04:36 EDT
OK, I will try to come up with an improvement. But what about creating a repo on top of the download dir that has a higher priority? The local dnf plugin could also help.
Comment 9 Jaroslav Mracek 2017-08-08 15:42:48 EDT
OK, there is an issue: I can add a redirection so that the downloaddir is used as the cachedir. It would work as you requested, but I don't like it for two reasons:
The same package can be available from different repositories with a different checksum (different gpgkey), in which case it would be downloaded again.
If you use the conf option keepcache=True, updates will also be available to the running machine without needing to install from the cachedir. Updates just installed on the machine can be redirected to the download dir without needing to download them again.
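
The first concern (same file name, different content) is what a checksum comparison guards against. A minimal sketch, assuming the expected SHA-256 of each package is already known from repository metadata; `sha256_of` and `already_downloaded` are hypothetical helpers, not dnf API.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large RPMs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def already_downloaded(destdir: str, filename: str, expected_sha256: str) -> bool:
    """True only if the file exists in destdir and its content matches.

    A file with the right name but a different checksum counts as missing,
    so it would be fetched again.
    """
    path = os.path.join(destdir, filename)
    return os.path.isfile(path) and sha256_of(path) == expected_sha256
```

The trade-off is speed: hashing every file in a multi-gigabyte download directory costs far more than a file-name lookup.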



I have a diff here:
diff --git a/dnf/base.py b/dnf/base.py
index d398af5..f922bd2 100644
--- a/dnf/base.py
+++ b/dnf/base.py
@@ -1005,11 +1005,15 @@ class Base(object):
         if progress is None:
             progress = dnf.callback.NullDownloadProgress()
 
+        if self.conf.destdir:
+            dnf.util.ensure_dir(self.conf.destdir)
+            self.repos.all().pkgdir = self.conf.destdir
+
         lock = dnf.lock.build_download_lock(self.conf.cachedir, self.conf.exit_on_lock)
         with lock:
             drpm = dnf.drpm.DeltaInfo(self.sack.query().installed(),
                                       progress, self.conf.deltarpm_percentage)
-            remote_pkgs = self._select_remote_pkgs(pkglist)
+            remote_pkgs, local_repository_pkgs = self._select_remote_pkgs(pkglist)
             self._add_tempfiles([pkg.localPkg() for pkg in remote_pkgs])
 
             payloads = [dnf.repo._pkg2payload(pkg, progress, drpm.delta_factory,
@@ -1080,8 +1084,7 @@ class Base(object):
             logger.info(msg, full / 1024 ** 2, real / 1024 ** 2, percent)
 
         if self.conf.destdir:
-            dnf.util.ensure_dir(self.conf.destdir)
-            for pkg in pkglist:
+            for pkg in local_repository_pkgs:
                 location = os.path.join(pkg.repo.pkgdir, os.path.basename(pkg.location))
                 shutil.copy(location, self.conf.destdir)
 
@@ -2253,7 +2256,7 @@ class Base(object):
         if error:
             raise dnf.exceptions.Error(
                 _("Some packages from local repository have incorrect checksum"))
-        return remote_pkgs
+        return remote_pkgs, local_repository_pkgs
Comment 10 Jaroslav Mracek 2017-08-08 15:48:55 EDT
I created a patch (https://github.com/rpm-software-management/dnf/pull/889), but I'm still not sure this is a good idea. You will probably have to convince me and the other dnf maintainers and developers.
Comment 11 Ali Akcaagac 2017-08-08 16:39:53 EDT
Thank you for your time and the patch. I will be testing this tomorrow if I find some spare time within the job.

How to convince you and other dnf maintainers...

Good question.

The main thing would be "better compatibility with yum", I would say! It would be as if someone changed grep, sed or awk to be totally different, thus breaking things that other people might have built around them. Even Linus Torvalds (to bring him up here) says that compatibility should be kept by all means.

I am not sure how yum-deprecated (referred to as "yum" below) did this internally. But yum checked the downloaddir to see whether the file had already been downloaded. Judging by its performance, I'd say yum did not do any checksum verification.

The local dnf download plugin (last time I checked it) wasn't able to download groups, nor did it ask for permission to download... It simply downloads...

Coming back to dnf core:

dnf behaves like yum in this regard... --downloaddir with --downloadonly asks whether you want to proceed or abort (unless you pass -y on the command line). That is the expected behaviour... So far everything is good...

The issue that different repositories may carry the same file is indeed a valid concern. The same applies to identically named files with different gpg signatures...

Maybe adding another command-line parameter that optionally turns on the downloaddir comparison (and, if a file with the same filename exists, skips downloading it) could help.

That way you can keep the downloaddir behaviour the way you intended to implement it... and the optional parameter turns on yum compatibility (behave as yum)... The documentation can then explain that anyone using this option acknowledges that files are compared by name only, not by checksum and not by gpg key...

Looking at the patch, it's just a 7-liner... No huge chunks of code that need to be maintained... so offering a separate --compatibility option for --downloaddir to switch the behaviour (yum-style or dnf-style) may be an option... Both behaviours can then be implemented cleanly and maintained easily...

Just an idea...

Compatibility would always be my first argument... The extra command-line argument would be, from my point of view, an acceptable way to go...

Optionally let dnf downloaddir test for a file named "FIRST" in the downloaddir... If it finds it then it will compare downloaded files and skip if they exist... (another idea in how to retain both ways of handling downloaddir). It would be no problem "touch FIRST" in all our dirs..

But right now... the current implementation... would require us to download approx. 2660 explicitly chosen rpm packages (2.2 GB) for Fedora 26 x86_64 each day, over and over again... And we deal with Fedora 25, 26, 27, rawhide, RPMFusion, RHEL, CentOS and 2 different architectures every day...

So we end up downloading approx. 30 GB of files again and again every day just to filter out the new packages (or we turn off cache cleaning, which then clutters the server)... rather than downloading just the difference (the new ones), which amounts to only a few hundred megabytes (in the worst case) every day (the way yum deals with it)...

We don't install with dnf... we only resolve and download packages with dnf and store the files in different *flat* download directories for further processing. So we basically *use* it as a package manager, dependency resolver and download tool... the same way we use yum... The use case is a bit different from what was probably intended originally...

So may I convince you to *maybe* add another option that accompanies the --downloaddir option to turn the yum behaviour (most likely your 7 lines patch) on - optionally...
Comment 12 Jaroslav Mracek 2017-08-09 04:05:16 EDT
Thanks for explanation.
Comment 13 Ali Akcaagac 2017-08-09 06:38:47 EDT
Coming back to the patch...

I've tested it and yes! It mimics exactly how yum dealt with it. It would be a big help for us if you could somehow get it accepted upstream. It would be fine even if it is enabled by a command-line option accompanying --downloaddir (in case you'd like the default to be your way).

What we like about dnf here is that it first downloads the files (and new files) into the cache... yum downloaded the files directly into the --downloaddir... That caused problems if yum was aborted... it left unfinished downloads in the downloaddir named e.g. mc-4.18.x-1.fc26.rpm.23423.tmp (basically yum appended its PID and .tmp to the file name until the download finished). We had to add a separate process to our infrastructure to handle these tmp files after an unforeseen interruption (and where yum finishtransactions failed). With dnf we can even improve our own tooling and get rid of that tmp handling.
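
The kind of tmp-file handling described above might look like the sketch below. The file-name pattern is inferred from the single example in this comment (`<name>.rpm.<pid>.tmp`), and `find_stale_tmp_files` is a hypothetical helper, not part of yum or dnf.

```python
import os
import re
from typing import List

# Assumed pattern, taken from the one example file name in this comment:
# an RPM file name followed by a numeric PID and a .tmp suffix.
STALE_TMP = re.compile(r"\.rpm\.\d+\.tmp$")


def find_stale_tmp_files(downloaddir: str) -> List[str]:
    """Return full paths of leftover partial downloads in downloaddir."""
    return sorted(
        os.path.join(downloaddir, name)
        for name in os.listdir(downloaddir)
        if STALE_TMP.search(name)
    )
```

The function only lists candidates; whether to delete them is left to the caller, since a matching file could belong to a download that is still running.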

So, if you can make it happen, then we can finally switch over... With all fingers crossed our own migration should be possible within a few hours... After that only longtime usage can uncover further issues... But we deal with that once it happens... One step after another...
Comment 14 Ali Akcaagac 2017-08-09 12:07:42 EDT
JFYI

We migrated!

1) we patched dnf with your patch (which we depend on)
2) we replaced yum by dnf
3) we had to add --setopt=strict=0, because we use one top-level module list for all the different rpm-based distros and not all of them offer the same packages, so skipping missing ones was mandatory

4) everything ran out of the box...

with one exception, which is purely cosmetic...

yum showed the real number of MB that had to be downloaded (because it skipped the packages, so the real download size changed), whereas dnf shows how much it resolved and the total amount of MB downloaded... but we can live with that...

what matters is the underlying technical functionality of grabbing the packages and distributing them to their target directories... and that works like a charm... now we'd really like to see this patch merged (somehow) upstream... we'll keep the yum scripts just in case ;)
Comment 15 Jaroslav Mracek 2017-08-10 03:07:18 EDT
Thanks for testing. I made a fixup that should solve the problem with the incorrect download size. I hope it helps.
Comment 16 Ali Akcaagac 2017-08-10 06:07:48 EDT
We just applied the other patches found here:

https://github.com/rpm-software-management/dnf/pull/889/commits

base.py, cli.py

... and confirm that this is as close as it can get to the behaviour of yum --downloaddir. Even the calculation of what has to be downloaded is correct, and so is the way it's presented.

We are running our system for the 2nd day now and haven't had any issues. We were able to remove our old tmp files handling... and even got rid of our blacklisting packages that had some weak dependency issues which were not covered by yum.

Btw: If you still feel unsure or don't want to add yet another command-line option, we would also be thankful if this could be made available via --setopt=compat=1 (for example). Or simply commit it as it is, because it's as close to compatible as it can get... You even make sure not to overwrite people's files without asking, since they are skipped :)

Anyways... Thank you...
Comment 17 Jaroslav Mracek 2017-08-11 03:23:19 EDT
Thanks for testing and comments. Now I fully support the merge of the new implementations.
Comment 18 Ali Akcaagac 2017-08-18 12:02:49 EDT
Some further feedback: we have now been running our *entire* infrastructure with these changes for over a week. Not a single issue found.

Thanks again.
Comment 19 Fedora Update System 2017-10-02 06:36:06 EDT
dnf-plugins-extras-2.0.3-1.fc27 dnf-plugins-core-2.1.4-1.fc27 dnf-2.7.2-1.fc27 libdnf-0.10.1-1.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-faf235c683
Comment 20 Fedora Update System 2017-10-02 06:38:41 EDT
dnf-plugins-extras-2.0.3-1.fc26 dnf-plugins-core-2.1.4-1.fc26 dnf-2.7.2-1.fc26 libdnf-0.10.1-1.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-70a8618065
Comment 21 Fedora Update System 2017-10-02 16:29:18 EDT
dnf-2.7.2-1.fc26, dnf-plugins-core-2.1.4-1.fc26, dnf-plugins-extras-2.0.3-1.fc26, libdnf-0.10.1-1.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-70a8618065
Comment 22 Fedora Update System 2017-10-02 17:28:29 EDT
dnf-2.7.2-1.fc27, dnf-plugins-core-2.1.4-1.fc27, dnf-plugins-extras-2.0.3-1.fc27, libdnf-0.10.1-1.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-faf235c683
Comment 23 Fedora Update System 2017-10-04 10:24:20 EDT
dnf-2.7.2-1.fc27, dnf-plugins-core-2.1.4-1.fc27, dnf-plugins-extras-2.0.3-1.fc27, libdnf-0.10.1-1.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
Comment 24 Fedora Update System 2017-10-04 18:26:13 EDT
dnf-2.7.2-1.fc26, dnf-plugins-core-2.1.4-1.fc26, dnf-plugins-extras-2.0.3-1.fc26, libdnf-0.10.1-1.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.
