Red Hat Bugzilla – Bug 1218415
Don't replace packages in Fedora Updates repository - keep them all
Last modified: 2016-01-11 08:33:01 EST
Description of problem:
"dnf history undo" is crucial many times and I expect possible undo of last transaction at least. However it's (for me totally impossible) not always possible revert at least last transaction, when you have not set paramter "keepcache=1" inside dnf.conf.
However, parameter "keepcache" is missing in dnf.conf by default. So, should be possible ask users during first update (when parameter keepcache miss) if they want to keep cache (and may notice that sometimes they should clean cache manually) or not and warn about possible future problems? After that, set keepcache inside dnf.conf according to chosen option.
It isn't instrusive solution and I think that could be really very helpful for other people and it can save really lots of time of everyone in future.
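For reference, the parameter under discussion lives in the `[main]` section of /etc/dnf/dnf.conf; a minimal fragment with the requested setting would look like this (a sketch of the proposed default, not the configuration Fedora currently ships):

```ini
[main]
# Keep downloaded packages in the local cache after the transaction,
# so "dnf history undo" can reinstall them without re-downloading.
keepcache=1
```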
No. We cannot emit warnings for any option in the config. I propose another solution, since there is demand to make keepcache=1 the default:
a pkgcachelimit option could, from time to time, erase packages over the limit if it is set.
So by default there would be keepcache=1 and pkgcachelimit=<reasonable_value> set.
*** Bug 1211891 has been marked as a duplicate of this bug. ***
I think that if clearing the cache happened only on the successful completion of a DNF transaction, that would be enough, and "keepcache=1" by default would not be needed.
Is it possible to fix this in time for Fedora 22? If not, is there an easy workaround? (Maybe set keepcache=1 and implement pkgcachelimit later?)
J. Haas: the workaround for this is to set "keepcache=1" yourself. I know there are a lot of users who would not welcome this change without the "pkgcachelimit" option; for example, you may want to keep Docker images minimal.
*** Bug 1220074 has been marked as a duplicate of this bug. ***
*** Bug 1221062 has been marked as a duplicate of this bug. ***
*** Bug 1220666 has been marked as a duplicate of this bug. ***
I'm not too happy about this bug. Can we change the Version to Fedora 22 and mark it as a regression instead of a future enhancement? This is probably important enough to be a blocker, but I would really like to see it fixed soon.
I see there has recently been demand for this, and normal users expect the cache to be kept. The background of this change is:
* at first, the cache was removed completely
* then the change was reverted and the option was set to off by default (bug 1046244)
The other proposal was to implement a `keepcache_after_abort` plugin which would keep packages until the next successful operation; it could possibly be added to comps in Fedora Workstation.
J. Haas: Who says that keeping the cache is the proper solution? That way, with each installation, packages take up twice as much disk space. Setting the option to `true` by default would be easy, but we want to find the proper solution. Half of the users want it `true`, the rest `false`...
(In reply to Jan Silhan from comment #11)
> J. Haas: Who says that keeping the cache is the proper solution?
I don't. The main issue is that all downloaded packages are thrown away when some minor error happens during download or installation. Details are in all the bugs that have been marked as duplicates of this one.
Obviously the cache should be cleaned when it is no longer needed (when the installation succeeded, or whatever) so as not to waste a huge amount of disk space, unless the user explicitly says he wants to keep the whole cache.
I'm not sure pkgcachelimit is the right solution; keepcache_after_abort seems more useful to me. But I don't want to be the one to decide. I just see that a lot of users might get angry and frustrated because they will need to re-download huge amounts of updates after minor installation problems. yum, aptitude, and other package managers handle this case much better.
+1 for keepcache_after_abort solution.
keepcache_after_abort isn't a solution (if I understand its behaviour correctly) for the case when the installation completes successfully and you then find out that the updated system/packages are unusable for you; that's why I reported this bug originally.
Clearing the cache only after the successful completion of DNF is a half-measure that helps only in some cases. 90% of the cases where I need to undo the last transaction come after a successful update.
If you want this as the default, OK. But IMO pkgcachelimit has a much greater use case.
There are three requests, and I'm afraid we cannot fulfil all of them with a single solution:
* print a warning that, with keepcache == 0, Fedora users may not be able to undo transactions
* implement a feature that allows Fedora users to undo transactions regardless of the state of the Fedora infrastructure
* don't re-download (meta)data if a transaction fails
...while it is not acceptable to keep every downloaded package forever...
I think that marking them as duplicates was a bit premature.
I see bug 1220074 has been reopened. Maybe move the duplicates and the Fedora 22 discussion there.
I've just upgraded to fc22, and I think keepcache=1 should be set by default.
Using yum with many servers on my network, I mounted all of their /var/cache/yum directories from an NFS share on a fileserver to avoid multiple metadata and package downloads.
It worked well. I could update all the machines sequentially using a script issuing commands over ssh, and any given package was downloaded only once.
DNF does not appear to apply the same joined-up thinking to a shared /var/cache/dnf.
Without keepcache=1, if any machine cleared the cache afterwards, then another machine would need to re-download everything, and so on.
The other issue appears to be that if one machine is updated and consistent, the next machine to be updated reads the shared cache of the previous one and assumes it is also already updated.
DNF should be able to make use of a shared download and metadata cache while also tracking local machine status locally. It doesn't appear to.
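The yum-era setup described above amounts to one fstab entry on each client (the server name and export path here are illustrative, and as noted the machines must not update concurrently):

```
# /etc/fstab on each client: share the package cache over NFS
# (fileserver:/export/yumcache is a hypothetical export)
fileserver:/export/yumcache  /var/cache/yum  nfs  defaults  0 0
```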
AFAIK, /var/cache/dnf is internal and you are not supposed to access or manipulate it; at most, you can remove it. It is the place where DNF tracks the local machine's status locally.
IIUIC,  is going to serve your needs the best.
Then the design of dnf is more flawed than that of the yum it deprecates!
It is not good for 5, 50, or 500 Linux machines on a network to download the same packages n times.
Each rpm or drpm should be downloaded *once* and reused among the other machines on the network.
If having a truly local directory for machine state isn't possible, then local data could be prefixed with $(hostname -s)- on the NFS share, though with a 500-machine network there would be many files.
The dnf cache update service would also need to be disabled on all but one machine.
I'm not too concerned about what people are "supposed" to do; dictating how programs and users should work is a no-no if the paradigm is flawed, regresses a working paradigm, and makes workarounds difficult.
Then there's the case of discovering that all my complex iptables rules have been replaced and deleted, but I digress. I hope someone will think about these things when upgrading!
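The per-host naming idea above ($(hostname -s) as a key on the shared cache) can be sketched in shell. The base path and layout are hypothetical; a real deployment would point each machine's cache directory at the result.

```shell
# Sketch only: give each machine its own subdirectory on a shared cache,
# keyed by short hostname, so per-machine state does not collide.
# The base path is hypothetical; a real setup would use the NFS mount point.
base="${CACHE_BASE:-${TMPDIR:-/tmp}/shared-pkgcache}"
host="$(hostname -s 2>/dev/null || uname -n)"
cachedir="${base}/${host}"
mkdir -p "$cachedir"
echo "$cachedir"
```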
Of course, no one has ever made a design decision that administrators of 500 machines should be forced to download the same packages n times. The problem is simply that no one has implemented a feature supporting this use case so far. I think we can track this request in bug 1218415.
I mean bug 1185741.
(In reply to Andrew Haveland-Robinson from comment #21)
> It is not good for 5, 50, or 500 Linux machines on a network to download
> the same packages n times.
> Each rpm or drpm should be downloaded *once* and reused among the other
> machines on the network.
The other side of the coin is that you are probably using the wrong tool for your task. Have you ever considered using Pulp or other repository management software?
+1 for keepcache_after_abort solution.
It happens regularly that I have to Ctrl-C it; it just happened today and cost me half a gigabyte of data to re-download for absolutely no good reason (and twice, because I just could not believe that the package data would have been removed before being used, before I spent yet more time hunting for this ticket).
> The other side of the coin is that you are probably using the wrong tool for your task.
Very strongly disagree, for *me*. I have scp'd/rsync'd RPM packages from one box to another to speed up package updates and installation. Telling *all* users that they should be using repository management software is complete overkill and inappropriate. There are valid use cases that are now broken.
(In reply to Antoine Martin from comment #25)
> +1 for keepcache_after_abort solution.
> It happens regularly that I have to Ctrl-C it; it just happened today and
> cost me half a gigabyte of data to re-download for absolutely no good
> reason (and twice, because I just could not believe that the package data
> would have been removed before being used, before I spent yet more time
> hunting for this ticket).
+1 for keepcache_after_abort. I had an issue with a file from the .x86_64 version of a package conflicting with the .i686 version, and dnf would re-download a couple hundred MB of rpms just to delete them every time. If I were on a metered or slow connection, I'd be rather unhappy right now.
I've been a sysadmin of RH Linux since 1997, before Fedora was a twinkle in someone's eye!
I tried maintaining a local repository for a while: it was overkill for a small network, complicated, it downloaded many gigabytes of packages that would likely never be used, it was something else that needed attending to, and it didn't solve local cache duplication on the destination machines.
In contrast, I've used an NFS-shared /var/cache/yum non-concurrently for years and have a cache history going back to fc17 containing only the packages that some machine has actually required and downloaded.
(It also allowed a 4GB EEE PC to be upgraded to fc21 by freeing up valuable space.)
dnf shared caching doesn't work, so I am sticking with yum-deprecated on the two fc22 machines I upgraded, and holding back on upgrading the rest until someone solves shared caching, without any new machine inadvertently deleting the cache for everyone else!
+1 for keepcache_after_abort
What the hell kind of **** is this, deleting GBs of downloaded packages just because of some minor conflict that prevented their installation?
Reassigning to Fedora infrastructure to maintain all built packages in the Fedora Updates repos. Do not replace them, so that users can do "dnf/yum history undo" without every single user having to locally store copies of previous updates. Why should we duplicate the packages on multiple machines?
FYI, this report mixes in the DNF bug of not keeping packages after an unsuccessful transaction. The re-download of packages when an error occurs is tracked in bug 1220074; this will be fixed in DNF upstream soon.
(In reply to Jan Silhan from comment #30)
> Reassigning to Fedora infrastructure to maintain all built packages in the
> Fedora Updates repos. Do not replace them, so that users can do "dnf/yum
> history undo" without every single user having to locally store copies of
> previous updates. Why should we duplicate the packages on multiple machines?
> FYI, this report mixes in the DNF bug of not keeping packages after an
> unsuccessful transaction. The re-download of packages when an error occurs
> is tracked in bug 1220074; this will be fixed in DNF upstream soon.
If we kept every single rpm pushed out in the updates repos over the life of a release, the mirrors would revolt and we would not be able to ship anything. I do not see any way to do what is being asked here.
I just faced this issue after installing the 64-bit v22. I tried installing using yum, which redirected me to dnf. I tried to install the KDE desktop; after about 80% of a 350MB download, it gave a 'can't find mirrors' error and stopped. When I restarted the installation, it started downloading the packages again. I am on a 512kbps connection, and it took some 2-3 hours. Keeping my fingers crossed that it doesn't stop again.
*** Bug 1297189 has been marked as a duplicate of this bug. ***