Red Hat Bugzilla – Bug 1255526
dnf-1.0.2-1 and later removes the old DNF's cache
Last modified: 2015-08-25 10:38:10 EDT
Description of problem:
Experimenting with F22 and dnf shows that putting keepcache=true into the dnf.conf file doesn't work. The directory tree built under /var/cache/dnf/x86_64/22 while dnf is bringing down RPMs from various repos disappears entirely once the dnf update is completed. Some RPMs end up in a /var/cache/PackageKit area, but those represent only about 20% of all the RPMs that were downloaded.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Place keepcache=true into dnf.conf.
2. Run dnf update for the first time on a brand-new OS install. That should bring down about 600 RPMs.
3. Monitor the RPMs being downloaded into the tree under /var/cache/dnf/x86_64/22.
4. Once the downloads are completed and the actual updating starts, the RPMs are there through the installing/updating stage and, I believe, also during the cleanup stage. By the start of the verifying stage, the tree and all the RPMs have disappeared.
Downloaded RPMs should remain available so they can be shipped to other machines on the local network without using Internet resources.
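For reference, the relevant dnf.conf fragment (the [main] section already exists in a stock install; keepcache is the only line being added here):

```
[main]
keepcache=true
```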
This is important to our research efforts here in the Caribbean. We have limited, expensive bandwidth. Under yum, we would download RPMs to only one master machine (keepcache=1) and then rsync its cached RPMs to the other machines on the local network, so that when they issued a yum update, the bulk of what they needed was already in their caches and Internet bandwidth was only used for the minor differences the master machine didn't have.
Research vessels spend weeks at sea and only have Internet access via satellite. Updating dozens of machines on board, where each has to download 99% of the same files, consumes bandwidth we don't have and eats up money better spent on research.
I wanted to rewrite our yumrsync script to conform to the new dnf directory hierarchy, but I can't even get dnf to save the RPMs for the master's dnf update. I actually have no idea whether using our rsync script is even possible under dnf, but I hope it is.
In case there are any developers/maintainers listening:
The average business has the bulk of its machines configured alike. I usually create a master machine (mine) and then set up new machines for others, to mimic what I believe their environment should look like.
In the yum days, I wrote a script (yumrsync) that would copy RPMs from a master machine's cache (usually mine) to any target machine's cache. When that target machine issued a yum update, the bulk of what it could possibly want was already located in its local cache, so 99 times out of 100 it didn't use the Internet connection to download any RPMs. This relied on the keepcache=1 setting in yum.conf on all machines.
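The yumrsync script itself isn't included in this report; a minimal sketch of the idea under yum might look like the following (the host name is hypothetical, and a dry-run wrapper prints each command instead of executing it):

```shell
#!/bin/sh
# Sketch of a yumrsync-style cache copy (NOT the reporter's actual
# script; TARGET is a made-up host name for illustration).
TARGET=lab-box-01
SRC=/var/cache/yum            # yum's cache root (requires keepcache=1)

run() { echo "+ $*"; }        # dry run: print instead of execute

# Copy the master's cached RPMs into the target's equivalent location.
run rsync -av "$SRC/" "root@$TARGET:$SRC/"
```

With the dry-run wrapper removed, this is the flow the report describes: seed the target's cache, then let its yum update find most packages locally.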
Experimenting with dnf on Fedora 22 shows that keepcache=true in the dnf.conf file doesn't work. At the end of a dnf update, the RPMs that were downloaded disappear. Some of the RPMs (about 20%) end up at /var/cache/PackageKit/..., but they're useless without the rest of the missing RPMs.
Personally, I'd settle for the ability to upgrade my yumrsync script to work against the new dnf directory structure, but I can't even get dnf to save the RPMs. Maybe PackageKit is the culprit - I don't know.
For the developers/maintainers, please consider the following.
The entire world doesn't have cheap, high-bandwidth Internet access. Forcing every dnf update to suck down RPMs that were already brought down by another local machine is a waste of a precious resource and costs money many can't spare. What's needed isn't reposync, but a composite-repo capability that allows one local machine to be the master repo for any RPMs other local machines could possibly want. That local master repo holds the RPMs it got from various real repos - it may have brought down RPMs from fedora, fedora-updates, google-chrome, rpmfusion, etc.
Other machines on the local network should look to the master machine for any RPMs they may need, regardless of which repo was the original source. Only RPMs that the master machine doesn't possess should follow the traditional path to a real repo over the Internet. The reposync utility may be good for General Motors, but not for us.
We have research vessels that don't hit port for weeks and rely on limited-bandwidth, expensive satellite connections to move Internet packets. When a laptop falls overboard (don't ask), a new one has to be spun up quickly to take its place. Waiting for hours to download RPMs that were already downloaded in the past is unacceptable. We need a mechanism to save RPMs and make them available to the new machine regardless of which repo they came from. We need a local composite repo.
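DNF doesn't provide such a composite repo, but one way to approximate it (an assumption on my part, not anything from this report) is to publish the master's cached packages as an ordinary repo - run createrepo_c over the collected RPMs on the master and serve the result over HTTP - then point each client at it with a .repo file such as (host name hypothetical):

```
# /etc/yum.repos.d/local-master.repo on each client (hypothetical host)
[local-master]
name=Local master composite repo
baseurl=http://master.local/repo/
enabled=1
gpgcheck=0
```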
I cannot reproduce it; keepcache=true works for me. Can you please upgrade your DNF and retest it? Please note that DNF has recently changed the structure of the cache. There are no "x86_64" and "22" subdirectories now.
FYI, we are working on improving the situation around the various efforts to share the RPM cache between multiple devices (and many other use cases). There is a new library which should help once we integrate DNF with it. Hopefully, once that is done, sysadmins can stop maintaining thousands of similar cache-sharing scripts and simply let DNF do it for them. In the case of your vessels, you are going to need a script anyway - I assume the systems at sea have no cheap connection even to your master machine, so the script would still have to synchronize the caches when they are nearby (others could just point the slave machines at the master one) - but the script would be a bit simpler, I guess.
You must not be doing what I'm doing, as I've installed F22 on a new box at least 10 times now and I get the same result every time - the cache is gone. Just last night, the test box displayed a message saying new software was available; I did a dnf update to see what would happen, and the 5 files that came down did remain in the cache. So there's a difference between how dnf functions on the initial install and how it works thereafter, when it has probably been replaced by an updated version. I need the 500+ files that come down after a new install.
I'm installing from scratch using the workstation ISO - Fedora-Live-Workstation-x86_64-22-3.iso. With it, I have no option to upgrade to a new dnf; it is what it is.
BTW - a master is kept aboard the vessel, and it gets its updates the normal way over a satellite connection, usually at night. From that master, someone uses my script to distribute RPMs to any and all machines that run Fedora.
OK, I probably know what happened, although I cannot confirm it now. As I said, we have changed the structure of the cache. Since the current DNF is not able to work with the old cache, we decided to remove the old one; this has been done since dnf-1.0.2-1. So every time you install dnf-1.0.2-1 or later, it removes /var/cache/dnf/x86_64. Because in your case the transaction contains an upgrade to dnf-1.1.0-2, the cache is removed at the end. But once you have upgraded to dnf-1.0.2-1 or later, keepcache starts working again.
One way to reduce the negative effect of this change is to run "dnf upgrade dnf" first and then upgrade the rest ("dnf upgrade"). This way, the remaining packages should stay in the cache.
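The suggested workaround, as a dry-run sketch (the wrapper just prints each command; drop it to actually run them):

```shell
#!/bin/sh
# Upgrade dnf by itself first, so the cache-clearing installation
# scriptlet in dnf-1.0.2-1+ runs before the bulk download; then
# upgrade the rest, which stays in the new-layout cache when
# keepcache=true is set.
run() { echo "+ $*"; }        # dry run: print instead of execute

run dnf -y upgrade dnf
run dnf -y upgrade
```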
I apologize for this inconvenience. Do you need a fix or can you live with that?
I'll try your suggestion. I'll do an absolutely fresh install and then immediately do the "dnf upgrade dnf" to see what happens. Then I'll bounce the box and do a dnf upgrade and see how that turns out.
The download of the 500+ RPMs takes about 4 hours on our Internet soda straw, so I'll report back later today or tonight.
I did a fresh F22 install.
Fixed dnf.conf to add keepcache=true.
Ran dnf upgrade dnf.
Bounced the box to make sure no old libraries were loaded.
Checked and there was an x86_64/22 directory.
Checked and my keepcache=true survived the upgrade.
Ran dnf makecache to try to force it to get rid of the bogus directories, but that did nothing; x86_64/22 was still there.
Ran dnf install emacs as a minor test due to my limited bandwidth.
The x86_64/22 directories are still there, and the RPMs that came down for emacs are where they should be, so now I'll run the dnf upgrade to bring down the 500+ RPMs that I really want.
Next, I'll see if seeding another box's packages directories with the 500+ RPMs works as it used to under yum. I'll move them manually before I waste time scripting it, in case it doesn't work.
I'll keep you posted.
Well, the directory is removed immediately when the new DNF is installed; it's part of the installation script. But it seems that the old DNF, which performs the upgrade, recreates the directory before it finishes - it writes just one JSON file there.
The important thing is that after "dnf upgrade dnf", the /var/cache/dnf/x86_64/22 directory does not contain the downloaded packages.
It's possible that after the emacs install, the directory stayed there. It will probably stay there forever (which might be considered a bug). But the important things happen in the new /var/cache/dnf/fedora-* and /var/cache/dnf/updates-* directories; there, you should have seen the new emacs packages.
Spun up another box and went through the previously specified steps, omitting the makecache as it did nothing. Of those steps, installing something tiny like emacs is important because it forces the creation of the "packages" directories that hold the cached files. I use emacs as my editor, so that was the obvious thing for me to install, but anything small will do.
I scp'd the RPMs from the first box to the equivalent locations on the second box.
scp /var/cache/dnf/fedora-d174f3c3f2691dd5/packages/* brick10:/var/cache/dnf/fedora-d174f3c3f2691dd5/packages
scp /var/cache/dnf/updates-d28e3be95240972f/packages/* brick10:/var/cache/dnf/updates-d28e3be95240972f/packages
I noticed that the directory names on the first box were identical to those on the second box. I was expecting them to be random, but they weren't. (Presumably they're derived from the repo configuration rather than generated randomly, so identically configured machines get identical names.)
Ran dnf upgrade on the new box, and it took off using the seeded RPMs. GREAT!
So for now, all I have to do is update our procedures: dnf upgrade dnf, followed by a dnf install of something small, then scp the cached RPMs from the master box to a target box. The target box can then do a dnf upgrade without sucking down the files already in its cache - which was the whole point of my experiment.
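The whole updated procedure, as a dry-run sketch (commands are printed, not executed; the hash suffixes are the ones from this report and will differ per repo configuration):

```shell
#!/bin/sh
# Master-side seeding procedure as described above; brick10 is the
# target host from the report.
TARGET=brick10
run() { echo "+ $*"; }        # dry run: print instead of execute

run dnf -y upgrade dnf        # new dnf clears the old cache layout first
run dnf -y install emacs      # anything small; creates the packages/ dirs
run dnf -y upgrade            # fills the cache (needs keepcache=true)
# Ship the cached RPMs to the target's equivalent directories:
run scp "/var/cache/dnf/fedora-d174f3c3f2691dd5/packages/*" "$TARGET:/var/cache/dnf/fedora-d174f3c3f2691dd5/packages"
run scp "/var/cache/dnf/updates-d28e3be95240972f/packages/*" "$TARGET:/var/cache/dnf/updates-d28e3be95240972f/packages"
# On the target, "dnf upgrade" then pulls from the seeded cache.
```

For a real run, remove the run wrapper and unquote the globs so the shell expands them locally before scp is invoked.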
Thank you for the help in identifying what was going on. I would never have figured out that one version of dnf was nuking an older version's structure.
I haven't checked the reference you provided about the mechanism being contemplated to replace everyone's RPM-moving scripts like my yumrsync, but it occurred to me that a web proxy server is the model to follow.
Any box that issues a dnf request should be routed through a dnf proxy server. That proxy server caches absolutely everything any box brings down; every other box that requests the same RPM gets it from the proxy cache. Problem solved, and it's a fairly elegant solution.
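For what it's worth, dnf can already be pointed at an HTTP proxy via dnf.conf, so a caching proxy (squid, for example) could play that role today; a minimal fragment with a hypothetical proxy host:

```
[main]
proxy=http://dnf-cache.local:3128
```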
Thanks for sharing your solution, Bill. I consider this case resolved then -> closing this report if you don't mind.