Bug 1255526 - dnf-1.0.2-1 and later removes the old DNF's cache
Summary: dnf-1.0.2-1 and later removes the old DNF's cache
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: 22
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Packaging Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-08-20 20:12 UTC by Bill Gradwohl
Modified: 2015-08-25 14:38 UTC (History)
8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-08-25 14:38:10 UTC
Type: Bug
Embargoed:



Description Bill Gradwohl 2015-08-20 20:12:12 UTC
Description of problem:
Experimenting with F22 and dnf shows that putting keepcache=true into the dnf.conf file doesn't work. The structure that is built under /var/cache/dnf/x86_64/22 while dnf brings down rpm's from various repos disappears entirely once the dnf update completes. Some rpm's end up in a /var/cache/PackageKit area, but they represent only about 20% of all the rpm's that were downloaded.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Place keepcache=true in dnf.conf.
2. Run dnf update for the first time on a brand new O/S install. That should bring down about 600 rpm's.
3. Monitor the rpm's being downloaded into the structure under /var/cache/dnf/x86_64/22.
4. Once the downloads complete and the actual updating starts, the rpm's remain present through the installing/updating stage and, I believe, during the cleanup stage. By the start of the verifying stage, the structure and all the rpm's disappear.
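Step 1 is a single keepcache=true line in the [main] section of /etc/dnf/dnf.conf. A minimal helper to apply it idempotently might look like this (the function name and the duplicate-line guard are my own sketch, not part of dnf):

```shell
# enable_keepcache: append keepcache=true to a dnf.conf-style file unless a
# keepcache= line is already present. The path is a parameter so the helper
# can be exercised against a scratch file before touching /etc/dnf/dnf.conf.
enable_keepcache() {
    conf=$1
    grep -q '^keepcache=' "$conf" || echo 'keepcache=true' >> "$conf"
}

# real use:  enable_keepcache /etc/dnf/dnf.conf
```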
Actual results:


Expected results:
Downloaded rpm's should remain available so they can be shipped to other machines on the local network without using internet resources.

Additional info:
This is important to our research efforts here in the Caribbean. We have limited and expensive bandwidth, and under the yum system we would download rpm's to only one master machine (keepcache=1) and then rsync its cached rpm's to the other machines on the local network. When those machines issued a yum update, the bulk of what they needed was already in their cache, and internet bandwidth was only used for the minor differences the master machine didn't have.
Research vessels spend weeks at sea and only have internet access via satellite. To update dozens of machines on board where each has to download 99% of the same files consumes bandwidth we don't have and eats up money better spent on research.
I wanted to rewrite our yumrsync script to conform to the new dnf directory hierarchy, but can't get dnf to even save the rpm's for the master dnf update. I actually have no idea if using our rsync script is even possible under dnf, but I hope it is.
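For context, a yumrsync-style seeder boils down to one rsync per target machine. This is a hypothetical sketch (the function name, root@ login, and /var/cache/yum path are assumptions; the original script is not shown in this report):

```shell
# Build, but do not run, the rsync command that copies the master's cached
# rpm's to one target host; real use would execute the printed command.
CACHE=/var/cache/yum

seed_cmd() {
    printf 'rsync -a %s/ root@%s:%s/' "$CACHE" "$1" "$CACHE"
}

# e.g. on the master:  eval "$(seed_cmd brick10)"
```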

Comment 1 Bill Gradwohl 2015-08-20 21:24:11 UTC
In case there are any developers/maintainers listening:

The average business has the bulk of their machines configured alike. I usually create a master machine (mine) and then set up new machines for others, to mimic what I believe their environment should look like. 

In the yum days, I wrote a script (yumrsync) that would copy rpm's from a master machine's cache (usually mine) to any target machine's cache. When that target machine would issue a yum update, the bulk of what it could possibly want was already located in its local cache, so 99 times out of 100 it didn't use the Internet connection to download any rpm's. This relied on the yum.conf keepcache=1 setting on all machines.

Experimenting with dnf on Fedora 22 shows that keepcache=true in the dnf.conf file doesn't work. At the end of a dnf update, the rpm's that were downloaded disappear. Some of the rpm's (about 20%) end up under /var/cache/PackageKit/... but they're useless without the rest of the missing rpm's.

Personally, I'd settle for the ability to upgrade my yumrsync script to work against the new dnf directory structure, but I can't even get dnf to save the rpm's. Maybe PackageKit is the culprit - I don't know.

For the developers/maintainers, please consider the following.
The entire world doesn't have cheap, high-bandwidth Internet access. Forcing every dnf update to pull down rpm's that were already brought down by another local machine is a waste of a precious resource and costs money many can't spare. What's needed isn't reposync, but a composite-repo capability that allows one local machine to be the master repo for any rpm's other local machines could possibly want. That local master repo holds the rpm's it got from various real repos; it may have brought down rpm's from fedora, fedora-updates, google-chrome, rpmfusion, etc.

Other machines on the local network should look to the master machine for any rpm's they may need, regardless of the original repo that was the source. Only rpm's that the master machine doesn't possess should follow the traditional path and be obtained from a real repo via the Internet. The reposync utility may be good for General Motors, but not for us.

We have research vessels that don't hit port for weeks and rely on limited bandwidth and expensive satellite connections to move Internet packets. When a laptop falls overboard (don't ask), a new one has to be spun up quickly to take its place. Waiting for hours to download rpm's that were already downloaded in the past is unacceptable. We need a mechanism to save rpm's and make them available to the new machine regardless of which repo they came from. We need a local composite repo.

Comment 2 Radek Holy 2015-08-21 06:21:17 UTC
Hello,

I cannot reproduce it; keepcache=true works for me. Can you please upgrade your DNF and retest? Please note that DNF has recently changed the structure of the cache. There are no "x86_64" and "22" subdirectories anymore.



FYI, we are working on improving the situation around the different efforts to share the RPM cache between multiple devices (and many other use cases). There is a new library [1] which should help once we integrate DNF with it [2]. Hopefully, once that is done, all sysadmins can stop maintaining the thousands of similar cache-sharing scripts and simply let DNF do it for them. Well, in the case of your vessels you are going to need a script anyway, since I assume the systems at sea have no cheap connection even to your master machine, so the script would still have to synchronize the caches when they are nearby (others would just point the slave machines at the master one), but the script would be a bit simpler, I guess.

[1] https://github.com/james-antill/CAShe
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1185741

Comment 3 Bill Gradwohl 2015-08-21 14:14:15 UTC
Radek:

You must not be doing what I'm doing, as I've installed F22 on a new box at least 10 times now and I get the same result every time - the cache is gone. Just last night, the test box displayed a message saying new software was available, and I did a dnf update to see what would happen; the 5 files that came down did remain in the cache. So, there's a difference between how dnf functions on the initial install and how it works thereafter, when it has probably been replaced by an updated version. I need the 500+ files that come down after a new install.

I'm installing from scratch using the workstation iso - Fedora-Live-Workstation-x86_64-22-3.iso. With it, I have no option to upgrade to a new dnf; it is what it is.

BTW - A master is kept aboard the vessel, and it is used to get its updates the normal way by using a satellite connection, usually at night. From that master, someone uses my script to distribute rpm's to any and all machines that run Fedora.

Comment 4 Radek Holy 2015-08-21 15:57:14 UTC
OK, I probably know what happened. Unfortunately, I cannot confirm it right now. As I said, we have changed the structure of the cache. Since the current DNF is not able to work with the old cache, we decided to remove the old one; this has been done since dnf-1.0.2-1. So, every time you install dnf-1.0.2-1 or later, it removes /var/cache/dnf/x86_64 [1]. Because in your case the transaction contains an upgrade to dnf-1.1.0-2, the cache is removed at the end. But once you have upgraded to dnf-1.0.2-1 or later, keepcache starts working again.

One way to reduce the negative effect of this change would be to do "dnf upgrade dnf" first and then upgrade the rest ("dnf upgrade"). This way, the remaining packages should stay in the cache.
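The suggested workaround is just two commands in a fixed order. Here is a dry-run sketch that emits the commands rather than running them, so the plan can be reviewed first (the helper name is my own, not part of dnf):

```shell
# Print the two-step upgrade order: dnf itself first, then everything else,
# so the bulk download lands in the new cache layout and is kept.
workaround_plan() {
    printf '%s\n' \
        'dnf upgrade dnf' \
        'dnf upgrade'
}

# to execute for real:  workaround_plan | sh -e
```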

I apologize for this inconvenience. Do you need a fix or can you live with that?

[1] https://github.com/rpm-software-management/dnf/commit/a6a0be26ce80030046146231c1a299019034f38c

Comment 5 Bill Gradwohl 2015-08-21 16:19:03 UTC
I'll try your suggestion. I'll do an absolutely fresh install and then immediately do the "dnf upgrade dnf" to see what happens. Then I'll bounce the box and do a dnf upgrade and see how that turns out.

The download of the 500+ rpm's takes about 4 hours on our Internet soda straw, so I'll report back later today or tonight.

Comment 6 Bill Gradwohl 2015-08-21 18:49:46 UTC
I did a fresh F22 install.
Fixed dnf.conf to add keepcache=true.
Ran dnf upgrade dnf.
Bounced the box to make sure no old libraries were loaded.
Checked and there was a x86_64/22 directory.
Checked and my keepcache=true survived the upgrade.
Ran dnf makecache to try to force it to get rid of the bogus directories, but that did nothing; the x86_64/22 directories were still there.
Ran dnf install emacs as a minor test due to my limited bandwidth.
x86_64/22 directories are still there and the rpm's that came down for emacs are where they should be, so now I'll run the dnf upgrade to bring down the 500+ rpm's that I really want.

Next, I'll see if seeding another box's packages directories with the 500+ rpm's will work as it used to under yum. I'll move them manually before I waste time scripting it in case that doesn't work.

I'll keep you posted.

Comment 7 Radek Holy 2015-08-21 19:20:14 UTC
Well, the directory is removed immediately when the new DNF is installed; it's part of the installation script. But it seems that the old DNF, which performs the upgrade, is able to create the directory again before it exits, though it writes just one JSON file there.

The important thing is that after "dnf upgrade dnf", the /var/cache/dnf/x86_64/22 directory does not contain the downloaded packages.

It's possible that after the emacs install the directory stayed there. It will probably stay there forever (which might be considered a bug). But the important things happen in the new /var/cache/dnf/fedora-* and /var/cache/dnf/updates-* directories. There, you should have seen the new emacs packages.
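A quick way to confirm where cached packages actually land under the new layout is to look for rpm files beneath the per-repo packages/ directories. This helper is my own sketch (the parameterized root exists only so it can be tried against a scratch directory):

```shell
# Count cached rpm's under <root>/<repo>/packages/ (root defaults to the
# new DNF cache location described above).
count_cached_rpms() {
    find "${1:-/var/cache/dnf}" -path '*/packages/*.rpm' 2>/dev/null | wc -l
}

# e.g. after an upgrade with keepcache=true:  count_cached_rpms
```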

Comment 8 Bill Gradwohl 2015-08-21 22:54:19 UTC
Spun up another box and went through the previously specified steps, omitting the makecache since it did nothing. Installing anything tiny like emacs is an important step because it forces the creation of the "packages" directories that hold the cached files. I use emacs as my editor, so that was the obvious thing for me to install, but anything small will do.

I scp'd the rpm's from the first box to the equivalent locations on the second box. 
i.e.
scp /var/cache/dnf/fedora-d174f3c3f2691dd5/packages/* brick10:/var/cache/dnf/fedora-d174f3c3f2691dd5/packages
scp /var/cache/dnf/updates-d28e3be95240972f/packages/* brick10:/var/cache/dnf/updates-d28e3be95240972f/packages

I noticed that the directory names on the first box were identical to the ones on the second box. I was expecting them to be random, but they weren't. 

Ran dnf upgrade on the new box, and it took off using the seeded rpm's. GREAT!

So, for now all I have to do is update our procedures: dnf upgrade dnf, followed by dnf install something.small, then scp the cached rpm's from the master box to a target box. The target box can then do a dnf upgrade and it won't pull down the files already in its cache, which was the whole point of my experiment.
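The seeding part of this procedure can be sketched as a dry-run script that prints one scp per repo packages/ directory (the target hostname, root@ login, and the print-instead-of-execute design are my assumptions; pipe the output to sh to actually run it):

```shell
# Print the scp commands that would seed a target box's DNF cache from this
# master's cache; the cache root is a parameter so the sketch can be tried
# against a scratch directory.
seed_cmds() {
    target=$1
    root=${2:-/var/cache/dnf}
    for d in "$root"/*/packages; do
        [ -d "$d" ] || continue
        echo "scp $d/*.rpm root@$target:$d/"
    done
}

# on the master, after 'dnf upgrade dnf' and one small install:
#   seed_cmds brick10 | sh
# then on the target:  dnf upgrade
```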

Thank you for the help in identifying what was going on. I would never have figured out that one version of dnf was nuking an older version's structure.

I haven't checked the references you provided about what's being contemplated to replace everyone's rpm-moving scripts like my yumrsync, but it occurred to me that a web server proxy is the model to follow.

Any box that issues a dnf command should be routed through a dnf proxy server. That proxy server caches absolutely everything any box brings down, and every other box that requests the same rpm gets it from the proxy cache. Problem solved, and it's a fairly elegant solution.

Thanks again.

Comment 9 Honza Silhan 2015-08-25 14:38:10 UTC
Thanks for sharing your solution, Bill. I consider this case resolved, then; closing this report if you don't mind.

