Bug 1336404 - PackageKit downloads are order of magnitude slower than DNF
Summary: PackageKit downloads are order of magnitude slower than DNF
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: PackageKit
Version: 24
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Richard Hughes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 1308538
Reported: 2016-05-16 11:55 UTC by Kamil Páral
Modified: 2016-07-20 23:50 UTC
Clone Of:
Last Closed: 2016-07-20 23:50:11 UTC


Attachments
dnf.log (19.06 KB, text/plain)
2016-05-16 11:57 UTC, Kamil Páral
dnf.librepo.log (134.84 KB, text/plain)
2016-05-16 11:58 UTC, Kamil Páral
packagekitd output (111.01 KB, text/plain)
2016-05-16 11:58 UTC, Kamil Páral
rpm -qa (48.74 KB, text/plain)
2016-05-16 11:59 UTC, Kamil Páral
network utilization (29.19 KB, image/png)
2016-06-06 14:17 UTC, Kamil Páral
cpu utilization (45.43 KB, image/png)
2016-06-06 15:25 UTC, Kamil Páral

Description Kamil Páral 2016-05-16 11:55:03 UTC
Description of problem:
When there are a lot of RPM files to download, PackageKit tends to be an order of magnitude slower than DNF at downloading the packages (this is not a figure of speech).

I tested on a default Fedora 24 installation by installing frozen-bubble (which pulls in 54 dependencies). It takes a reasonable amount of time and it is easy to demonstrate the issue with. Of course, you can test with other packages that bring in a lot of (ideally small) dependencies.

Warmup:
$ sudo dnf repolist
$ pkcon refresh

Test phase:
$ time sudo dnf install -y --downloadonly frozen-bubble
real	0m12.007s
user	0m2.300s
sys	0m0.678s
$ time pkcon install -d frozen-bubble
real	2m42.234s
user	0m0.034s
sys	0m0.020s

Cleanup:
$ sudo dnf clean packages
$ sudo find /var/cache/PackageKit/ -name '*.rpm' -exec rm -v '{}' \;


These results are reproducible in different locations (different networks). At the time of reporting this bug and running these commands, I have a 50 MB/s network connection.
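
For reference, the two `real` times from the test phase can be turned into a slowdown factor. This is only a small sketch; the helper names `to_seconds` and `slowdown` are made up for illustration and are not part of any tool mentioned here.

```shell
# Convert a bash `time` value such as "2m42.234s" into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of the slow run to the fast run, to one decimal place.
slowdown() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", slow / fast }'
}

# pkcon (2m42.234s) versus dnf (0m12.007s) from the test phase
slowdown 2m42.234s 0m12.007s   # prints 13.5
```

So for this particular run PackageKit took roughly 13.5 times as long as DNF.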

This problem makes it very annoying to install packages with many deps using gnome-software, and it makes the graphical upgrade feature take extremely long, because there are thousands of packages to be downloaded. It takes hours in gnome-software, compared to 10-15 minutes in DNF.


Version-Release number of selected component (if applicable):
PackageKit-1.1.0-1.fc24.x86_64
libhif-0.2.2-3.fc24.x86_64
librepo-1.7.18-2.fc24.x86_64
dnf-1.1.8-1.fc24.noarch
hawkey-0.6.2-4.fc24.x86_64
gnome-software-3.20.1-1.fc24.x86_64

How reproducible:
always for me

Steps to Reproduce:
1. test e.g. with frozen-bubble, as shown in description

Comment 1 Kamil Páral 2016-05-16 11:57 UTC
Created attachment 1157864 [details]
dnf.log

This is for the dnf download transaction.

Comment 2 Kamil Páral 2016-05-16 11:58 UTC
Created attachment 1157865 [details]
dnf.librepo.log

This is for the dnf download transaction.

Comment 3 Kamil Páral 2016-05-16 11:58 UTC
Created attachment 1157866 [details]
packagekitd output

This is for the PackageKit download transaction. The output was gathered as:
$ sudo killall packagekitd
$ sudo /usr/libexec/packagekitd --disable-timer --verbose 2>&1 | tee packagekitd.out

Comment 4 Kamil Páral 2016-05-16 11:59 UTC
Created attachment 1157867 [details]
rpm -qa

Comment 5 Kamil Páral 2016-05-16 12:07:32 UTC
This only seems to be a problem with many small packages. For large packages, the download speed seems to be the same. Tested on tremulous-data (100 MB), both PK and DNF yield the same download speed:

$ time sudo dnf install tremulous-data --downloadonly -y
real	0m23.348s
user	0m3.166s
sys	0m2.299s
$ time sudo pkcon install -d tremulous-data
real	0m22.586s
user	0m0.035s
sys	0m0.019s

Comment 6 Adam Williamson 2016-05-20 15:24:25 UTC
doesn't dnf parallelize downloads to some extent? that could be part of this...

Comment 7 Rex Dieter 2016-05-20 15:25:34 UTC
dnf does indeed support parallel downloads

Comment 8 Kamil Páral 2016-05-23 13:15:08 UTC
(In reply to Adam Williamson from comment #6)
> doesn't dnf parallelize downloads to some extent? that could be part of
> this...

There's a larger variance when I force dnf to use a single download thread; however, that's still not even close to how slow PackageKit is:

$ time sudo dnf install -y --downloadonly frozen-bubble --setopt max_parallel_downloads=1

(5 runs)

real	0m8.883s
real	0m19.177s
real	0m10.442s
real	0m28.560s
real	0m10.140s


PackageKit times also have some variance, but they tend to be between 1 min 30 s and 3 min.
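
To summarize the five single-threaded dnf runs above, something like the following could be used. It is a minimal sketch; `summarize` is a hypothetical helper name, and the input is simply the `real` times listed above converted to seconds.

```shell
# Print min, max and mean of the numbers given one per line on stdin.
summarize() {
    awk 'NR == 1 { min = max = $1 + 0 }
         { v = $1 + 0; sum += v; if (v < min) min = v; if (v > max) max = v }
         END { if (NR) printf "min=%.3f max=%.3f mean=%.3f\n", min, max, sum / NR }'
}

# The five `real` times from the runs above, in seconds
printf '%s\n' 8.883 19.177 10.442 28.560 10.140 | summarize
# prints: min=8.883 max=28.560 mean=15.440
```

Even the slowest single-threaded dnf run (about 28.6 s) is well below the 90 s lower end of the PackageKit range.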

Comment 9 Kalev Lember 2016-06-03 10:06:44 UTC
https://github.com/rpm-software-management/libhif/pull/62 has some initial work that should hopefully make this work better.

Comment 10 Kamil Páral 2016-06-06 13:54:44 UTC
Richard, Kalev, I have a suspicion that this could be CPU related, instead of network related. I was watching CPU usage during rpm download in PK today, and I noticed that packagekitd is often consuming huge amounts of CPU constantly for the whole duration of the download.

For something smaller, like `pkcon install -d frozen-bubble`, I see ~10% CPU usage by packagekitd during download.

For something larger, like `pkcon install -d texlive` (250 packages), I see ~40-50% CPU usage during download.

For offline upgrade (1500 packages), I see 100% CPU usage the whole time.

It seems PackageKit is doing something very fishy (recomputing the transaction constantly?) and probably the more packages you download, the more CPU it uses. It's possible that the slow download speeds are caused by this and not by bad network code.
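
A rough way to put numbers on the CPU observation, instead of watching a process monitor, is to sample the CPU time of the daemon over an interval. This is only a sketch and assumes a Linux /proc filesystem; the helper names `cpu_ticks` and `cpu_percent` are made up for illustration and are not part of PackageKit.

```shell
# CPU time (user + system) consumed so far by a process, in clock ticks.
# Everything up to the closing parenthesis of the command name is stripped
# first, so utime and stime (fields 14 and 15 of /proc/PID/stat) become
# fields 12 and 13 of what remains.
cpu_ticks() {
    sed 's/^.*) //' "/proc/$1/stat" | awk '{ print $12 + $13 }'
}

# Average CPU usage of a process over an interval (default 5 seconds),
# in percent of one core.
cpu_percent() {
    pid=$1
    interval=${2:-5}
    before=$(cpu_ticks "$pid")
    sleep "$interval"
    after=$(cpu_ticks "$pid")
    awk -v d="$((after - before))" -v hz="$(getconf CLK_TCK)" -v t="$interval" \
        'BEGIN { printf "%.0f\n", 100 * d / hz / t }'
}
```

While a download is in progress, `cpu_percent "$(pidof packagekitd)" 5` would print the average usage over five seconds; a value near 100 means one core is fully busy.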

Comment 11 Kamil Páral 2016-06-06 14:01:45 UTC
(In reply to Kamil Páral from comment #10)
> For offline upgrade (1500 packages), I see 100% CPU usage the whole time.

I meant "distro upgrade" (F23->F24).

Comment 12 Kamil Páral 2016-06-06 14:17 UTC
Created attachment 1165218 [details]
network utilization

To support my theory, here's a screenshot of network utilization (as seen from the VM host) while PK was downloading the distro upgrade. The downlink jumps up once every ten seconds or so, quickly downloads the package, and then the network sits idle for another ~10 seconds, and again, and again.

Comment 13 Kamil Páral 2016-06-06 15:25 UTC
Created attachment 1165241 [details]
cpu utilization

And this is the CPU graph from the same moment. The CPU is at 100% almost all the time (2 cores in the VM), and slightly dips from time to time, which seems to correspond with the frequency of packages being downloaded.

So short download -> long period of computations -> short download -> long period of computations -> ...

Comment 14 Chris Murphy 2016-06-20 16:34:12 UTC
packagekitd has always used gobs of CPU on my machines for as long as I can remember. A core is always pegged at 100% for the download, so I don't think that can be considered a regression, even if it is suboptimal; and the F23->F24 upgrade download was quite a bit slower: 6 hours for the graphical tool, and just under 3 hours for dnf system-upgrade.

Comment 15 Adam Williamson 2016-06-20 16:53:23 UTC
It's not a regression, but the impact is much more obvious with the graphical system upgrade. Typical GNOME Software use is just to install one or two packages, where this isn't obvious; an offline update does involve a lot of downloads and has always been affected by this, but the user is notified *after* all the packages are downloaded in that case, so usually will not notice that the download was slow (as they had no idea it was happening). In the graphical upgrade case, you're notified *before* the downloads happen, so you get excited for the new release, then have to wait hours for the downloads.

I'm wondering if PK is perhaps going out and re-doing some or all of the metadata downloads for *each* package download? If so that would explain some of the slowness and also be a problem as it'd be a big waste of bandwidth...

Comment 16 Kamil Páral 2016-06-21 10:17:19 UTC
Kalev has a fix [1] which makes downloads blazing fast for me. The fix is waiting for review.

[1] scratch build here: http://koji.fedoraproject.org/koji/taskinfo?taskID=14590410

Comment 18 Fedora Update System 2016-07-12 19:14:25 UTC
PackageKit-1.1.2-1.fc23 appstream-data-23-11.fc23 fwupd-0.7.2-1.fc23 gnome-shell-3.18.5-2.fc23 gnome-software-3.20.4-1.fc23 json-glib-1.2.0-1.fc23 libappstream-glib-0.5.16-2.fc23 libgusb-0.2.9-1.fc23 libhif-0.2.3-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-fad11727bf

Comment 19 Fedora Update System 2016-07-12 19:14:43 UTC
PackageKit-1.1.2-1.fc23 appstream-data-23-11.fc23 fwupd-0.7.2-1.fc23 gnome-shell-3.18.5-2.fc23 gnome-software-3.20.4-1.fc23 json-glib-1.2.0-1.fc23 libappstream-glib-0.5.16-2.fc23 libgusb-0.2.9-1.fc23 libhif-0.2.3-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-fad11727bf

Comment 20 Kamil Páral 2016-07-13 12:30:52 UTC
This is now definitely much better with updates from comment 19. However, PK now seems to emit an insane amount of information into pkmon, about 1000 lines per second (!!). I assume that's why I see 30% cpu usage from packagekitd and 25% cpu usage from gnome-software during the download (when downloading 3 MB/s).

Kalev, do you want me to report a separate bug about this?

Comment 21 Kamil Páral 2016-07-13 13:51:41 UTC
One more related issue: memory consumption of packagekitd and gnome-software has risen tremendously. After downloading all system updates, packagekitd now consumes 630 MB RAM and gnome-software 830 MB RAM (or vice versa, I don't remember exactly; looking at RES usage in htop). When I compare it to the build previously in updates-testing, packagekitd consumes 210 MB RAM and gnome-software 190 MB RAM and it doesn't seem to increase during the download (I didn't have the patience to wait for the full download, since it's very slow).

The new bump in memory requirements means that my 2GB RAM VM was not able to handle the upgrade correctly. I was able to download all packages, but I wasn't able to do anything afterwards - install a package in gnome-software, use dnf, run firefox - all failed with "failed to allocate memory".

This is again probably related to the millions of lines of output/events packagekit now produces (as seen in pkmon).
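
The resident memory figures can also be read without htop. This is a small sketch that assumes a Linux /proc filesystem; `rss_mib` is a hypothetical helper name, and the VmRSS value it reads should correspond to the RES column mentioned above.

```shell
# Resident set size of a process in MiB, read from the VmRSS line of
# /proc/PID/status (the kernel reports the value in kB).
rss_mib() {
    awk '/^VmRSS:/ { printf "%.0f\n", $2 / 1024 }' "/proc/$1/status"
}
```

For example, `rss_mib "$(pidof -s packagekitd)"` and `rss_mib "$(pidof -s gnome-software)"` could be run before and after the download to see whether the usage grows.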

Comment 22 Fedora Update System 2016-07-14 01:25:08 UTC
PackageKit-1.1.2-1.fc23, appstream-data-23-11.fc23, fwupd-0.7.2-1.fc23, gnome-shell-3.18.5-2.fc23, gnome-software-3.20.4-1.fc23, json-glib-1.2.0-1.fc23, libappstream-glib-0.5.16-2.fc23, libgusb-0.2.9-1.fc23, libhif-0.2.3-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-fad11727bf

Comment 23 Kalev Lember 2016-07-14 11:16:10 UTC
(In reply to Kamil Páral from comment #20)
> This is now definitely much better with updates from comment 19. However, PK
> now seems to emit an insane amount of information into pkmon, about 1000
> lines per second (!!). I assume that's why I see 30% cpu usage from
> packagekitd and 25% cpu usage from gnome-software during the download (when
> downloading 3 MB/s).

Should be fixed with https://github.com/hughsie/PackageKit/commit/27061c90b8e98e93b587c7c9dc29e4f82f590260 which should fix the memory consumption as well I think

Comment 24 Fedora Update System 2016-07-14 13:27:59 UTC
PackageKit-1.1.3-1.fc23 appstream-data-23-11.fc23 fwupd-0.7.2-2.fc23 gnome-shell-3.18.5-2.fc23 gnome-software-3.20.4-1.fc23 json-glib-1.2.0-1.fc23 libappstream-glib-0.5.16-2.fc23 libgusb-0.2.9-1.fc23 libhif-0.2.3-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-fad11727bf

Comment 25 Kamil Páral 2016-07-14 13:51:58 UTC
With the updates from comment 24, the amount of output in pkmon is back to normal, the cpu usage is normal (10% packagekitd, 0% gnome-software), and the memory consumption doesn't increase at all over idle state (around 150-170 MB RAM for each of the processes). And the downloads are blazing fast, of course. Great job.

Comment 26 Fedora Update System 2016-07-15 17:52:37 UTC
PackageKit-1.1.3-1.fc23, appstream-data-23-11.fc23, fwupd-0.7.2-2.fc23, gnome-shell-3.18.5-2.fc23, gnome-software-3.20.4-1.fc23, json-glib-1.2.0-1.fc23, libappstream-glib-0.5.16-2.fc23, libgusb-0.2.9-1.fc23, libhif-0.2.3-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-fad11727bf

Comment 27 Fedora Update System 2016-07-20 23:49:37 UTC
PackageKit-1.1.3-1.fc23, appstream-data-23-11.fc23, fwupd-0.7.2-2.fc23, gnome-shell-3.18.5-2.fc23, gnome-software-3.20.4-1.fc23, json-glib-1.2.0-1.fc23, libappstream-glib-0.5.16-2.fc23, libgusb-0.2.9-1.fc23, libhif-0.2.3-1.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

