Bug 991694

Summary: deltarpm rebuild very slow after yum update
Product: [Fedora] Fedora Reporter: Adam Pribyl <covex>
Component: yumAssignee: Packaging Maintenance Team <packaging-team-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 19CC: admiller, covex, ffesti, firas.alkafri, kparal, packaging-team-maint, stanley.king, tim.lauridsen, zpavlas
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: yum-3.4.3-106.fc19 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-08-15 23:29:46 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Adam Pribyl 2013-08-03 18:21:36 UTC
Description of problem:
I am not sure when exactly this happen but one of these updates

Jul 26 09:29:04 Installed: yum-3.4.3-102.fc19.noarch
Jul 26 09:29:04 Erased: yum-presto-0.9.0-1.fc18.noarch
Jul 26 09:55:38 Updated: yum-utils-1.1.31-15.fc19.noarch
Jul 26 09:58:41 Updated: yum-metadata-parser-1.1.4-8.fc19.x86_64
Jul 30 09:06:11 Updated: yum-3.4.3-103.fc19.noarch
Jul 30 09:07:06 Updated: yum-utils-1.1.31-16.fc19.noarch

cause that now delta rebuild for me drops in speed to 91kB/s in average, it is extremely slow and consumes all CPU cycles.  The progress bar is choppy and shows the progress very seldom.

Yum-presto was removed during update from f18 to f19. (Jul 26)

How reproducible:
Always, I treid to make a video however it makes is almost impossible to record anything during delta rebuild.

Steps to Reproduce:
1. just yum update with deltas enabled

Actual results:
91kB/s

Expected results:
1.2MB/s as it was with F18.

Additional info:
Tried to clean yum cache and rebuild rpm db. No change.

Comment 1 Zdeněk Pavlas 2013-08-05 11:43:48 UTC
Hi, you may want to tweak the deltarpm option in yum.conf (man yum.conf).  If you have fast disk and many cores to spare, increasing this helps.  When merging deltarpm code do Yum, we found that usually the I/O was the bottleneck, and running too many applydeltarpm workers was mostly (but not always) slower, so we set the default to 2.  And yes, delta rebuilds are slow.. always have been..

Comment 2 Adam Pribyl 2013-08-05 16:36:02 UTC
This is a Athlon X2 with SSD. The change with latest yum/deltarpm is so significant it makes deltarpm a really unusable. I may provide a direct comparison of F18 and F19 showing what it does now - it is 100 times slower. This is not a usual "works slow", this is a disaster. Please watch this video: http://sandbox.cz/~covex/FILE/yum-delta.ogg
I've cut it after several seconds of no progress. 

Let me repeat: on F18 the rebuild speed is around 1MB/s, on F19 it is 90kB/s!

Comment 3 Zdeněk Pavlas 2013-08-06 14:48:54 UTC
I see, thanks for the video.. Yum spawns just one aplydeltarpm process.  Could you try setting "deltarpm=8" in yum.conf?

Comment 4 Kamil Páral 2013-08-06 17:34:40 UTC
I have the same problem, it seems that very often yum uses only a single core for deltarpms.

I tried to test this with F19 Live image and I couldn't really reproduce it. By default it has:
yum-3.4.3-99.fc19.noarch
rpm-4.11.0.1-2.fc19.x86_64

When I run yum update, I see two applydeltarpm processes running most of the time. That's correct. If I try with deltarpm=8 in yum.conf, I see many (I guess 8) applydeltarpm processes running. That's correct.

If I update to:
yum-3.4.3-104.fc19.noarch
rpm-4.11.1-1.fc19.x86_64
there seemed to be no change, still multiple applydeltarpm processes.

So I'm quite confused what the problem is.

I noticed one weird thing, however. During package download, the number of applydeltarpm processes was close to "max" (e.g. 8) all the time. When some applydeltarpm process finished, a new one was immediately started. But after the packages were downloaded and there was this "Rebuilding RPMs..." progress bar, it seemed to behave differently. There were 8 applydeltarpm processes at the start, then the number decreased gradually, and eventually there was only one (the slowest one). After the last one has completed, again there appeared new 8 processes.

So, it seems that in certain cases it doesn't work completely properly (utilizing all the cores), but it works semi-parallel: start a group of processes in parallel, then wait for all of them to finish, and only then start a new group of processes in parallel.

I'm not totally sure it behaves this way, but it was my observation. It could explain why F19 rebuilding takes slower and sometimes it looks like single-threaded - when the process group contains a lot of small drpms and a single large one, it is not much faster than running it in just a single thread.

Zdenek, what do you think? Is there something specific I can test to provide more info?

Comment 5 Adam Pribyl 2013-08-06 19:20:36 UTC
I did not any thorough observation of number of applydeltarpms but I can confirm, that it seems there are two applydeltarpms at the beginning of Rebuilding RPMs but it very soon drops do a single one and I did not noticed there two any more in 'top'. Update in past days bring an libreoffice update, which is usually big file, compared to the other ones. I tried now 'yum update libreoffice\*' also under F19 live CD in kvm and it looks there appears applydeltarpm <defunct> for a while before process disappears and really the number of applydeltarpms is decreasing with time form 8 to 1.

Comment 6 Zdeněk Pavlas 2013-08-07 13:49:02 UTC
Thanks for thorough testing.. seems there's a bug in the schedulling of remaining applydelatrpm jobs.

Comment 7 Zdeněk Pavlas 2013-08-07 14:29:07 UTC
http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=a02561342b3340c0ae59ce14562544438261d7ca

Fixed by this upstream commit, I believe.  I'll make a release soon, probably next week. (there's likely more to come, and this one is not critical).

Comment 8 Fedora Update System 2013-08-13 15:08:22 UTC
yum-3.4.3-106.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/yum-3.4.3-106.fc19

Comment 9 Kamil Páral 2013-08-14 09:06:12 UTC
The problem is fixed with yum-3.4.3-106.fc19. There are $deltarpm processes all the time.

The speed indicator is still quite weird though. It jumps instantly from 400kB/s to 8MB/s and back. It seems to be just a cosmetic thing, the rebuild process runs at full speed now.

Thanks.

Comment 10 Zdeněk Pavlas 2013-08-14 10:57:16 UTC
Well, we can't do much with that.. The progress bar is updated when each delta rebuild job finishes, and ignores the partial progress of other jobs.  So when A & B are 1MB each and run in parallel, and A runs 60s and B runs 61s, the apparent speed is 1MB/60s, then 1MB/1s.. The progress code does some smoothing, but.. There's no simple way to know the partial progress of running jobs, so we have to live with this.

Comment 11 Kamil Páral 2013-08-14 11:12:04 UTC
(In reply to Zdeněk Pavlas from comment #10)
You could count 1MB/60s and then 2MB/61s :-) It wouldn't show the actual speed, but the average speed for the whole run, but... the actual speed is not shown anyway, as you say. So, I don't see any harm in showing average speed instead.

Comment 12 Zdeněk Pavlas 2013-08-14 11:40:00 UTC
The current code uses temporal moving moving average, since DL speed is often not constant.  But we can change the timescale. This is a hack (we should at least put the old value back when rebuilds are finished), but it should help a lot..  Could you test?

diff --git a/yum/drpm.py b/yum/drpm.py
index 464f17c..54b04a7 100644
--- a/yum/drpm.py
+++ b/yum/drpm.py
@@ -282,6 +282,10 @@ class DeltaInfo:
                 if hasattr(progress, 'text_meter_total_size'):
                     progress.text_meter_total_size(0)
                 self.progress = po.repo.callback
+                # default timescale 5s works fine with 0.3s dl updates.
+                # drpm rebuild jobs do not finish that often, so bump it
+                try: self.progress.re.timescale = 30
+                except AttributeError: pass # acessing private api
                 self.progress.start(filename=None, url=None, # BZ 963023
                                     text='<locally rebuilding deltarpms>', size
                 self.done = 0

Comment 13 Adam Pribyl 2013-08-14 12:20:54 UTC
With then new -106 yum I see both processes, rebuild indicates +400kB/s rebuild speed, much better. The process bar is weired even with this patch. Without the patch it could happen it just sits there with a message 
Finishing delta rebuilds of 12 package(s) (17 M)
without any progress bar at all, then it just jumps to the half of 100% with a 0B/s rebuild speed. Not sure how it was implemented before, but maybe removing the progress bar would be better. Making just endless animation like "\ | / -" could also do the job...

Comment 14 Kamil Páral 2013-08-14 13:03:22 UTC
(In reply to Zdeněk Pavlas from comment #12)
After applying the patch, the speed indicator was more stable and smoother, and generally it felt much better. Of course still not perfect, but in my case it definitely improved the experience.

Comment 15 Fedora Update System 2013-08-15 03:05:40 UTC
Package yum-3.4.3-106.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing yum-3.4.3-106.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-14811/yum-3.4.3-106.fc19
then log in and leave karma (feedback).

Comment 16 Stan King 2013-08-15 03:17:22 UTC
Zdeněk, I gave the yum-3.4.3-106 version a couple tries, by manually installing it before it was pushed.  I agree with Kamil and Adam that it seems much better.

However, it reminded me of another of something else that seems unusual and may be related.  It appears to start running "applydeltarpm" concurrent with the downloads.  Then, when the downloads finish, the progress indicator begins at zero.  Has it thrown away the results of any progress it made applying delta RPM's during the download, or does it just seem that way?

In Fedora 18 and earlier, the progress indicator was generally non-zero after the download completed, giving one the confidence it had overlapped the downloading and the application of deltas.  Now it appears as if it's starting over.

By the way, I'd like to congratulate you and whoever else developed this feature, as it's one of the few routine activities that drives all the CPU's cores like they were meant to be driven.

Comment 17 Zdeněk Pavlas 2013-08-15 07:52:11 UTC
> Then, when the downloads finish, the progress indicator begins at zero.  Has it thrown away the results of any progress it made applying delta RPM's during the download, or does it just seem that way?

Delta rebuilds that have already finished while downloading are not included in the "finishing delta rebuilds" progress bar's total, so it starts from zero (the #packages reported in the "finishing" message is usually smaller than the total number of drpms downloaded).  It's intentional (not sure if actually "right" or "better").. But I'm sure there were no changes in this code since F18.

Comment 18 Fedora Update System 2013-08-15 23:29:46 UTC
yum-3.4.3-106.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 19 Stan King 2013-08-15 23:38:37 UTC
(In reply to Zdeněk Pavlas from comment #17)
> > Then, when the downloads finish, the progress indicator begins at zero.  Has it thrown away the results of any progress it made applying delta RPM's during the download, or does it just seem that way?
> 
> Delta rebuilds that have already finished while downloading are not included
> in the "finishing delta rebuilds" progress bar's total, so it starts from
> zero (the #packages reported in the "finishing" message is usually smaller
> than the total number of drpms downloaded).  It's intentional (not sure if
> actually "right" or "better").. But I'm sure there were no changes in this
> code since F18.

No changes except that now it mixes the downloading of delta RPMs and straight RPMs.  Perhaps that new section is passing a different value along to the unchanged section.