Bug 781731 - [RFE] use threads to speed-up the debuginfo downloading
Summary: [RFE] use threads to speed-up the debuginfo downloading
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: abrt
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: low
Severity: unspecified
Target Milestone: ---
Assignee: abrt
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ABRTF18
Reported: 2012-01-14 17:28 UTC by John Reiser
Modified: 2020-08-04 13:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:



Description John Reiser 2012-01-14 17:28:02 UTC
Description of problem: abrt-gui is slow when downloading and processing debuginfo.  One significant problem is using multiple passes.  First the debuginfo.rpm is downloaded and written to a file.  Then the .rpm file is converted to cpio, and written to a file.  Then the .cpio archive is extracted into files.  This is three passes when a one-pass, three-stage pipeline could be used instead:  download  |  rpm2cpio  | cpio --extract.  The savings in wall-clock time would be significant, probably all overlapped with the download itself.  I just got done waiting 20 minutes for debuginfo processing on bug #781729 (SIGSEGV in gnome-shell), and the wait greatly degrades my motivation to use abrt.
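A minimal sketch of the one-pass pipeline proposed above, written in Python with subprocess. This is not abrt's actual code: the use of curl as the downloader, the function names debuginfo_stages and run_streaming, and the choice of flags are assumptions made for illustration. The only point is that each stage reads the previous stage's output through a pipe, so no intermediate .rpm or .cpio file is written.

```python
import subprocess


def debuginfo_stages(url):
    """Return the argv of each stage (hypothetical helper; curl is an assumed downloader)."""
    return [
        # Stage 1: stream the .rpm to stdout instead of saving it.
        ["curl", "--silent", "--fail", "--location", url],
        # Stage 2: with no file argument, rpm2cpio reads the package from stdin.
        ["rpm2cpio"],
        # Stage 3: unpack the cpio stream into the current directory.
        ["cpio", "--extract", "--make-directories", "--quiet"],
    ]


def run_streaming(stages, cwd=None):
    """Connect the stages with pipes and return the exit status of every stage."""
    procs = []
    upstream = None
    for i, argv in enumerate(stages):
        last = i == len(stages) - 1
        proc = subprocess.Popen(
            argv,
            stdin=upstream,
            stdout=None if last else subprocess.PIPE,
            cwd=cwd,
        )
        if upstream is not None:
            # Drop the parent's copy so a producer sees a broken pipe
            # if its consumer exits early.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]
```

With a real package URL, the call would look like run_streaming(debuginfo_stages(url), cwd=target_dir), where target_dir is a hypothetical directory to unpack into.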


Version-Release number of selected component (if applicable):
abrt-2.0.7-3.fc17.x86_64


How reproducible: every time


Steps to Reproduce:
1. Invoke abrt-gui and use local gdb to process debuginfo for an application crash.
  
Actual results: The missing debuginfo for each package is downloaded, converted to cpio, and extracted to files using separate passes with temporary files between each pass.


Expected results: The missing debuginfo is obtained much more quickly by using a single pass pipeline.


Additional info:

Comment 1 Denys Vlasenko 2012-01-15 15:41:06 UTC
(In reply to comment #0)
> Description of problem: abrt-gui is slow when downloading and processing
> debuginfo.  One significant problem is using multiple passes.  First the
> debuginfo.rpm is downloaded and written to a file.  Then the .rpm file is
> converted to cpio, and written to a file.  Then the .cpio archive is extracted
> into files.  This is three passes when a one-pass, three-stage pipeline could
> be used instead:  download  |  rpm2cpio  | cpio --extract

You are right, we do use intermediate files.
But I contend you are mistaken about its impact: the vast majority of the time (I'd hazard a guess of ~95%) is spent on downloading.

A pipe  download  |  rpm2cpio  | cpio --extract  will be harder to debug if (or rather, *when*) problems occur. For starters, the standard shell (meaning: without bash-specific extensions) does not even provide a means of detecting that an intermediate step in a pipe exited with nonzero status. Yes, we can do that one way or another, but the small improvement isn't worth the significant code complication it would require in this area.
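For comparison, a sketch of one way the exit status of every stage could be collected when the pipe is built outside the shell. It assumes the stages are started from Python with subprocess.Popen; the helper name failed_stages is hypothetical and this is not how abrt is implemented. It only illustrates that per-stage status is available in that setting, not that the change is worth its complexity.

```python
import subprocess


def failed_stages(stages):
    """Run the argv lists as one pipe; return (index, returncode) for each failed stage."""
    procs = []
    upstream = None
    for i, argv in enumerate(stages):
        last = i == len(stages) - 1
        proc = subprocess.Popen(
            argv,
            stdin=upstream,
            stdout=None if last else subprocess.PIPE,
        )
        if upstream is not None:
            upstream.close()  # only the child keeps the read end open
        upstream = proc.stdout
        procs.append(proc)
    # Every stage is waited on, so an intermediate failure is not lost
    # the way it is with a plain "a | b | c" in a POSIX shell.
    codes = [proc.wait() for proc in procs]
    return [(i, code) for i, code in enumerate(codes) if code != 0]
```

An empty result means every stage exited with status 0; otherwise the indexes identify which of download, rpm2cpio, or cpio failed, which is the information needed for debugging.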

> The savings in wall-clock time would be significant, probably all overlapped with the download itself.  I just got done waiting 20 minutes for debuginfo processing on bug #781729 (SIGSEGV in gnome-shell), and the wait greatly degrades my motivation to use abrt.

We know about it, and we try to improve this situation, but it's not going to happen quickly...

Comment 2 John Reiser 2012-01-15 19:58:08 UTC
Sometimes downloading is the bottleneck.  I suggest: sort the list of debuginfo .rpm by download size, then download using two parallel threads: one processing from the small end of the list, the other processing from the large end of the list.  This tends to maintain high throughput [thus earliest finish] despite per-package overhead, and with acceptable server load and traffic congestion.
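The two-ended scheme suggested above can be sketched as follows. This is an illustration only: the function name download_two_ended, the (name, size) tuples, and the fetch callable are assumptions, and fetch stands in for whatever actually downloads one package.

```python
import threading
from collections import deque


def download_two_ended(packages, fetch):
    """packages: list of (name, size_in_bytes); fetch: callable taking a name.

    Returns the names handled by each worker, in the order they were taken.
    """
    pending = deque(sorted(packages, key=lambda pkg: pkg[1]))  # smallest first
    lock = threading.Lock()
    handled = {"small": [], "large": []}

    def worker(end):
        while True:
            with lock:
                if not pending:
                    return
                # One worker takes from the small end, the other from the large end.
                name, _size = pending.popleft() if end == "small" else pending.pop()
            fetch(name)  # the download runs outside the lock, so the two overlap
            handled[end].append(name)

    threads = [threading.Thread(target=worker, args=(end,)) for end in handled]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled
```

Because both workers take from a single shared queue, each package is fetched exactly once and the workers stop when they meet somewhere in the middle of the sorted list.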

