Bug 185590 - debuginfo packaging granularity too large, should go with binary rpms
Status: ASSIGNED
Product: Fedora
Classification: Fedora
Component: rpm
Version: rawhide
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Fedora Packaging Toolset Team
Mike McLean
Keywords: FutureFeature, Reopened
Blocks: 573532
Reported: 2006-03-15 18:40 EST by Frank Ch. Eigler
Modified: 2016-05-21 19:22 EDT (History)
Doc Type: Enhancement
Last Closed: 2006-03-16 07:39:07 EST
Description Frank Ch. Eigler 2006-03-15 18:40:08 EST
Description of problem:

As a matter of policy, "-debuginfo" RPMs are generated out of an rpmbuild tree
belonging to a .src.rpm.  The problem is that if multiple sub-packages were
built (such as "kernel", "kernel-smp", etc.), all of the objects for all
variants are included in the single "kernel-debuginfo".  So, in order to use a
tool like systemtap or oprofile or kdump on a single installed kernel, one still
has to install several times that amount (hundreds of MB for the kernel) of
useless data for the other variants.

It would make more sense to have a single debuginfo per binary rpm rather than
src rpm.  (Sure, the source code may get replicated between different
subpackages, but that's tiny compared to the object files, and could be shared
upon installation.)

Let's please re-examine this policy, or at least justify it better.
Comment 1 Jeff Johnson 2006-03-16 07:39:07 EST
If you want -debuginfo per-package, then disable the automagic generation of -debuginfo
and stripping, and just add symbols to the executables.

Any other scheme that divvies up files into smaller per-binary-pkg -debuginfo packages
is unlikely to be workable. E.g. consider what happens when a library is used for several
binary packages. The library -debuginfo would have to be shared or included for each package.
And if not a library, just a file that is linked into several packages, then there is no obvious
way to choose where the file -debuginfo needs to be put.
Comment 2 Frank Ch. Eigler 2006-03-16 08:45:40 EST
Re libraries: if the library comes from a -devel package, then the debuginfo
should go with the library, not the client application, same as today.

Re an internal file/library that is linked into several distinct packages:
*each* sub-debuginfo could contain that file's .debug stuff, along with its
sources.  (Once installed, of course only one copy need be retained.)  Are there
many sizeable instances of this in Fedora that you know of?
Comment 3 Matt McCutchen 2010-04-13 22:31:09 EDT
The argument given in comment #1 for WONTFIXing this is invalid, as explained in comment #2.  Duplication of source files can even be avoided by putting them in a -source package that is required by all the -debuginfo packages.  Please reopen.
Comment 4 Jeff Johnson 2010-06-16 17:26:16 EDT
FYI: OpenSuSE implemented per-subpackage -debuginfo last July.
I suggest that you think very carefully before incorporating the
change, because the no. of -debuginfo packages will go up
significantly with per-subpackage -debuginfo and it's not at
all clear how the mapping of symbols back to source lines
could or should be done with per-subpackage -debuginfo.

Have fun!
Comment 5 Matt McCutchen 2010-06-16 19:36:28 EDT
(In reply to comment #4)
> FYI: OpenSuSE implemented per-subpackage -debuginfo last July.

Do you have a link to a description of their experience to share?  I didn't find any in a quick search.

> I suggest that you think very carefully before incorporating the
> change, because the no. of -debuginfo packages will go up
> significantly with per-subpackage -debuginfo

True, but the number of debuginfo packages is still bounded by the total number of packages containing binaries.  Here are the package counts for Fedora 13 x86_64 as originally released:

repo id               repo name                                 status
fedora                Fedora 13 - x86_64                        20,840
fedora-debuginfo      Fedora 13 - x86_64 - Debug                 5,142

Here are the counts for "fedora" by architecture:

      1 i386
     14 i586
   4068 i686
   5762 noarch
  10995 x86_64

So we're looking at approximately doubling the number of debuginfo packages, and maybe later including some secondary-architecture debuginfos for bug 573532.  I think the infrastructure will be able to handle it.
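
As a rough check of the "approximately doubling" estimate, the arithmetic can be sketched as below. It assumes, as a simplification not stated above, that noarch packages carry no debuginfo and that every arch-specific package would get its own debuginfo package, so the figures are upper bounds rather than predictions.

```python
# Package counts for the Fedora 13 x86_64 "fedora" repo, copied from the table above.
counts = {"i386": 1, "i586": 14, "i686": 4068, "noarch": 5762, "x86_64": 10995}
current_debuginfo = 5142  # packages in fedora-debuginfo as released

total = sum(counts.values())

# Assumption: only arch-specific packages can contain binaries needing debuginfo.
primary_bound = counts["x86_64"]
multilib_bound = total - counts["noarch"]  # also counts the 32-bit packages

primary_growth = primary_bound / current_debuginfo
multilib_growth = multilib_bound / current_debuginfo

print(total)                      # 20840
print(round(primary_growth, 2))   # 2.14
print(round(multilib_growth, 2))  # 2.93
```

So the worst case is a bit over 2x for x86_64 alone, and just under 3x if the 32-bit packages in the repo were covered as well.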

> its not at
> all clear how the mapping of symbols back to source lines
> could or should be done with per-subpackage -debuginfo.s

The debuginfo file specifies the source file path, which I presume is looked up under /usr/src/debug.  I don't see how per-subpackage debuginfo packages would make any difference.  Did you mean something else?

(In reply to comment #3)
> Duplication of source files can even be avoided by putting them
> in a -source package that is required by all the -debuginfo packages.

It looks like OpenSUSE has been separating the source out, perhaps even before they moved to per-subpackage debuginfos:

http://en.opensuse.org/Packaging/Debuginfo
Comment 6 Jeff Johnson 2010-06-16 19:53:15 EDT
Nope. I don't use OpenSUSE and I believe (based on my own experiences)
that per-subpackage -debuginfo is A Very Bad Idea.

Bounded by the total number of packages containing binaries
is entirely irrelevant.

It's very clear that per-package -debuginfo will
    1) result in duplication of source and symbols (verified Yet Again today
    that there are still significant amounts of duplication in DWARF symbols)
    2) significantly increase the number of packages that need to be managed
    and signed and downloaded and ...
 
There are no clearly stated benefits that I have read today either.

Note that -debuginfo "automation" is a serious amount of work.
Don't believe me, go ask anyone who has had to do any of the work
producing -debuginfo packages.

Clearly you haven't gotten your hands dirty or you would understand
better why per-package -debuginfo is non-trivial.

Go try and see. The whole mechanism to produce -debuginfo is poorly
designed, insanely complicated and too fragile.
Comment 7 Frank Ch. Eigler 2010-06-16 20:11:26 EDT
> It's very clear that per-package -debuginfo will
>     1) result in duplication of source and symbols (verified Yet Again today
>     that there are still significant amounts of duplication in DWARF symbols)

You must have misunderstood this aspect of our conversation.  The duplication
is intra-dwarf file.  There would be no duplication across subrpm debuginfo
files, as these constitute nonoverlapping subsets of the original
srcrpm-debuginfo.
Comment 8 Matt McCutchen 2010-06-16 20:14:09 EDT
(In reply to comment #6)
> There are no clearly stated benefits that I have read today either.

The benefit, as stated in comment #0, is that users can download only the debuginfo subpackages they want.

> It's very clear that per-package -debuginfo will
>     1) result in duplication of source and symbols (verified Yet Again today
>     that there are still significant amounts of duplication in DWARF symbols)

How so?  The exact same debuginfo files would be generated; they would just be spread across more packages.

>     2) significantly increase the number of packages that need to be managed
>     and signed

Yes, but weigh that against the benefit.

>     and downloaded and ...

Not necessarily.  If the user only wanted debuginfo for one subpackage, the number of packages downloaded does not increase.  :)

> Clearly you haven't gotten your hands dirty or you would understand
> better why per-package -debuginfo is non-trivial.

Indeed, I haven't.  So would you be kind enough to explain why?

The generation of the debuginfo files would stay the same.  The difference is that each debuginfo file and its build-id symlink would go into the debuginfo subpackage corresponding to the subpackage containing the original file, instead of all into a single debuginfo package.  It's conceptually simple, but I don't know how easy it is at that point in the build process to query which original files are going into which subpackages.
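
A minimal sketch of that assignment step, assuming the per-subpackage file lists are already known (which, as noted, may be the hard part). The names `debug_path`, `build_id_links`, `split_debuginfo`, the sample packages, and the shortened build IDs are all hypothetical; the `/usr/lib/debug/<path>.debug` and `/usr/lib/debug/.build-id/xx/rest` layouts are the usual Fedora convention, and none of this is rpmbuild's actual code.

```python
from collections import defaultdict

DEBUG_ROOT = "/usr/lib/debug"

def debug_path(binary_path):
    """Path of the detached debug file for an installed binary."""
    return f"{DEBUG_ROOT}{binary_path}.debug"

def build_id_links(build_id):
    """Build-id symlinks: one for the binary, one (.debug) for the debug file."""
    base = f"{DEBUG_ROOT}/.build-id/{build_id[:2]}/{build_id[2:]}"
    return [base, base + ".debug"]

def split_debuginfo(subpackages):
    """Map each subpackage to the contents of its own -debuginfo package.

    `subpackages` maps a subpackage name to {binary_path: build_id}.
    """
    result = defaultdict(list)
    for name, binaries in subpackages.items():
        for path, build_id in sorted(binaries.items()):
            result[f"{name}-debuginfo"].append(debug_path(path))
            result[f"{name}-debuginfo"].extend(build_id_links(build_id))
    return dict(result)

# Hypothetical input: two subpackages built from one source package.
# Real build IDs are much longer; these are shortened for readability.
example = {
    "foo": {"/usr/bin/foo": "aa11bb22"},
    "foo-libs": {"/usr/lib64/libfoo.so.1": "cc33dd44"},
}
layout = split_debuginfo(example)
for package, files in layout.items():
    print(package, files)
```

Each binary's debug file and symlinks land in exactly one debuginfo subpackage, so the subpackages are disjoint subsets of what the single debuginfo package holds today.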
Comment 9 Jeff Johnson 2010-06-16 20:27:21 EDT
I should point out the fundamental flaw with -debuginfo and
debugedit in case anyone _REALLY_ wishes to fix it.

When an ELF binary is built, debugging symbols are written
into various ELF sections (details don't matter).

The process undertaken by debugedit is that the sections are moved
from their original per-executable sections. The sections in the
executable are marked invalid, and the debugging symbols
are written into a detached section that is then packaged up
in a -debuginfo package automagically (again details matter little).

The flaw (in debugedit) is that paths to sources are rewritten
DIRECTLY IN THE ORIGINAL ELF SECTION.

The corollary to rewriting paths in a fixed size object is that
only shorter paths can be substituted.

Go try building on a very short path (like /X/NAME) and watch
the whole process go tits up with the problem being reported
with this error message:
 
    debugedit.c:		   "canonicalization unexpectedly shrank by one character");

or other insanities.

The whole process done by debugedit is essentially
the same as running sed on executables to change
path prefixes (modulo details that _DO_ matter).

And this from "professional" programmers who know the issue
and are quite able to understand why you CANNOT put longer
strings into a fixed size ELF section.

The ELF cabal could/should resize the section that contains file
paths so that the paths do NOT always have to be shorter.

Please note that it's the process and implementation for -debuginfo
package production that hasn't been repaired for years.

Using a buildid, and having detached debugging symbols and
attempting transparent -debuginfo production are savable.

But the flaw in debugedit is just sad and pathetic, particularly since
there's no essential change to a quick and dirty hack >5 years later.
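
To make the size constraint concrete, here is a toy illustration (not debugedit's actual code) of rewriting a path prefix inside a fixed-size, NUL-terminated string slot. The function name `rewrite_in_place` and the sample build paths are made up for the example; the only point is that the slot cannot grow, so a longer replacement has nowhere to go.

```python
def rewrite_in_place(section, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in one NUL-terminated path slot.

    `section` is a bytearray whose length must not change. A shorter
    result is padded with NUL bytes; a longer one cannot fit.
    """
    end = section.index(0)  # offset of the terminating NUL
    path = bytes(section[:end])
    if not path.startswith(old_prefix):
        return False
    new_path = new_prefix + path[len(old_prefix):]
    if len(new_path) > end:
        raise ValueError("replacement path is longer than the original slot")
    section[:end] = new_path + b"\0" * (end - len(new_path))
    return True

# A long build directory leaves room for the /usr/src/debug prefix.
long_build = bytearray(b"/builddir/build/BUILD/foo-1.0/main.c\0")
rewrite_in_place(long_build, b"/builddir/build/BUILD", b"/usr/src/debug")

# A very short build directory does not.
short_build = bytearray(b"/X/foo-1.0/main.c\0")
try:
    rewrite_in_place(short_build, b"/X", b"/usr/src/debug")
except ValueError as err:
    print(err)
```

The first call succeeds because the new prefix is shorter than the old one; the second fails for exactly the reason described above.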
Comment 10 Jeff Johnson 2010-06-16 20:29:53 EDT
re comment #7: I did not misunderstand the question.
Nor did I claim anything other than "duplication" here.
Whether it's intra-dwarf or inter-package (as with multiply
included sources),
    duplication == duplication
Comment 11 Jeff Johnson 2010-06-16 20:37:47 EDT
re comment # 8:

I fully understand the stated goal that
    Lusers can download only the -debuginfo packages they want.

But (as you know since we've argued through 5-6 of these -debuginfo
bug reports), it ain't that simple.

Permitting per-subpackage -debuginfo doesn't solve multilib
file conflicts, nor does it solve the RFE for dependencies
so that -debuginfo is upgraded automagically when a package
is upgraded (there are other solutions possible, just unrelated
to whether symbols are in one or multiple packages).

In fact, "upgrade" for -debuginfo is a rather different operation
than for other software which largely just needs to know "newer".

For -debuginfo it's "newer" in the executable/library package that
has to trigger a -debuginfo package upgrade. And that cannot
simply be instrumented by adding Requires: somewhere and
    Let yum figger it out!
Comment 12 Matt McCutchen 2010-06-16 23:34:20 EDT
Comment #9 has nothing to do with this bug.  Anyone who cares about it should file a separate bug.

(In reply to comment #11)
> re comment # 8:
> 
> I fully understand the stated goal that
>     Lusers can download only the -debuginfo packages they want.
> 
> But (as you know since we've argued through 5-6 of these -debuginfo
> bug reports), it ain't that simple.

You have argued no such thing in the other bugs.

> Permitting per-subpackage -debuginfo doesn't solve multilib
> file conflicts,

It helps because one will no longer get conflicts in the non-multilib portions of the component.  The second half is handling executables in the multilib portion (bug 573532).

> nor does it solve the RFE for dependencies
> so that -debuginfo is upgraded automagically when a package
> is upgraded (there are other solutions possible, just unrelated
> to whether symbols are in one or multiple packages).
> 
> In fact, "upgrade" for -debuginfo is a rather different operation
> than for other software which largely just needs to know "newer".
> 
> For -debuginfo it's "newer" in the executable/library package that
> has to trigger a -debuginfo package upgrade. And that cannot
> simply be instrumented by adding Requires: somewhere and
>     Let yum figger it out!    

Those are all separate issues, and hardly a reason not to fix this one.
Comment 13 Jeff Johnson 2010-06-16 23:38:57 EDT
So open a bug report re broken debuginfo if you care. The issue
is real, and is directly tied to producing ALL -debuginfo pkgs
no matter what you might say or think.

Do I have to itemize every bug report? Get a grip ...

So fix the issue. Nothing in this bug report fixes anything.
Comment 14 Matt McCutchen 2010-06-17 00:00:41 EDT
Here is the OpenSUSE patch:

https://build.opensuse.org/package/view_file?file=debugsubpkg.diff&package=rpm&project=openSUSE%3AFactory

It looks like they disabled the %debug_package macro and added code to rpmbuild itself that, after writing out a binary subpackage, reads back its file list and gathers up the corresponding debuginfo files and build-id symlinks into a debuginfo subpackage.  That's probably the most reasonable approach if one is willing to modify the C code.
Comment 15 Jeff Johnson 2010-06-17 00:03:40 EDT
Hint: apply the patch, build rpm, and confirm. Your "looks like"
is just useless opinion.

And if you're not willing to change C code, well, good luck! fixing anything.
Comment 16 Matt McCutchen 2010-06-17 02:04:17 EDT
(In reply to comment #3)
> Duplication of source files can even be avoided by putting them
> in a -source package that is required by all the -debuginfo packages.

To clarify, I was suggesting putting all the debug source for a component in one package to keep matters simple.  That would mean not getting the download savings for debug source.  But it seems that the source tends to be smaller than the symbols, at least uncompressed.  Here's the rather hacky shell command I used to check the sizes:

$ p=glibc-debuginfo; for c in /usr/lib/debug /usr/src/debug; do \
    du -c $(rpm -qvl $p | sed -nre "s,^-.*($c.*)\$,\\1,p") | tail -n 1; done
62040	total    # symbols
39892	total    # source

Here's the OpenSUSE patch for debug source (pretty straightforward changes to %debug_package and find-debuginfo.sh):

https://build.opensuse.org/package/view_file?file=debugsource-package.diff&package=rpm&project=openSUSE%3AFactory
Comment 17 Jeff Raber 2010-07-28 00:12:36 EDT
This bug was reported against Fedora 5, which reached EOL a LONG TIME AGO.  Please
change the version number of this bug, or close it.  There should not be open
bugs against EOL versions.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Bug Reporter: If you would still like to see this bug fixed and are able to
reproduce it against a later version of Fedora please change the 'version' of
this bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 18 Fedora Admin XMLRPC Client 2012-04-13 19:12:14 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 19 Fedora Admin XMLRPC Client 2012-04-13 19:13:59 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 20 Frank Ch. Eigler 2016-05-21 19:22:10 EDT
Some of Jeff's concerns should be allayed by changes being proposed within [1].

[1] http://fedoraproject.org/wiki/Changes/ParallelInstallableDebuginfo
