This service will be undergoing maintenance at 20:00 UTC, 2017-04-03. It is expected to last about 30 minutes
Bug 1382394 - find-debuginfo.sh on rust libraries adds extra unallocated data in minisymtab ELF image
find-debuginfo.sh on rust libraries adds extra unallocated data in minisymtab...
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: rpm (Show other bugs)
rawhide
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: packaging-team-maint
Fedora Extras Quality Assurance
:
Depends On: 1380961
Blocks:
  Show dependency treegraph
 
Reported: 2016-10-06 10:34 EDT by Mark Wielaard
Modified: 2016-11-09 00:20 EST (History)
15 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1380961
Environment:
Last Closed: 2016-11-09 00:20:17 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Mark Wielaard 2016-10-06 10:34:11 EDT
add_minidebug() in find-debuginfo.sh is supposed to create a compressed ELF image that only contains a minisymtab table of function symbols stripped from the file.

But it also (accidentially) adds any unallocated sections to the ELF image (that will be moved to the .debug file). This is wrong and creates much larger files than necessary with duplicate data.

+++ This bug was initially created as a clone of Bug #1380961 +++

Description of problem:
The eu-strip call in find-debuginfo.sh is breaking rust libraries.  In addition to normal debug data, it is removing the .rustc metadata section.  That's fine, since Rust has no stable ABI, I don't want folks linking to these.  But now the library's own dynamic symbols no longer resolve, as seen by "ldd -r".

Version-Release number of selected component (if applicable):
elfutils-0.167-1.fc26.x86_64

How reproducible:
100%

Steps to Reproduce:
1. eu-strip libstd-40393716.so
2. ldd -r libstd-40393716.so

Actual results:
Hundreds of "undefined symbol: _Z...."

Expected results:
Clean ldd output.

Additional info:
You can see a broken result in the build of rust-1.12.0-1.fc26:
http://koji.fedoraproject.org/koji/buildinfo?buildID=805906

I will also attach the libstd mentioned above, before and after stripping.

I tried with binutils, also removing the metadata, and it's fine:
  strip -R .rustc libstd-40393716.so

In the rpm, I tried "objcopy -R .rustc" at the end of %install, so the metadata is already gone by the time eu-strip sees the files, but they still get broken.

I set my rpm to use "eu-strip -g" for just debuginfo stripping, which gets me past this for now.  That leaves the full symbol table and .rustc in place, so the libraries are bigger than they need to be, but it works.

--- Additional comment from Josh Stone on 2016-10-01 20:37 EDT ---



--- Additional comment from Josh Stone on 2016-10-01 20:38 EDT ---



--- Additional comment from Josh Stone on 2016-10-01 20:39 EDT ---



--- Additional comment from Josh Stone on 2016-10-01 20:55:41 EDT ---

The main difference between binutils and elfutils appears to be regarding .dynsym entry 1550 "rust_metadata_std_51f4f7aef635fcd3" which is basically the entire .rustc section.  Binutils left this entry alone, but eu-strip removed it.

If you're curious, here's the code where rustc writes that section.  Creating it as a global appears to be just an easy way to get LLVM to write arbitrary data into a section of rustc's choosing.
https://github.com/rust-lang/rust/blob/3191fbae9da539442351f883bdabcad0d72efcb6/src/librustc_trans/base.rs#L2248

--- Additional comment from Mark Wielaard on 2016-10-02 06:09:09 EDT ---

The .rustc section is a non-allocated section, which indicates it isn't needed at runtime, which is why it is moved to the separate debuginfo file.

The rust_metadata_std_51f4f7aef635fcd3 symbol should be in the .symtab section, not the .dynsym table, because it refers to a non-allocated section (29) while being in the .dynsym table, which only contains references to symbols needed at runtime in allocated sections.

--- Additional comment from Josh Stone on 2016-10-02 12:29:20 EDT ---

(In reply to Mark Wielaard from comment #5)
> The .rustc section is a non-allocated section, which indicates it isn't
> needed at runtime, which is why it is moved to the separate debuginfo file.

That's fine, although minidebuginfo will just put it back into .gnu_debugdata, and .rustc is already compressed so there's no space savings there.  If possible I'd rather strip it altogether, because it's not useful for debuginfo either.

There may come a day though, if Rust gets a stable ABI, where we *will* want to keep this data around.  It's unallocated because it's only needed at compile-time linking, so I think eu-strip is too eager to assume everything unallocated should be stripped.  Then 'eu-strip -g' is too far the other way, since that leaves all the symbol tables too.

We may need a middle ground, to tell eu-strip just to remove sections that are known unnecessary.  That seems to be how binutils' strip works, as it doesn't touch .rustc unless I explicitly use -R.

There's also an open Rust bug to move the metadata into separate files, which may make all of this moot for elfutils.

> The rust_metadata_std_51f4f7aef635fcd3 symbol should be in the .symtab
> section, not the .dynsym table, because it refers to a non-allocated section
> (29) while being in the .dynsym table, which only contains references to
> symbols needed at runtime in allocated sections.

That's logical, and perhaps I'll see if I can get LLVM to emit it into .symtab only.  It's actually fine with me if eu-strip wants to remove it from .dynsym, but it's breaking a lot of other symbols in the process, so its method appears to be unsound.  Perhaps it would be better to just zero-out that entry?

--- Additional comment from Mark Wielaard on 2016-10-02 18:47:58 EDT ---

(In reply to Josh Stone from comment #6)
> (In reply to Mark Wielaard from comment #5)
> > The .rustc section is a non-allocated section, which indicates it isn't
> > needed at runtime, which is why it is moved to the separate debuginfo file.
> 
> That's fine, although minidebuginfo will just put it back into
> .gnu_debugdata, and .rustc is already compressed so there's no space savings
> there.

It seems wrong if minidebuginfo adds the whole .rustc section. It is only supposed to create a compressed (skeleton) ELF image with a mini-symtab table that contains only function symbols.

>  If possible I'd rather strip it altogether, because it's not useful
> for debuginfo either.

Yes, I am surprised rpmbuild doesn't seem to do that.
 
> There may come a day though, if Rust gets a stable ABI, where we *will* want
> to keep this data around.  It's unallocated because it's only needed at
> compile-time linking, so I think eu-strip is too eager to assume everything
> unallocated should be stripped.  Then 'eu-strip -g' is too far the other
> way, since that leaves all the symbol tables too.

I am not sure that linking against non-allocated symbols in ET_DYN files really makes sense.

> We may need a middle ground, to tell eu-strip just to remove sections that
> are known unnecessary.  That seems to be how binutils' strip works, as it
> doesn't touch .rustc unless I explicitly use -R.

I am not sure binutils strip really follows the ELF spec here. When we strip an ELF executable then the only thing we want to keep is what is necessary at runtime. But we do have some special rules already in eu-strip for things like the .comment section. So maybe we could come up with something like that for other sections if necessary.

> There's also an open Rust bug to move the metadata into separate files,
> which may make all of this moot for elfutils.

What does this metadata contain?
Maybe there is a reasonable way to express it as ELF data.

> It's actually fine with me if eu-strip wants to remove it
> from .dynsym, but it's breaking a lot of other symbols in the process, so
> its method appears to be unsound.  Perhaps it would be better to just
> zero-out that entry?

Aha. I see some of the other symbols aren't like the one you mentioned.
Those are actually symbols for allocated data used in RELA sections.
That does indeed seem wrong to strip away.

I'll take a look at what is going on here.

--- Additional comment from Mark Wielaard on 2016-10-03 11:26:53 EDT ---

(In reply to Mark Wielaard from comment #7)
> Aha. I see some of the other symbols aren't like the one you mentioned.
> Those are actually symbols for allocated data used in RELA sections.
> That does indeed seem wrong to strip away.
> 
> I'll take a look at what is going on here.

We don't strip away those symbols. But because this is the .dynsym table that we adjust (which we normally don't have to do because there are normally no "strippable" symbols in there that point to some unallocated section) we also have to adjust the gnu hash table. Which we don't do. ldd relies on the .gnu.hash section to resolve the dynsym symbols and so reports those symbols as missing...

--- Additional comment from Josh Stone on 2016-10-03 13:48:10 EDT ---

(In reply to Mark Wielaard from comment #7)
> It seems wrong if minidebuginfo adds the whole .rustc section. It is only
> supposed to create a compressed (skeleton) ELF image with a mini-symtab
> table that contains only function symbols.

add_minidebug() runs:

  objcopy -S --remove-section .gdb_index --remove-section .comment --keep-symbols="$keep_symbols" "$debuginfo" "$mini_debuginfo" &> /dev/null

But objcopy's idea of strip all (-S) does not include unknown unallocated sections the way eu-strip does.  So eu-strip pulls .rustc out into the debuginfo, and objcopy preserves .rustc when copying back.

I suppose this line could add .rustc as another explicit removal, like .gdb_index and .comment, but I still rather think eu-strip is being overly aggressive in the first place.

> >  If possible I'd rather strip it altogether, because it's not useful
> > for debuginfo either.
> 
> Yes, I am surprised rpmbuild doesn't seem to do that.

My preference would be to remove .rustc as a manual step though.  I don't expect the tools to decide whether this is needed, because those tools have no idea what this section is for.  When in doubt, leave it alone!

But I can't remove the section manually, because later eu-strip still removes the related .dynsym entry, breaking dynamic symbols in the same way.

> > There may come a day though, if Rust gets a stable ABI, where we *will* want
> > to keep this data around.  It's unallocated because it's only needed at
> > compile-time linking, so I think eu-strip is too eager to assume everything
> > unallocated should be stripped.  Then 'eu-strip -g' is too far the other
> > way, since that leaves all the symbol tables too.
> 
> I am not sure that linking against non-allocated symbols in ET_DYN files
> really makes sense.

Nothing is linking into .rustc -- I phrased that poorly.  I should just say that it's only needed at compilation time for code that wants to use this library, but that's actually in codegen well before the linking step.

> > We may need a middle ground, to tell eu-strip just to remove sections that
> > are known unnecessary.  That seems to be how binutils' strip works, as it
> > doesn't touch .rustc unless I explicitly use -R.
> 
> I am not sure binutils strip really follows the ELF spec here. When we strip
> an ELF executable then the only thing we want to keep is what is necessary
> at runtime. But we do have some special rules already in eu-strip for things
> like the .comment section. So maybe we could come up with something like
> that for other sections if necessary.

Does the spec describe strip?  I wholly agree that .rustc isn't needed at runtime, but that doesn't mean it's not needed at all.

> > There's also an open Rust bug to move the metadata into separate files,
> > which may make all of this moot for elfutils.
> 
> What does this metadata contain?
> Maybe there is a reasonable way to express it as ELF data.

I guess it's roughly analogous to a C++ header, especially one with templates.  It contains the AST of exported interfaces, as well as implementations for any inline-able functions and generic code that has yet to be monomorphized.

(In reply to Mark Wielaard from comment #8)
> (In reply to Mark Wielaard from comment #7)
> > Aha. I see some of the other symbols aren't like the one you mentioned.
> > Those are actually symbols for allocated data used in RELA sections.
> > That does indeed seem wrong to strip away.
> > 
> > I'll take a look at what is going on here.
> 
> We don't strip away those symbols. But because this is the .dynsym table
> that we adjust (which we normally don't have to do because there are
> normally no "strippable" symbols in there that point to some unallocated
> section) we also have to adjust the gnu hash table. Which we don't do. ldd
> relies on the .gnu.hash section to resolve the dynsym symbols and so reports
> those symbols as missing...

Right - only the one useless "rust_metadata" symbol is actually stripped, but many other symbols fail as a result.  And note that "ldd relies..." really means "ld.so relies ...", so these libraries are unusable as a result.

--- Additional comment from Roland McGrath on 2016-10-03 18:14:07 EDT ---

For "full" stripping (default mode), it's right to aggressively remove things.  binutils is sloppy here, and people have grown bad habits because of it.
The basic rule is "if it's not needed at runtime per ELF spec, it gets removed"; that approximately means only SHF_ALLOC sections should be kept.  We have two exceptions to this:

1. SHT_NOTE sections are kept (and also kept in separate debug files).  IIRC we did this originally specifically for the build ID note; it's kept in stripped files anyway because it's SHF_ALLOC, but we need it in the separate debug files even though it's SHF_ALLOC.  But this is also now the catch-all for stuff you wanted kept: put it in a SHT_NOTE section.

2. .gnu.warning.* sections.  These are kept because they are needed (well, useful) in stripped DSOs used at static link time.  If we were inventing .gnu.warning today, I think we'd make it an ELF note instead of a magically-named section.

I think for any case of "it's not part of the runtime image, but I want you not to strip it for some mumblefritz reason" the answer today should be "make it SHT_NOTE or GTFO".

--- Additional comment from Josh Stone on 2016-10-03 18:57:39 EDT ---

Funny thing is that up to Rust 1.11 it did use a note, but unnecessarily marked allocated, and that may lead to a huge ld.so alloca:
https://sourceware.org/bugzilla/show_bug.cgi?id=20419

Maybe the best choice would a note that's not alloc.  But this metadata is large, and we really don't want it duplicated.  In the case of libstd, the metadata is bigger than either the .text or the .debug_info!  So I would not like it copied to the debug files, and certainly not copied back into the .gnu_debugdata.

This is academic for now and the foreseeable future, because I don't want to expose any unstable Rust ABI to the distro.  For now, I would be happy wiping all signs of .rustc from existence.  We only need that data while compiling rustc itself, which does dynamically link its Rust libraries, but we should nuke it after.

FWIW, here's the Rust ABI issue: https://github.com/rust-lang/rfcs/issues/600
And for a separate metadata file: https://github.com/rust-lang/rust/issues/23366


Regardless of what *Rust* should do, there's a real bug in how eu-strip is removing the .dynsym entry.  Either it should update related sections like .gnu.hash, or find a different way to deal with that dangling entry, or just leave it alone.  Mark and I discussed on IRC that we could leave that entry in place and mark its section index UNDEF.

--- Additional comment from Roland McGrath on 2016-10-03 19:38:25 EDT ---

I think it's always a linker bug if .dynsym has an entry whose st_shndx is a section without SHF_ALLOC.  That said, eu-strip coping with such buggy input in a "least harm" way seems best.  And the other acceptable alternative would be for it to barf with an error and refuse to produce output from such bogus input.  But given the known existence of this linker bug, resetting its st_shndx to SHN_UNDEF is probably OK.  OTOH, it's entirely possible that this confuses some other tool, because it makes it look like there's a dangling undefined reference in the DSO (to things that look at all of .dynsym instead of just processing relocs like normal loading does).  For example, do all ldd modes (-v, -u, ?) handle it without getting confused?

--- Additional comment from Josh Stone on 2016-10-03 20:46:18 EDT ---

I manually edited that st_shndx, and I get this output from eu-elflint --gnu-ld:

> section [ 2] '.gnu.hash': symbol 1550 referenced in chain for bucket 479 is undefined

I can't find any ldd invocation that cares, FWIW.

--- Additional comment from Mark Wielaard on 2016-10-06 10:25:58 EDT ---

(In reply to Josh Stone from comment #13)
> I manually edited that st_shndx, and I get this output from eu-elflint
> --gnu-ld:
> 
> > section [ 2] '.gnu.hash': symbol 1550 referenced in chain for bucket 479 is undefined

I think it is reasonable for eu-elflint to complain about that.

> I can't find any ldd invocation that cares, FWIW.

Thanks for testing that.
I made a patch for eu-strip that does warn and set the symbol section to UNDEF.
https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/message/U2NSIOZ5CM6URT5Z6CR2SG733QJXTQ5N/

If it is convenient for you I can create a rawhide fedora package with that.
Comment 1 Josh Stone 2016-10-06 15:23:11 EDT
FWIW, once #1380961 lands, I intend to just remove .rustc ahead of time, so find-debuginfo.sh will never even see it.  However, if we make sure it won't get duplicated back into .gnu_debugdata, I could be persuaded to let .rustc remain in the debuginfo binaries.  I'm not sure there's any value in keeping it at all though.
Comment 2 Mark Wielaard 2016-10-07 11:02:03 EDT
Patch posted http://lists.rpm.org/pipermail/rpm-maint/2016-October/004578.html
Comment 3 Mark Wielaard 2016-10-12 09:05:03 EDT
Patch accepted upstream http://rpm.org/gitweb?p=rpm.git;a=commit;h=bd7611151b59dfe2f0feb41e478799b5bc26a391
Comment 4 Panu Matilainen 2016-11-09 00:20:17 EST
Fixed in rawhide as of rpm 4.13.0, and going into stable versions too via updates-in-progress.

Note You need to log in before you can comment on or make changes to this bug.