Bug 1380961

Summary: eu-strip on rust libraries breaks dynamic symbols
Product: [Fedora] Fedora Reporter: Josh Stone <jistone>
Component: elfutilsAssignee: Mark Wielaard <mjw>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: aoliva, fche, jakub, me, mjw, mjw, roland
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: elfutils-0.167-2.fc25 elfutils-0.168-1.fc24 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1382394 (view as bug list) Environment:
Last Closed: 2017-01-11 12:18:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1382394    
Attachments:
Description Flags
original libstd-40393716.so
none
libstd after eu-strip
none
libstd after binutils strip -R .rustc none

Description Josh Stone 2016-10-02 00:33:47 UTC
Description of problem:
The eu-strip call in find-debuginfo.sh is breaking rust libraries.  In addition to normal debug data, it is removing the .rustc metadata section.  That's fine, since Rust has no stable ABI, I don't want folks linking to these.  But now the library's own dynamic symbols no longer resolve, as seen by "ldd -r".

Version-Release number of selected component (if applicable):
elfutils-0.167-1.fc26.x86_64

How reproducible:
100%

Steps to Reproduce:
1. eu-strip libstd-40393716.so
2. ldd -r libstd-40393716.so

Actual results:
Hundreds of "undefined symbol: _Z...."

Expected results:
Clean ldd output.

Additional info:
You can see a broken result in the build of rust-1.12.0-1.fc26:
http://koji.fedoraproject.org/koji/buildinfo?buildID=805906

I will also attach the libstd mentioned above, before and after stripping.

I tried with binutils, also removing the metadata, and it's fine:
  strip -R .rustc libstd-40393716.so

In the rpm, I tried "objcopy -R .rustc" at the end of %install, so the metadata is already gone by the time eu-strip sees the files, but they still get broken.

I set my rpm to use "eu-strip -g" for just debuginfo stripping, which gets me past this for now.  That leaves the full symbol table and .rustc in place, so the libraries are bigger than they need to be, but it works.

Comment 1 Josh Stone 2016-10-02 00:37:33 UTC
Created attachment 1206577 [details]
original libstd-40393716.so

Comment 2 Josh Stone 2016-10-02 00:38:27 UTC
Created attachment 1206578 [details]
libstd after eu-strip

Comment 3 Josh Stone 2016-10-02 00:39:52 UTC
Created attachment 1206579 [details]
libstd after binutils strip -R .rustc

Comment 4 Josh Stone 2016-10-02 00:55:41 UTC
The main difference between binutils and elfutils appears to be regarding .dynsym entry 1550 "rust_metadata_std_51f4f7aef635fcd3" which is basically the entire .rustc section.  Binutils left this entry alone, but eu-strip removed it.

If you're curious, here's the code where rustc writes that section.  Creating it as a global appears to be just an easy way to get LLVM to write arbitrary data into a section of rustc's choosing.
https://github.com/rust-lang/rust/blob/3191fbae9da539442351f883bdabcad0d72efcb6/src/librustc_trans/base.rs#L2248

Comment 5 Mark Wielaard 2016-10-02 10:09:09 UTC
The .rustc section is a non-allocated section, which indicates it isn't needed at runtime, which is why it is moved to the separate debuginfo file.

The rust_metadata_std_51f4f7aef635fcd3 symbol should be in the .symtab section, not the .dynsym table, because it refers to a non-allocated section (29) while being in the .dynsym table, which only contains references to symbols needed at runtime in allocated sections.

Comment 6 Josh Stone 2016-10-02 16:29:20 UTC
(In reply to Mark Wielaard from comment #5)
> The .rustc section is a non-allocated section, which indicates it isn't
> needed at runtime, which is why it is moved to the separate debuginfo file.

That's fine, although minidebuginfo will just put it back into .gnu_debugdata, and .rustc is already compressed so there's no space savings there.  If possible I'd rather strip it altogether, because it's not useful for debuginfo either.

There may come a day though, if Rust gets a stable ABI, where we *will* want to keep this data around.  It's unallocated because it's only needed at compile-time linking, so I think eu-strip is too eager to assume everything unallocated should be stripped.  Then 'eu-strip -g' is too far the other way, since that leaves all the symbol tables too.

We may need a middle ground, to tell eu-strip just to remove sections that are known unnecessary.  That seems to be how binutils' strip works, as it doesn't touch .rustc unless I explicitly use -R.

There's also an open Rust bug to move the metadata into separate files, which may make all of this moot for elfutils.

> The rust_metadata_std_51f4f7aef635fcd3 symbol should be in the .symtab
> section, not the .dynsym table, because it refers to a non-allocated section
> (29) while being in the .dynsym table, which only contains references to
> symbols needed at runtime in allocated sections.

That's logical, and perhaps I'll see if I can get LLVM to emit it into .symtab only.  It's actually fine with me if eu-strip wants to remove it from .dynsym, but it's breaking a lot of other symbols in the process, so its method appears to be unsound.  Perhaps it would be better to just zero-out that entry?

Comment 7 Mark Wielaard 2016-10-02 22:47:58 UTC
(In reply to Josh Stone from comment #6)
> (In reply to Mark Wielaard from comment #5)
> > The .rustc section is a non-allocated section, which indicates it isn't
> > needed at runtime, which is why it is moved to the separate debuginfo file.
> 
> That's fine, although minidebuginfo will just put it back into
> .gnu_debugdata, and .rustc is already compressed so there's no space savings
> there.

It seems wrong if minidebuginfo adds the whole .rustc section. It is only supposed to create a compressed (skeleton) ELF image with a mini-symtab table that contains only function symbols.

>  If possible I'd rather strip it altogether, because it's not useful
> for debuginfo either.

Yes, I am surprised rpmbuild doesn't seem to do that.
 
> There may come a day though, if Rust gets a stable ABI, where we *will* want
> to keep this data around.  It's unallocated because it's only needed at
> compile-time linking, so I think eu-strip is too eager to assume everything
> unallocated should be stripped.  Then 'eu-strip -g' is too far the other
> way, since that leaves all the symbol tables too.

I am not sure that linking against non-allocated symbols in ET_DYN files really makes sense.

> We may need a middle ground, to tell eu-strip just to remove sections that
> are known unnecessary.  That seems to be how binutils' strip works, as it
> doesn't touch .rustc unless I explicitly use -R.

I am not sure binutils strip really follows the ELF spec here. When we strip an ELF executable then the only thing we want to keep is what is necessary at runtime. But we do have some special rules already in eu-strip for things like the .comment section. So maybe we could come up with something like that for other sections if necessary.

> There's also an open Rust bug to move the metadata into separate files,
> which may make all of this moot for elfutils.

What does this metadata contain?
Maybe there is a reasonable way to express it as ELF data.

> It's actually fine with me if eu-strip wants to remove it
> from .dynsym, but it's breaking a lot of other symbols in the process, so
> its method appears to be unsound.  Perhaps it would be better to just
> zero-out that entry?

Aha. I see some of the other symbols aren't like the one you mentioned.
Those are actually symbols for allocated data used in RELA sections.
That does indeed seem wrong to strip away.

I'll take a look at what is going on here.

Comment 8 Mark Wielaard 2016-10-03 15:26:53 UTC
(In reply to Mark Wielaard from comment #7)
> Aha. I see some of the other symbols aren't like the one you mentioned.
> Those are actually symbols for allocated data used in RELA sections.
> That does indeed seem wrong to strip away.
> 
> I'll take a look at what is going on here.

We don't strip away those symbols. But because this is the .dynsym table that we adjust (which we normally don't have to do because there are normally no "strippable" symbols in there that point to some unallocated section) we also have to adjust the gnu hash table. Which we don't do. ldd relies on the .gnu.hash section to resolve the dynsym symbols and so reports those symbols as missing...

Comment 9 Josh Stone 2016-10-03 17:48:10 UTC
(In reply to Mark Wielaard from comment #7)
> It seems wrong if minidebuginfo adds the whole .rustc section. It is only
> supposed to create a compressed (skeleton) ELF image with a mini-symtab
> table that contains only function symbols.

add_minidebug() runs:

  objcopy -S --remove-section .gdb_index --remove-section .comment --keep-symbols="$keep_symbols" "$debuginfo" "$mini_debuginfo" &> /dev/null

But objcopy's idea of strip all (-S) does not include unknown unallocated sections the way eu-strip does.  So eu-strip pulls .rustc out into the debuginfo, and objcopy preserves .rustc when copying back.

I suppose this line could add .rustc as another explicit removal, like .gdb_index and .comment, but I still rather think eu-strip is being overly aggressive in the first place.

> >  If possible I'd rather strip it altogether, because it's not useful
> > for debuginfo either.
> 
> Yes, I am surprised rpmbuild doesn't seem to do that.

My preference would be to remove .rustc as a manual step though.  I don't expect the tools to decide whether this is needed, because those tools have no idea what this section is for.  When in doubt, leave it alone!

But I can't remove the section manually, because later eu-strip still removes the related .dynsym entry, breaking dynamic symbols in the same way.

> > There may come a day though, if Rust gets a stable ABI, where we *will* want
> > to keep this data around.  It's unallocated because it's only needed at
> > compile-time linking, so I think eu-strip is too eager to assume everything
> > unallocated should be stripped.  Then 'eu-strip -g' is too far the other
> > way, since that leaves all the symbol tables too.
> 
> I am not sure that linking against non-allocated symbols in ET_DYN files
> really makes sense.

Nothing is linking into .rustc -- I phrased that poorly.  I should just say that it's only needed at compilation time for code that wants to use this library, but that's actually in codegen well before the linking step.

> > We may need a middle ground, to tell eu-strip just to remove sections that
> > are known unnecessary.  That seems to be how binutils' strip works, as it
> > doesn't touch .rustc unless I explicitly use -R.
> 
> I am not sure binutils strip really follows the ELF spec here. When we strip
> an ELF executable then the only thing we want to keep is what is necessary
> at runtime. But we do have some special rules already in eu-strip for things
> like the .comment section. So maybe we could come up with something like
> that for other sections if necessary.

Does the spec describe strip?  I wholly agree that .rustc isn't needed at runtime, but that doesn't mean it's not needed at all.

> > There's also an open Rust bug to move the metadata into separate files,
> > which may make all of this moot for elfutils.
> 
> What does this metadata contain?
> Maybe there is a reasonable way to express it as ELF data.

I guess it's roughly analogous to a C++ header, especially one with templates.  It contains the AST of exported interfaces, as well as implementations for any inline-able functions and generic code that has yet to be monomorphized.

(In reply to Mark Wielaard from comment #8)
> (In reply to Mark Wielaard from comment #7)
> > Aha. I see some of the other symbols aren't like the one you mentioned.
> > Those are actually symbols for allocated data used in RELA sections.
> > That does indeed seem wrong to strip away.
> > 
> > I'll take a look at what is going on here.
> 
> We don't strip away those symbols. But because this is the .dynsym table
> that we adjust (which we normally don't have to do because there are
> normally no "strippable" symbols in there that point to some unallocated
> section) we also have to adjust the gnu hash table. Which we don't do. ldd
> relies on the .gnu.hash section to resolve the dynsym symbols and so reports
> those symbols as missing...

Right - only the one useless "rust_metadata" symbol is actually stripped, but many other symbols fail as a result.  And note that "ldd relies..." really means "ld.so relies ...", so these libraries are unusable as a result.

Comment 10 Roland McGrath 2016-10-03 22:14:07 UTC
For "full" stripping (default mode), it's right to aggressively remove things.  binutils is sloppy here, and people have grown bad habits because of it.
The basic rule is "if it's not needed at runtime per ELF spec, it gets removed"; that approximately means only SHF_ALLOC sections should be kept.  We have two exceptions to this:

1. SHT_NOTE sections are kept (and also kept in separate debug files).  IIRC we did this originally specifically for the build ID note; it's kept in stripped files anyway because it's SHF_ALLOC, but we need it in the separate debug files even though it's SHF_ALLOC.  But this is also now the catch-all for stuff you wanted kept: put it in a SHT_NOTE section.

2. .gnu.warning.* sections.  These are kept because they are needed (well, useful) in stripped DSOs used at static link time.  If we were inventing .gnu.warning today, I think we'd make it an ELF note instead of a magically-named section.

I think for any case of "it's not part of the runtime image, but I want you not to strip it for some mumblefritz reason" the answer today should be "make it SHT_NOTE or GTFO".

Comment 11 Josh Stone 2016-10-03 22:57:39 UTC
Funny thing is that up to Rust 1.11 it did use a note, but unnecessarily marked allocated, and that may lead to a huge ld.so alloca:
https://sourceware.org/bugzilla/show_bug.cgi?id=20419

Maybe the best choice would a note that's not alloc.  But this metadata is large, and we really don't want it duplicated.  In the case of libstd, the metadata is bigger than either the .text or the .debug_info!  So I would not like it copied to the debug files, and certainly not copied back into the .gnu_debugdata.

This is academic for now and the foreseeable future, because I don't want to expose any unstable Rust ABI to the distro.  For now, I would be happy wiping all signs of .rustc from existence.  We only need that data while compiling rustc itself, which does dynamically link its Rust libraries, but we should nuke it after.

FWIW, here's the Rust ABI issue: https://github.com/rust-lang/rfcs/issues/600
And for a separate metadata file: https://github.com/rust-lang/rust/issues/23366


Regardless of what *Rust* should do, there's a real bug in how eu-strip is removing the .dynsym entry.  Either it should update related sections like .gnu.hash, or find a different way to deal with that dangling entry, or just leave it alone.  Mark and I discussed on IRC that we could leave that entry in place and mark its section index UNDEF.

Comment 12 Roland McGrath 2016-10-03 23:38:25 UTC
I think it's always a linker bug if .dynsym has an entry whose st_shndx is a section without SHF_ALLOC.  That said, eu-strip coping with such buggy input in a "least harm" way seems best.  And the other acceptable alternative would be for it to barf with an error and refuse to produce output from such bogus input.  But given the known existence of this linker bug, resetting its st_shndx to SHN_UNDEF is probably OK.  OTOH, it's entirely possible that this confuses some other tool, because it makes it look like there's a dangling undefined reference in the DSO (to things that look at all of .dynsym instead of just processing relocs like normal loading does).  For example, do all ldd modes (-v, -u, ?) handle it without getting confused?

Comment 13 Josh Stone 2016-10-04 00:46:18 UTC
I manually edited that st_shndx, and I get this output from eu-elflint --gnu-ld:

> section [ 2] '.gnu.hash': symbol 1550 referenced in chain for bucket 479 is undefined

I can't find any ldd invocation that cares, FWIW.

Comment 14 Mark Wielaard 2016-10-06 14:25:58 UTC
(In reply to Josh Stone from comment #13)
> I manually edited that st_shndx, and I get this output from eu-elflint
> --gnu-ld:
> 
> > section [ 2] '.gnu.hash': symbol 1550 referenced in chain for bucket 479 is undefined

I think it is reasonable for eu-elflint to complain about that.

> I can't find any ldd invocation that cares, FWIW.

Thanks for testing that.
I made a patch for eu-strip that does warn and set the symbol section to UNDEF.
https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/message/U2NSIOZ5CM6URT5Z6CR2SG733QJXTQ5N/

If it is convenient for you I can create a rawhide fedora package with that.

Comment 15 Josh Stone 2016-10-06 19:17:35 UTC
The upstream patch seems to work, at least for manual tests.  If you'd like to throw up a rawhide build, even scratch, I'll try a full rust rpmbuild in mock.

Comment 16 Mark Wielaard 2016-10-07 16:06:39 UTC
(In reply to Josh Stone from comment #15)
> The upstream patch seems to work, at least for manual tests.  If you'd like
> to throw up a rawhide build, even scratch, I'll try a full rust rpmbuild in
> mock.

rawhide now has elfutils-0.167-2.fc26 with the proposed fix in it.

Comment 17 Josh Stone 2016-10-10 01:56:09 UTC
Looks good in rust-1.12.0-5.fc26:
http://koji.fedoraproject.org/koji/buildinfo?buildID=807889

Comment 18 Josh Stone 2016-10-10 19:07:17 UTC
Saw this nice gem in the latest rawhide compose report:

Package:      rust-1.12.0-5.fc26
Old package:  rust-1.12.0-4.fc26
Summary:      The Rust Programming Language
RPMs:         rust rust-doc rust-gdb
Size:         122879002 bytes
Size change:  -82866216 bytes

That size reduction is the sum across all 4 archs.  By trimming .rustc, they all went from about 45MB to 25MB for the core rust package!

Comment 19 Fedora Update System 2016-10-13 05:52:34 UTC
elfutils-0.167-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-c47021eebe

Comment 20 Fedora Update System 2016-10-14 04:59:50 UTC
elfutils-0.167-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 21 Fedora Update System 2017-01-01 13:37:20 UTC
elfutils-0.168-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2017-6c9a495a48

Comment 22 Fedora Update System 2017-01-02 01:50:34 UTC
elfutils-0.168-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-6c9a495a48

Comment 23 Fedora Update System 2017-01-11 12:18:58 UTC
elfutils-0.168-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.