Bug 1031208

Summary: subprogram DIE has 0 low_pc
Product: [Fedora] Fedora Reporter: Josh Stone <jistone>
Component: gcc Assignee: Jakub Jelinek <jakub>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 19 CC: jakub, law, mjw, mpolacek, patrickm
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-02-17 19:15:42 UTC Type: Bug
Attachments:
Description                                Flags
save-temps for main.cxx and session.cxx    none
diff of just _M_insert_aux asm             none

Description Josh Stone 2013-11-15 21:56:53 UTC
Description of problem:
The stap binary (in systemtap) has debuginfo which reports several subprogram instances as having a 0 low_pc.  Every child DIE of such a subprogram also has a 0 low_pc.  I can also reproduce this on a local systemtap.git compile, so that rules out debuginfo stripping or dwz as the culprit.

Version-Release number of selected component (if applicable):
systemtap-debuginfo-2.4-1.fc19.x86_64
gcc-4.8.2-1.fc19.x86_64

How reproducible:
100%

Steps to Reproduce:
1. eu-readelf -N -winfo /usr/lib/debug/usr/bin/stap.debug \
   | grep -E '^ {13}low_pc.*0{16}' -c

Actual results:
96
The first example is:
 [ bc1a3]    subprogram
             specification        (ref_udata) [ 9618e]
             low_pc               (addr) +000000000000000000
             high_pc              (udata) 873 (+0x0000000000000369)
             frame_base           (exprloc) 
              [   0] call_frame_cfa
             object_pointer       (ref_udata) [ bc1b8]
             GNU_all_call_sites   (flag_present) Yes
             sibling              (ref_udata) [ bcce5]

Expected results:
0

Additional info:
I examined that first [bc1a3] a bit.  This is vector<string>::_M_insert_aux (_ZNSt6vectorISsSaISsEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPSsS1_EERKSs).

That DIE is referenced by several GNU_call_sites -- here's the first:
 [ bcd9e]      GNU_call_site
               low_pc               (addr) +0x000000000002dfbe
               abstract_origin      (ref_udata) [ bc1a3]
               sibling              (ref_udata) [ bcdbb]

Disassembled:
   2dfb9:    e8 92 fb ff ff           callq   0x2db50
   2dfbe:    eb d8                    jmp     0x2df98

There are a few subprograms which cover 2db50, like:
 [ 82247]    subprogram
             specification        (ref_udata) [ 6100f]
             low_pc               (addr) +0x000000000002db50
             high_pc              (udata) 1033 (+0x000000000002df59)
             frame_base           (exprloc) 
              [   0] call_frame_cfa
             object_pointer       (ref_udata) [ 8225c]
             GNU_all_call_sites   (flag_present) Yes
             sibling              (ref_udata) [ 82eac]

They're also vector<string>::_M_insert_aux, as you'd expect, and all in different compile_units than [bc1a3].  However, all these good copies have high_pc=1033, whereas [bc1a3] has high_pc=873.  So I'm guessing the linker was happy to merge the functions from different CUs regardless of size, but gave up on the debuginfo where size mismatched.
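
The address arithmetic in the dumps above can be checked with a small sketch. It assumes the usual x86-64 encoding of callq (opcode e8 followed by a little-endian 32-bit displacement relative to the next instruction) and the DWARF 4 convention that a data-class high_pc is an offset from low_pc. The names call_target and PcRange are made up for this illustration and do not come from elfutils or systemtap.

```cpp
#include <cstdint>

// Target of a 5-byte x86-64 `call rel32` (opcode e8): address of the next
// instruction plus the sign-extended 32-bit displacement.
constexpr std::uint64_t call_target(std::uint64_t insn_addr, std::int32_t rel32)
{
    return insn_addr + 5 + static_cast<std::int64_t>(rel32);
}

// DWARF 4 style range: high_pc is a size relative to low_pc, end is exclusive.
// (Hypothetical helper type, for illustration only.)
struct PcRange {
    std::uint64_t low_pc;
    std::uint64_t size;

    constexpr std::uint64_t end() const { return low_pc + size; }
    constexpr bool contains(std::uint64_t pc) const
    {
        return pc >= low_pc && pc < end();
    }
};

// callq at 0x2dfb9 is "e8 92 fb ff ff": rel32 = 0xfffffb92 = -0x46e.
constexpr std::uint64_t target = call_target(0x2dfb9, -0x46e);

constexpr PcRange good{0x2db50, 1033};  // DIE [82247]
constexpr PcRange bad{0, 873};          // DIE [bc1a3]

static_assert(target == 0x2db50, "the call lands on the kept function");
static_assert(good.end() == 0x2df59, "matches the high_pc shown for [82247]");
static_assert(good.contains(target), "[82247] covers the call target");
static_assert(bad.end() == 0x369, "matches the high_pc shown for [bc1a3]");
static_assert(!bad.contains(target), "[bc1a3] does not cover the call target");
```

So the GNU_call_site names [bc1a3] as its abstract_origin, yet the address it actually calls is only covered by the copies from other compile units such as [82247].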

Comment 1 Josh Stone 2013-11-15 23:23:28 UTC
Created attachment 824745 [details]
save-temps for main.cxx and session.cxx

main.cxx has the larger _M_insert_aux; session.cxx is the one that ultimately gets _M_insert_aux low_pc=0.

AFAICT in the .s, main.cxx chose to inline a couple calls to _ZNSs4_Rep10_M_disposeERKSaIcE, where you can see it directly comparing the string to _ZNSs4_Rep20_S_empty_rep_storageE, but session.cxx didn't inline that.  It's not the only difference, but seems the most significant.

Comment 2 Josh Stone 2013-11-15 23:40:49 UTC
Created attachment 824758 [details]
diff of just _M_insert_aux asm

Here's just _M_insert_aux extracted from the .s files and diffed, to hopefully make it clearer which _M_dispose inlines I'm talking about.

Comment 3 Josh Stone 2013-11-15 23:45:39 UTC
Anyway, _M_insert_aux is just one of almost 100 instances of low_pc=0 subprograms, so maybe it doesn't need so much focus.  The main thing I take from it is that some CUs may optimize the same function differently, and this is apparently breaking the debuginfo for the CU whose version of a function isn't the one linked.

Comment 4 Jakub Jelinek 2013-12-09 17:07:13 UTC
I bet what you are seeing is not what GCC emits, but how the linker handles comdat sections.  If the same comdat symbol (e.g. _ZNSt6vectorISsSaISsEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPSsS1_EERKSs in your testcase) is emitted into multiple files, then what GCC emits will have a non-zero low_pc or range base in every CU separately, but when the linker picks one CU from which it will include the specific comdat sections (and drops the copies from the others on the floor), it expectedly doesn't rewrite the DWARF info.  If the comdat section is the same size (or is it even same size and same content, that would be much better), I think the linker chooses to just redirect the relocations from the discarded comdat sections to the corresponding kept comdat sections, but if they are different sizes, the relocations are resolved to 0 instead.  So, 0 should be a special value for debug info consumers, saying "look for another DIE for this function, this one doesn't really describe what you actually get in the binary/shared library".
Note, before Mark's DW_AT_high_pc changes, both the compiler and linker would emit both low_pc and high_pc as addresses and thus you would get [0, 0] as range, which I guess debug info consumers would not be that upset about.  But with -gdwarf-4 if DW_AT_high_pc is data class rather than addr class, then it will be just the size and thus not cleared by the linker.
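
The behaviour guessed at above can be written down as a toy model. This is only a sketch of the description in this comment, not actual ld.bfd or gold code: it assumes the linker compares section sizes only, and every name in it (ComdatCopy, DieRange, HighPcClass, discarded_die) is hypothetical.

```cpp
#include <cstdint>

// Toy model only; none of these names come from a real linker.
struct ComdatCopy {
    std::uint64_t size;  // size of one CU's copy of the comdat function
};

struct DieRange {
    std::uint64_t low_pc;
    std::uint64_t high_pc;  // an address (addr class) or a size (data class)
};

enum class HighPcClass { Addr, Data };

// What the subprogram DIE of a *discarded* copy ends up with after linking,
// given where the kept copy was placed.
constexpr DieRange discarded_die(std::uint64_t kept_addr, ComdatCopy kept,
                                 ComdatCopy discarded, HighPcClass cls)
{
    // Assumption from the comment: relocations against a discarded section
    // are redirected to the kept section only if the sizes match.
    const bool same_size = kept.size == discarded.size;
    const std::uint64_t low = same_size ? kept_addr : 0;

    if (cls == HighPcClass::Addr) {
        // Pre-DWARF-4 style: high_pc is a relocated address too, so it is
        // redirected or zeroed together with low_pc.
        return DieRange{low, same_size ? kept_addr + discarded.size : 0};
    }
    // DWARF 4 data class: high_pc is a plain constant with no relocation,
    // so the linker leaves the size untouched.
    return DieRange{low, discarded.size};
}
```

With the numbers from this report (kept copy at 0x2db50 with size 1033, discarded copy with size 873), the model yields low_pc=0, high_pc=873 for the data-class case and [0, 0] for the addr-class case.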

I'd say the consumers should treat it the same though, i.e. if DW_AT_high_pc is present and data class, and DW_AT_low_pc is 0 (without relocation for it), then treat it as if the high_pc was 0 too.
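
A minimal sketch of that consumer-side rule follows. It assumes a fully linked executable or shared library in which address 0 is never a valid function start (in an unlinked .o file a low_pc of 0 can be perfectly legitimate), and the type and function names are invented for the illustration rather than taken from any DWARF library.

```cpp
#include <cstdint>

// Hypothetical types for illustration; not from libdw or any other library.
enum class HighPcForm { Address, Offset };

struct SubprogramPcs {
    std::uint64_t low_pc;
    std::uint64_t high_pc;  // absolute address or offset, depending on form
    HighPcForm form;
};

struct AddrRange {
    std::uint64_t begin;
    std::uint64_t end;  // exclusive
    constexpr bool empty() const { return begin >= end; }
};

// Assumes a linked binary where 0 is not a valid function address.
constexpr AddrRange effective_range(SubprogramPcs d)
{
    if (d.form == HighPcForm::Offset) {
        if (d.low_pc == 0) {
            // Offset-form high_pc with a zeroed low_pc: treat high_pc as 0
            // as well, so the DIE describes no code in this binary.
            return AddrRange{0, 0};
        }
        return AddrRange{d.low_pc, d.low_pc + d.high_pc};
    }
    // Address-form high_pc is already an absolute end address.
    return AddrRange{d.low_pc, d.high_pc};
}
```

A consumer that gets an empty range back for a DIE like [bc1a3] would then go looking for another DIE describing the same function instead of trusting this one.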

Short testcases:
am1.C:
template <int N>
int foo () { return N; }

int bar () { return foo <0> (); }
am2.C:
template <int N>
int foo () { return N; }

int main () { return foo <0> (); }

Now compile it with
g++ -g -dA -c am1.C; g++ -g -dA -c am2.C
g++ -g -dA -c -O2 -fno-inline am2.C -o am3.o
g++ -o am1 am1.o am2.o
g++ -o am2 am1.o am3.o
Now, presumably in am1 the comdat section will have the same content, so you have a testcase for the linker behavior where it adjusts relocations from discarded sections to the corresponding offsets in the kept section.
And in am2 the section size will supposedly be different (ditto the content), so you get zeros for DW_AT_low_pc.  Now repeat the same with -gdwarf-3 instead of -g and you can see the behavior of DW_AT_high_pc with DW_FORM_addr, rather than DW_FORM_data[48].

Comment 5 Josh Stone 2013-12-09 19:41:24 UTC
I think it's unfortunate to push this onto DWARF consumers.  It's a feasible heuristic to ignore these pc=0 DIEs, and systemtap does ignore them, but I feel the linker ought to do better.  However, I do understand that this is difficult and unlikely to change any time soon.

FWIW, ld.bfd and ld.gold both act the same with your example.

Comment 6 Fedora End Of Life 2015-01-09 20:36:16 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 7 Fedora End Of Life 2015-02-17 19:15:42 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.