Bug 165514 - incorrect location information for prologue region
incorrect location information for prologue region
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: gcc (Show other bugs)
rawhide
i386 Linux
medium Severity medium
: ---
: ---
Assigned To: Jakub Jelinek
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2005-08-09 20:29 EDT by Roland McGrath
Modified: 2007-11-30 17:11 EST (History)
2 users (show)

See Also:
Fixed In Version: 4.0.1-10
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-08-25 18:30:14 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
test case (231 bytes, text/plain)
2005-08-09 20:29 EDT, Roland McGrath
no flags Details
gcc41-dwarf2-usefbreg.patch (4.33 KB, patch)
2005-08-10 04:29 EDT, Jakub Jelinek
no flags Details | Diff
gcc41-vartrack-fp-prologue.patch (4.97 KB, patch)
2005-08-10 10:15 EDT, Jakub Jelinek
no flags Details | Diff
gcc41-dwarf2out-prefer-1elt-vartracking.patch (1.39 KB, patch)
2005-08-10 11:38 EDT, Jakub Jelinek
no flags Details | Diff
First attempt at location list creation from cfi data (21.44 KB, patch)
2005-08-11 20:11 EDT, Richard Henderson
no flags Details | Diff
Fix for "First attempt at location list creation from cfi data" (2.35 KB, patch)
2005-08-16 12:19 EDT, Jakub Jelinek
no flags Details | Diff
Alternative fix for "First attempt at location list creation from cfi data" (2.03 KB, patch)
2005-08-16 12:21 EDT, Jakub Jelinek
no flags Details | Diff
Second try for cfa-based location list (36.20 KB, patch)
2005-08-18 18:09 EDT, Richard Henderson
no flags Details | Diff

  None (edit)
Description Roland McGrath 2005-08-09 20:29:20 EDT
Description of problem:

The location expressions produced for local variables and parameters are
incorrect before the prologue completes, but there is no indication in the form
of a location list.

Version-Release number of selected component (if applicable):

gcc-4.0.1-6

How reproducible:
100%

Steps to Reproduce:
1. gcc -g -O2 -o foobar foobar.c
2.
3.
  
Actual results:


Expected results:


Additional info:

Trivial test case attached.  This demonstrates the problem for stack locals and
stack-passed arguments.  In the function "foo", we get:

 [    d0]    subprogram
             external
             name                 "foo"
             decl_file            1
             decl_line            3
             prototyped
             type                 [    3b]
             low_pc               0x080483d4
             high_pc              0x080483ee
             frame_base           1 byte block
              [   0] reg5
 [    e7]      formal_parameter
               name                 "x"
               decl_file            1
               decl_line            2
               type                 [    3b]
               location             2 byte block
                [   0] breg5 8
 [    f3]      formal_parameter
               name                 "y"
               decl_file            1
               decl_line            2
               type                 [    3b]
               location             1 byte block
                [   0] reg0

i.e., no location lists, so each expression implicitly covers the range from
low_pc to high_pc.  For the prologue instructions, the expressions are
incorrect.  Note that for stack locations, it would suffice to use fbreg
expressions and have the frame_base give a location list that uses sp instead of
fp during the prologue code.  This may be more compact, and perhaps easier to
generate, than a location list for each variable that computes differently
during the prologue code.

I recall seeing similar bugginess when location lists are generated rather than
single expressions.  I haven't stumbled across the trivial test case for that.
Comment 1 Roland McGrath 2005-08-09 20:29:20 EDT
Created attachment 117591 [details]
test case
Comment 2 Jakub Jelinek 2005-08-10 03:54:27 EDT
I think 2 changes are needed:
1) generate DW_AT_frame_base location list even for -fno-omit-frame-pointer
   code, if the frame pointer is not constant through the entire function.
   With -fomit-frame-pointer, correct DW_AT_frame_base location list is generated
   which covers the whole function
2) for NOTE_INSN_VAR_LOCATION's involving stack pointer or frame pointer,
   if they are live accross prologues or epilogues, transform them into 
   DW_OP_fbreg rather than DW_OP_breg*.  Now, based_loc_descr attempts to do 
   that:
  if (reg == fp_reg && can_use_fbreg)
    loc_result = new_loc_descr (DW_OP_fbreg, offset, 0);
    so the question is just why this is not working.
Comment 3 Jakub Jelinek 2005-08-10 04:29:22 EDT
Created attachment 117598 [details]
gcc41-dwarf2-usefbreg.patch

Seems loc_descriptor_from_tree_1 calls loc_descriptor with can_use_fbreg
set to false unconditionally, which would explain 2).
Comment 4 Jakub Jelinek 2005-08-10 07:17:04 EDT
For 1) I think var-tracking.c needs to be enhanced to track setting of frame
pointer in the prologue if frame_pointer_needed and perhaps reuse the stack
pointer tracking code to find out the ___frame_base_decl locations in the
prologue and eventually in the epilogue as well.
Comment 5 Jakub Jelinek 2005-08-10 10:15:04 EDT
Created attachment 117605 [details]
gcc41-vartrack-fp-prologue.patch

Patch I have been playing with for 1).

With both patches I get following debug info for the foo routine (compiled with
-m32 -O2 -g -dA on x86-64):
 <1><d6>: Abbrev Number: 13 (DW_TAG_subprogram)
     DW_AT_external    : 1
     DW_AT_name        : foo
     DW_AT_decl_file   : 1
     DW_AT_decl_line   : 3
     DW_AT_prototyped  : 1
     DW_AT_type        : <3d>
     DW_AT_low_pc      : 0x70
     DW_AT_high_pc     : 0x8c
     DW_AT_frame_base  : 0xc0	(location list)
 <2><ef>: Abbrev Number: 5 (DW_TAG_formal_parameter)
     DW_AT_name        : x
     DW_AT_decl_file   : 1
     DW_AT_decl_line   : 2
     DW_AT_type        : <3d>
     DW_AT_location    : 2 byte block: 91 8	(DW_OP_fbreg: 8)
 <2><fb>: Abbrev Number: 5 (DW_TAG_formal_parameter)
     DW_AT_name        : y
     DW_AT_decl_file   : 1
     DW_AT_decl_line   : 2
     DW_AT_type        : <3d>
     DW_AT_location    : 1 byte block: 50	(DW_OP_reg0)
 <2><106>: Abbrev Number: 10 (DW_TAG_variable)
     DW_AT_name        : s
     DW_AT_decl_file   : 1
     DW_AT_decl_line   : 4
     DW_AT_type        : <bf>
     DW_AT_location    : 3 byte block: 91 80 7f 	(DW_OP_fbreg: -128)

Contents of the .debug_loc section:
...
    000000c0 00000070 00000071 (DW_OP_breg4: -4)
    000000c0 00000071 00000073 (DW_OP_breg4: 0)
    000000c0 00000073 0000008c (DW_OP_breg5: 0)
    000000c0 <End of list>

DW_OP_breg{4,5} + offset matches what is -fvar-tracking emitting for -m32 -O2
-fomit-frame-pointer code, is that correct? Couldn't e.g. DW_OP_breg{4,5}: 0
be replaced with DW_OP_reg{4,5}?
Now, above DW_OP_reg0 for y sounds definitely wrong, no idea where it came up
from yet, but this comes up with both stock 4.0.1-RH, HEAD and HEAD patched
with
these 2 patches.  That must be something in dwarf2out, since e.g. with
unpatched
HEAD I see only 2 var location notes in foo (both at the very beginning of
prologue):
(note 42 1 43 0 ( x (expr_list:REG_DEP_TRUE (mem/i:SI (plus:SI (reg/f:SI 6 bp)
	    (const_int 8 [0x8])) [3 x+0 S4 A32])
    (const_int 0 [0x0]))) NOTE_INSN_VAR_LOCATION)

(note 43 42 7 0 ( y (expr_list:REG_DEP_TRUE (mem/i:SI (plus:SI (reg/f:SI 6 bp)
	    (const_int 12 [0xc])) [3 y+0 S4 A32])
    (const_int 0 [0x0]))) NOTE_INSN_VAR_LOCATION)

so I'd guess it ought to be DW_OP_fbreg: 12.
Comment 6 Jakub Jelinek 2005-08-10 11:29:52 EDT
The problem with "y" variable above is that (unlike e.g. "x") it gets
assigned a hard register for 2 instructions, so the PARM_DECL's DECL_RTL
becomes (reg/v:SI 0 ax [orig:60 y ] [60]).
add_location_or_const_value_attribute only uses the -fvar-tracking gathered info
if it has more than one entry:
  /* If it truly has multiple locations, the first and last node will
     differ.  */
  if (loc_list && loc_list->first != loc_list->last)
...
but in this case it has only one.
I wonder if we shouldn't after:
  /* Try to get some constant RTL for this decl, and use that as the value of
     the location.  */

  rtl = rtl_for_decl_location (decl);
  if (rtl && (CONSTANT_P (rtl) || GET_CODE (rtl) == CONST_STRING))
    {
      add_const_value_attribute (die, rtl);
      return;
    }

attempt to use loc_list->first->var_loc_note if loc_list != NULL.
Comment 7 Jakub Jelinek 2005-08-10 11:38:08 EDT
Created attachment 117612 [details]
gcc41-dwarf2out-prefer-1elt-vartracking.patch

Turns out such code is there, just loc_descriptor_from_tree is preferred over
it.
If we change the order as done in this patch, we get DW_OP_fbreg + 12, which is
correct.

Anyway, I'll stop here and let Richard look at the problem.
Comment 8 Richard Henderson 2005-08-10 16:07:58 EDT
Patches 1 and 3 look good.  I'll approve those now.

As for patch 2, I agree with the goal, but the implementation will pretty much
only work for i386 and similar cisc machines.  In general we'd need something 
as complicated as the CFA tracking code, and I think it would be a mistake to 
replicate this.

According to the dwarf3 draft, it looks like the Right Way for us to represent
this is to use DW_OP_call_frame_cfa in DW_AT_frame_base.  Naturally this will
require changes elsewhere in the toolchain, but it seems like something that 
we'd have wanted to do anyway.

It'll be mildly tricky to get back to the CFA from the register-eliminated rtl
form that we have, but I think we can push virtual_cfa_rtx through the whole
virtual-register instantiation and register-elimination path and come up with
a steady-state offset from either stack_pointer_rtx or hard_frame_pointer_rtx,
and subtract out the offset when emitting the frame-base offset.
Comment 9 Jakub Jelinek 2005-08-10 16:27:38 EDT
Ok, I'll bootstrap/regtest patches 1 and 3 now, will you work on a patch for 2?

I admit I was testing patch 2 just on i386/x86_64, but have been looking at
-dV dumps also on ia64/ppc/ppc64/s390/s390x and they all have a set fp sp+const
insn.
Comment 10 Roland McGrath 2005-08-10 16:57:16 EDT
DW_OP_call_frame_cfa is an ultra-new draft addition to DWARF3.  I was not aware
of it before today.  Probably there are no DWARF consumers anywhere that handle
it.  There is a lot of work on the consumers that would be required to cope with
this.

It is probably feasible in the gcc 4.1 timeframe since the building blocks will
be needed for other purposes anyway.  Right now, the consumers based on elfutils
are far from ready for that; gdb already has the pieces, so supporting it there
is probably not a whole lot of work (and in some months elfutils will be at that
stage as well).  But right now, it's a DWARF feature that does not even exist in
any published spec.  It probably has to be a -f or -g2 or something to enable
emitting this, given that there are other consumers out there too.

Conversely, everything is already prepared to cope with location lists.
So if that solution is possible in the interim, that would be best for now.
If in the long run we need to abandon that and use DW_OP_call_frame_cfa so as
not to hamper the optimizer or complicate compiler maintenance, then that will
be fine later on.
Comment 11 Richard Henderson 2005-08-10 17:26:10 EDT
(In reply to comment #9)
> ia64/ppc/ppc64/s390/s390x ... they all have a set fp sp+const insn.

You only tested small frames.  With larger frames you'll wind up with
the addition split across 2 to 3 instructions.
Comment 12 Richard Henderson 2005-08-10 17:30:15 EDT
(In reply to comment #10)
> Conversely, everything is already prepared to cope with location lists.
> So if that solution is possible in the interim, that would be best for now.

Ok.  I'll see if I can come up with a way to translate the frame-unwind info
for the CFA into a location list.
Comment 13 Richard Henderson 2005-08-11 20:11:02 EDT
Created attachment 117666 [details]
First attempt at location list creation from cfi data

This patch creates a frame_base location list from the CFI data that we've
already collected.  I've started testing on it, but I won't complete it before
leaving for a long weekend in a dozen hours or so.  I'm posting it here for
folks to experiment with and tell me about problems encountered.
Comment 14 Jakub Jelinek 2005-08-16 12:19:07 EDT
Created attachment 117791 [details]
Fix for "First attempt at location list creation from cfi data"

As noted by Roland, 117666 patch creates .debug_loc duplicates and can result
in other problems.  I wrote 2 alternative patches on top of that patch
that result in the same output for Roland's testcase.
Comment 15 Jakub Jelinek 2005-08-16 12:21:59 EDT
Created attachment 117792 [details]
Alternative fix for "First attempt at location list creation from cfi data"

The patch I attached a few seconds ago remembers labels and emits the loc list
entries late, when were are really sure that's what we want to output.
This alternative patch creates the loc list entries early and keeps updating
their ->end labels.
Comment 16 Jakub Jelinek 2005-08-17 07:19:30 EDT
There is another problem in var-tracking.c I think.
The var-tracking.c change removes frame_base_decl setup and computing, which
is surely ok.  But it also removes
frame_stack_adjust = -prologue_stack_adjust ();
in vt_initialize, but frame_stack_adjust has not been removed and is still used
in vt_stack_adjustments.
I haven't studied the code in detail yet, but I'd say either we need it,
in which case we shouldn't remove prologue_stack_adjust function and
just re-add
if (!frame_pointer_needed)
  frame_stack_adjust = -prologue_stack_adjust ();
to vt_initialize.  Or it is not needed, in which case vt_stack_adjustments
and a bunch of other routines would need to go as well.
Comment 17 Richard Henderson 2005-08-17 12:59:53 EDT
You're right.  There's a lot more code that can be deleted.

One could argue that we're losing functionality here, because the CFI 
description isn't as exact as it ought to be.  But I don't see that
keeping the var-tracking code is really the solution to that.

I do recall that Zdenek had some big CFI rewrite, once upon a time.
I don't recall what if anything was wrong with it, or if it was just
such a big change that no one wanted to review it.  Perhaps I'll dig
that up and have a look.

The thing we have to think about now is what we want to do with these
patches.  Given we're in stage3, we'd have to show that there are bugs
being fixed in order to apply it to mainline.  Or we could just keep
it for gcc-4.?-rhl and apply it when 4.2 opens.
Comment 18 Richard Henderson 2005-08-17 13:01:51 EDT
Oh, I forgot to mention -- I prefer the first of Jakub's two follow-on
fixes.  Plus, I did finally get finished testing vs gdb, and didn't see
any new failures.
Comment 19 Jakub Jelinek 2005-08-17 18:01:05 EDT
From what I can see, var-tracking.c doesn't use the stack tracking code just
for frame_base (which would mean it is unnecessary now with Richard's patch),
but also to adjust memory locations involving stack_pointer_rtx, see
bb_stack_adjust_offset and how it calls adjust_stack_reference.
If we delete the stack tracking code, either they would stay unadjusted
(which would mean incorrect debug info), or we would need to find a way how
to map back frame base location info (converted from gathered CFA) back to
basic block's microoperations var-tracking.c uses.

Once the patch is finalized, I guess we should mail it to gcc-patches in any
case and probably apply to gcc-4_0-rhl-branch and later on rhl-4_1-branch as well
and queue it up for 4.2.
Comment 20 Richard Henderson 2005-08-17 19:53:05 EDT
I can see that it does that, but not why.  It's not like eliminate_regs
knows about complete stack variability, so why would we wind up with
DECL_RTL with variability in it?  This *looks* way over-engineered.

I guess I'll see if I can trigger some ugly cases with values live in
memory and registers over different regions, with -fomit-frame-pointer
and -fno-accumulate-outgoing-args...
Comment 21 Richard Henderson 2005-08-17 22:07:39 EDT
Ug.  Of course the existing var-tracking code never bothers with DECL_RTL,
merely examines the rtl stream and takes whatever's there.  I appears to
create its own CFA-like thing just below the dynamic extent of the stack
pointer, so that all offsets are positive from that point.

And since it's not using the value from DECL_RTL, and the fact that reload
did take the variable stack pointer into account, that leaves us with 
bogus debug data -- regardless of whether the var-tracking stack offset
code is removed or not.  The assumption made by based_loc_descr that we
receive only values relative to the stable-state frame pointer has been
violated.

I'll have to give this some thought.
Comment 22 Richard Henderson 2005-08-18 18:09:55 EDT
Created attachment 117890 [details]
Second try for cfa-based location list

A rearrangement wherein we're sure we're dealing with canonical frame
addresses and offsets.	We delay register elimination until the very
last moment; var-tracking rewrites addresses based on arg_pointer_rtx.

AFAICS so far, this creates valid data for everything except for the
epilogue.  The next task is to truncate location lists referring to
memory at the beginning of the epilogue.
Comment 23 Jakub Jelinek 2005-08-19 12:35:46 EDT
A little bit off-topic, the following testcase on x86_64 -O2 -g -dA shows
a bunch of missing DW_AT_location, where at least in some cases GCC perhaps could
supply them.  Not sure how hard would that be.
/* { dg-do compile } */
/* { dg-options "-O2 -g -dA" } */

struct A
{
  int a;
};
int vari;

/* fn1arg2 here doesn't have DW_AT_location, although it becomes
   alias to fn2arg2.  */
static int
fn1 (struct A *fn1arg1, int fn1arg2, int fn1arg3)
{
  if (fn1arg1->a)
    return fn1arg2;
  switch (fn1arg2)
    {
    case 19:
    case 22:
      if (fn1arg3)
        return 23;
      return 22;
    case 20:
      return 23;
    }
  return fn1arg2;
}

unsigned int
fn2 (struct A *fn2arg1, unsigned int fn2arg2, unsigned long fn2arg3,
     void *fn2arg4)
{
  fn2arg2 = fn1 (fn2arg1, fn2arg2, fn2arg4 == ((void *) 0));
  if (fn2arg2 == (fn2arg3 & 0xffffffff))
    fn2arg2 = 64;
  return fn2arg2;
}

int fn3 (void)
{
  extern void fn4 (int);
  /* varj has no DW_AT_location, although at least for a few insns it
     it could be said to live in vari.  */
  int varj = vari;
  fn4 (varj);
  return 0;
}

int fn5 (void)
{
  extern int fn6 (void);
  int vark = fn6 ();
  /* varl has no DW_AT_location, although it is all the time an
     alias to varl.  */
  int varl = vark;
  fn3 ();
  return varl;
}
Comment 24 Jakub Jelinek 2005-08-25 18:30:14 EDT
The patch is in gcc-4.0.1-10, closing.

Note You need to log in before you can comment on or make changes to this bug.