Bug 169485

Summary: bad DWARF info: frame_base wrong
Product: Red Hat Enterprise Linux 4
Reporter: Roland McGrath <roland>
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.0
CC: aoliva, ezannoni, fche, poelstra
Hardware: All   
OS: Linux   
Fixed In Version: RHBA-2006-0125
Doc Type: Bug Fix
Last Closed: 2006-03-07 18:46:04 UTC
Bug Blocks: 168429, 176182    

Description Roland McGrath 2005-09-28 19:20:37 UTC
Description of problem:
See kernel-debuginfo-2.6.9-17.EL.i686.rpm, or later U2 builds.
The sys_time debug info for parameter "tloc" gives this location list:

 [ 44d34]  0000000000..0x00000016 [   0] fbreg 20
           0x00000016..0x00000056 [   0] reg3

and this is the frame_base location list:

[ 44cc0]  0000000000..0x00000001 [   0] breg4 -8
           0x00000001..0x00000002 [   0] breg4 -4
           0x00000002..0x00000003 [   0] breg4 0
           0x00000003..0x00000004 [   0] breg4 4
           0x00000004..0x00000052 [   0] breg4 8
           0x00000052..0x00000055 [   0] breg4 4
           0x00000055..0x00000056 [   0] breg4 0
           0x00000056..0x00000057 [   0] breg4 -4
           0x00000057..0x00000058 [   0] breg4 -8

Here is the code; 0xc0125d90 is the CU base address:

c0125d90 <sys_time>:
c0125d90:       56                      push   %esi
c0125d91:       53                      push   %ebx
c0125d92:       53                      push   %ebx
c0125d93:       53                      push   %ebx
c0125d94:       8b 5c 24 14             mov    0x14(%esp),%ebx
c0125d98:       89 e0                   mov    %esp,%eax
c0125d9a:       e8 01 6c fe ff          call   c010c9a0 <do_gettimeofday>

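It looks like the frame_base calculation is off by two words (8 bytes) all
along. For example, at 0xc0125d94 the code loads tloc from 0x14(%esp), but the
frame_base there (breg4 8) plus fbreg 20 gives 0x1c(%esp).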
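
The discrepancy can be checked by hand for the first few instructions. The
following is a minimal sketch in C, not part of the original report: it
transcribes the frame_base list above into a table and compares "fbreg 20" with
the place where the disassembly says tloc lives. The names fb_range,
tloc_reported and tloc_actual are made up for illustration, and tloc_actual
only models the four one-byte pushes at offsets 0 through 3.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of the frame_base location list quoted above: for pc offsets
   in [lo, hi) the frame base is %esp + esp_off (breg4 is %esp on i386). */
struct fb_range { unsigned lo, hi; int esp_off; };

static const struct fb_range frame_base[] = {
  { 0x00, 0x01, -8 }, { 0x01, 0x02, -4 }, { 0x02, 0x03,  0 },
  { 0x03, 0x04,  4 }, { 0x04, 0x52,  8 }, { 0x52, 0x55,  4 },
  { 0x55, 0x56,  0 }, { 0x56, 0x57, -4 }, { 0x57, 0x58, -8 },
};

/* %esp-relative offset of tloc according to the debug info (fbreg 20). */
int tloc_reported (unsigned pc_off)
{
  for (size_t i = 0; i < sizeof frame_base / sizeof frame_base[0]; i++)
    if (pc_off >= frame_base[i].lo && pc_off < frame_base[i].hi)
      return frame_base[i].esp_off + 20;
  assert (!"pc offset outside the frame_base location list");
  return 0;
}

/* %esp-relative offset of tloc according to the disassembly.  On entry the
   return address is at 0(%esp) and tloc at 4(%esp); each of the four
   one-byte pushes at offsets 0..3 moves %esp down by 4. */
int tloc_actual (unsigned pc_off)
{
  assert (pc_off <= 4);         /* only the prologue is modelled here */
  return 4 + 4 * (int) pc_off;
}
```

For every offset from 0 to 4, tloc_reported exceeds tloc_actual by 8. At offset
4 the actual value is 20, matching the "mov 0x14(%esp),%ebx" instruction, while
the debug info yields 28.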

Version-Release number of selected component (if applicable):
gcc-3.4.4-2

Comment 1 Jakub Jelinek 2005-09-30 09:31:07 UTC
Self-contained testcase:
/* { dg-options "-Os -fno-unit-at-a-time -march=i686 -g -fomit-frame-pointer -fverbose-asm -dA -m32" } */
struct timeval { unsigned long tv_sec; unsigned long tv_usec; };
extern void do_gettimeofday (struct timeval *tv);
struct L { unsigned long buf[100]; };
struct T { int pad[6]; unsigned long addr_limit; };
static inline struct T * current_T (void)
{ struct T *ti; __asm__ ("andl %%esp,%0; ": "=r" (ti) : "0" (~4095)); return ti;
}
long sys_time (int *tloc)
{
  int i;
  struct timeval tv;
  do_gettimeofday (&tv);
  i = tv.tv_sec;
  if (tloc)
      if (({
            long __pu_err = -14;
            __typeof__ (*((tloc))) * __pu_addr = ((tloc));
            if ((__builtin_expect (({unsigned long flag, sum;
                                     asm ("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0"
                                          : "=&r" (flag), "=r" (sum)
                                          : "1" (__pu_addr), "g" ((int) (sizeof (*(tloc)))),
                                            "g" (current_T ()->addr_limit));
                                     flag; }) == 0, 1))) {
              __pu_err = 0;
              __asm__ __volatile__ ("movl %1,%2" : "=r" (__pu_err)
                                    : "ir" (((__typeof__ (*(tloc))) (i))),
                                      "m" ((*(struct L *) (__pu_addr))), "i" (-14), "0" (__pu_err));
            }
            __pu_err;
           }))
        i = -14;
  return i;
}
int main (void)
{
  return 0;
}
struct timeval tt;
void __attribute__ ((noinline)) do_gettimeofday (struct timeval *x)
{
  __builtin_memcpy (x, &tt, sizeof (tt));
}

Note that gcc-3.4.4-2 doesn't have Richard's frame base tracking stuff.

Comment 3 Frank Ch. Eigler 2005-09-30 13:09:40 UTC
The need for a 3.4.x fix comes from wanting to use systemtap on RHEL4.  Much of
the new DWARF work would be excellent to have backported to 3.4.x, but the
effort needs to be estimated.

Another possibility is to add yet more heuristics to systemtap/elfutils to
detect cases like this.  Is there some tell-tale sign that the compiler might
have screwed up the frame-base values, so that we might be able to compensate
for (or even, just detect) bad data?
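
One possible consistency check, sketched here purely as an illustration and not
as anything systemtap or elfutils is known to implement: on i386, a first
argument passed on the stack sits at the CFA (the value of %esp just before the
call), so frame base plus fbreg offset should equal the CFA. The sketch assumes
the CFA offset comes from call frame information that is itself trustworthy and
that the parameter is the first stack-passed one. The function name and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* All three offsets are taken at the same pc:
     frame_base_off  frame base relative to %esp (the breg4 operand)
     fbreg_off       the parameter's DW_OP_fbreg operand
     cfa_off         CFA relative to %esp, assumed to come from CFI
   Returns true when the first stack-passed argument resolves to the CFA. */
bool first_stack_arg_plausible (int frame_base_off, int fbreg_off, int cfa_off)
{
  /* %esp is at most CFA - 4 inside the callee (return address pushed). */
  assert (cfa_off >= 4);
  return frame_base_off + fbreg_off == cfa_off;
}
```

With the values from this report at offset 4 (frame base %esp+8, fbreg 20, CFA
%esp+20 after the return address and four pushes), the check returns false,
which would flag sys_time as suspect.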

Comment 5 Roland McGrath 2005-11-18 02:56:53 UTC
Best would be a modified RHEL4U3 draft gcc rpm with the changes.
With that, I can rebuild the kernel rpms from the original problem reports and
test the real-world problems.

Comment 10 Frank Ch. Eigler 2005-11-22 17:55:38 UTC
I have a 2.6.9-22.24.EL kernel rebuilt with this new compiler.  How can I test
whether this patch is working?  The systemtap script produces results that may
or may not be correct.  I'm looking for a way to get a dump of that frame_base
table from the top of the bug report.

Comment 17 Red Hat Bugzilla 2006-03-07 18:46:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0125.html