Description of problem:
See kernel-debuginfo-2.6.9-17.EL.i686.rpm, or later U2 builds.
sys_time debug info for parameter "tloc" gives location list:

 [ 44d34]  0000000000..0x00000016 [   0] fbreg 20
           0x00000016..0x00000056 [   0] reg3

the frame_base location list:

 [ 44cc0]  0000000000..0x00000001 [   0] breg4 -8
           0x00000001..0x00000002 [   0] breg4 -4
           0x00000002..0x00000003 [   0] breg4 0
           0x00000003..0x00000004 [   0] breg4 4
           0x00000004..0x00000052 [   0] breg4 8
           0x00000052..0x00000055 [   0] breg4 4
           0x00000055..0x00000056 [   0] breg4 0
           0x00000056..0x00000057 [   0] breg4 -4
           0x00000057..0x00000058 [   0] breg4 -8

Here is the code, 0xc0125d90 is the CU base address:

c0125d90 <sys_time>:
c0125d90:  56                 push   %esi
c0125d91:  53                 push   %ebx
c0125d92:  53                 push   %ebx
c0125d93:  53                 push   %ebx
c0125d94:  8b 5c 24 14        mov    0x14(%esp),%ebx
c0125d98:  89 e0              mov    %esp,%eax
c0125d9a:  e8 01 6c fe ff     call   c010c9a0 <do_gettimeofday>

It looks like the frame_base calculation is off by two words all along.

Version-Release number of selected component (if applicable):
gcc-3.4.4-2

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
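The "off by two words" observation can be checked by hand against the tables above. The following is only a sketch of that arithmetic, not anything taken from gcc, elfutils or systemtap: the struct and function names are made up for illustration, the table is copied from the report, and it assumes 4-byte words and that breg4 denotes %esp in the i386 DWARF register numbering.

```c
#include <assert.h>
#include <stddef.h>

/* One location-list entry of the form "breg4 <offset>", i.e. %esp + offset.
   lo/hi are offsets from the CU base address; hi is exclusive.
   (Hypothetical type, for illustration only.) */
struct fb_range {
    unsigned lo;
    unsigned hi;
    int esp_offset;
};

/* The frame_base location list exactly as quoted in the report. */
static const struct fb_range reported_frame_base[] = {
    { 0x00, 0x01, -8 },
    { 0x01, 0x02, -4 },
    { 0x02, 0x03,  0 },
    { 0x03, 0x04,  4 },
    { 0x04, 0x52,  8 },
    { 0x52, 0x55,  4 },
    { 0x55, 0x56,  0 },
    { 0x56, 0x57, -4 },
    { 0x57, 0x58, -8 },
};

/* "tloc" is described as "fbreg 20" for code offsets 0x00..0x16. */
enum { TLOC_FBREG = 20, TLOC_FBREG_END = 0x16 };

/* Look up the reported frame base for a code offset.
   Returns 1 and stores the %esp-relative offset, or 0 if not covered. */
int frame_base_at(unsigned pc_offset, int *esp_offset)
{
    size_t n = sizeof reported_frame_base / sizeof reported_frame_base[0];
    size_t i;

    assert(esp_offset != NULL);
    for (i = 0; i < n; i++) {
        if (pc_offset >= reported_frame_base[i].lo
            && pc_offset < reported_frame_base[i].hi) {
            *esp_offset = reported_frame_base[i].esp_offset;
            return 1;
        }
    }
    return 0;
}

/* Where the debug info claims "tloc" lives, as an offset from %esp.
   Returns 0 outside the range in which "fbreg 20" applies. */
int described_tloc_offset(unsigned pc_offset, int *esp_offset)
{
    int fb;

    if (pc_offset >= TLOC_FBREG_END || !frame_base_at(pc_offset, &fb))
        return 0;
    *esp_offset = fb + TLOC_FBREG;
    return 1;
}

/* Difference, in 4-byte words, between the described slot and the slot
   the code actually uses. Returns 0 if the offset is not covered. */
int tloc_error_words(unsigned pc_offset, int actual_esp_offset)
{
    int described;

    if (!described_tloc_offset(pc_offset, &described))
        return 0;
    return (described - actual_esp_offset) / 4;
}
```

At offset 0x4 (c0125d94) the code loads tloc from 0x14(%esp), while the debug info yields %esp + 8 + 20 = %esp + 28, so tloc_error_words(0x4, 0x14) is 2. At function entry, where the first stack argument sits at 4(%esp) just above the return address, the debug info yields %esp + 12, and tloc_error_words(0x0, 4) is again 2.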
Self-contained testcase:

/* { dg-options "-Os -fno-unit-at-a-time -march=i686 -g -fomit-frame-pointer -fverbose-asm -dA -m32" } */

struct timeval { unsigned long tv_sec; unsigned long tv_usec; };
extern void do_gettimeofday (struct timeval *tv);
struct L { unsigned long buf[100]; };
struct T { int pad[6]; unsigned long addr_limit; };

static inline struct T *
current_T (void)
{
  struct T *ti;
  __asm__ ("andl %%esp,%0; ": "=r" (ti) : "0" (~4095));
  return ti;
}

long
sys_time (int *tloc)
{
  int i;
  struct timeval tv;
  do_gettimeofday (&tv);
  i = tv.tv_sec;
  if (tloc)
    if (({
          long __pu_err = -14;
          __typeof__ (*((tloc))) * __pu_addr = ((tloc));
          if ((__builtin_expect (({
                unsigned long flag, sum;
                asm ("addl %3,%1 ; sbbl %0,%0; cmpl %1,%4; sbbl $0,%0"
                     : "=&r" (flag), "=r" (sum)
                     : "1" (__pu_addr), "g" ((int) (sizeof (*(tloc)))),
                       "g" (current_T ()->addr_limit));
                flag; }) == 0, 1)))
            {
              __pu_err = 0;
              __asm__ __volatile__ ("movl %1,%2"
                                    : "=r" (__pu_err)
                                    : "ir" (((__typeof__ (*(tloc))) (i))),
                                      "m" ((*(struct L *) (__pu_addr))),
                                      "i" (-14), "0" (__pu_err));
            }
          __pu_err; }))
      i = -14;
  return i;
}

int
main (void)
{
  return 0;
}

struct timeval tt;

void __attribute__ ((noinline))
do_gettimeofday (struct timeval *x)
{
  __builtin_memcpy (x, &tt, sizeof (tt));
}

Note that gcc-3.4.4-2 doesn't have Richard's frame base tracking stuff.
The 3.4.x need comes from wanting to use systemtap on RHEL4. Much of the new dwarf stuff would be excellent to have backported to 3.4.x, but the effort needs to be estimated. Another possibility is to add yet more heuristics to systemtap/elfutils to detect cases like this. Is there some tell-tale sign that the compiler might have screwed up the frame-base values, so that we might be able to compensate for (or even, just detect) bad data?
Best would be a modified RHEL4U3 draft gcc rpm with the changes. With that, I can rebuild the kernel rpms from the original problem reports and test the real-world problems.
I have a 2.6.9-22.24.EL kernel rebuilt with this new compiler. How can I test whether this patch is working? The systemtap script produces results that may or may not be correct. I'm looking for a way to get a dump of that frame_base table from the top of the bug report.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2006-0125.html