Description of problem:
The glibc backtrace () function on x86_64 is both broken and slow. The addresses it returns seem to point to the instruction after each call site, instead of the call site itself. In addition, a call to backtrace () takes almost 4us, instead of the 40ns it takes on i386. This seems to be because x86_64 unnecessarily uses the IA64 "portable" version of backtrace, when the i386 version compiled on x86_64 works fine.

Version-Release number of selected component (if applicable):
Occurs with every glibc shipped in Fedora Core since FC1.
Created attachment 106936
backtrace.c

Compile this program with -Wl,--export-dynamic on i386 and run it. Then compile it with the same options on x86_64 and run it.
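The attachment itself is not reproduced in this report, so the following is only a minimal sketch of the kind of test it describes, not the attached backtrace.c. The names capture_at_depth, print_trace_at_depth and MAX_FRAMES are hypothetical; only backtrace () and backtrace_symbols_fd () from <execinfo.h> are real glibc calls. The volatile local is there so that the recursive call is not turned into a tail call, which would make the intermediate frames disappear.

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64  /* hypothetical upper bound on captured frames */

/* Recurse `depth` levels, then capture a backtrace at the bottom.
   noinline keeps every level as a real stack frame; the volatile
   store after the call keeps the call out of tail position. */
__attribute__((noinline))
int capture_at_depth(int depth, void **frames, int max_frames)
{
    if (depth == 0)
        return backtrace(frames, max_frames);

    volatile int captured = capture_at_depth(depth - 1, frames, max_frames);
    return captured;
}

/* Capture a backtrace `depth` calls deep and print one line per frame
   to stdout. Returns the number of frames captured. */
int print_trace_at_depth(int depth)
{
    void *frames[MAX_FRAMES];
    int captured = capture_at_depth(depth, frames, MAX_FRAMES);

    /* Symbol names only resolve for the executable itself when it is
       linked with -Wl,--export-dynamic. */
    backtrace_symbols_fd(frames, captured, STDOUT_FILENO);
    return captured;
}
```

Calling print_trace_at_depth (4) from main and building with -Wl,--export-dynamic on both architectures should make it possible to compare the frames that each backtrace () implementation reports.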
1) backtrace () is supposed to return the addresses after the actual call sites (that is, the return addresses), so I'm not sure why you call that a bug. Just look at what e.g. gdb does when you ask for a backtrace there. And this is consistent between i386 and x86_64.

2) You are very wrong in thinking that a backtrace that does not use .eh_frame can work reliably on x86-64. It can't. x86-64 code doesn't use a frame pointer in most cases (as per the x86-64 ABI), and the i386 backtrace () relies on the frame pointer being used in every function in the backtrace. Just look at what backtrace prints if you rm -f sysdeps/x86_64/backtrace.c and rebuild glibc. For me, with -g -O2 -fno-optimize-sibling-calls -Wl,-E, your program run against such a glibc always returns just the first frame (depth00) and nothing else. -fno-optimize-sibling-calls is there because otherwise depth01 ... depth32 aren't supposed to show up, as all those calls are tail call optimized. To avoid that without the option you'd need, say, return depth00 () + 1.0; or something similar there.

Yes, using .eh_frame for backtrace is slower than using frame pointers. But do you really prefer a broken but fast backtrace () over a slower but correct one? I don't.

FYI, the i386 backtrace now uses .eh_frame too and only falls back to frame pointer unwinding once it reaches a frame for which no .eh_frame section is provided. The reason is to make backtrace () work with -fomit-frame-pointer i386 code, assuming -fasynchronous-unwind-tables or a similar option is used too. This is likely to be the default in GCC soon.
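To illustrate why frame pointer unwinding depends on every function in the chain keeping a frame pointer, here is a small model of the walk. It is a sketch over a synthetic stack, not glibc's code: struct frame_record and walk_frame_chain are hypothetical names, and the records are built by hand instead of being read from the real stack. A function compiled without a frame pointer is modelled here as a NULL link; on a real stack the slot would hold whatever value the register happened to contain, which is worse.

```c
#include <stddef.h>

/* Model of what a frame-pointer-preserving prologue leaves on the
   stack: the caller's saved frame pointer, then the return address. */
struct frame_record {
    struct frame_record *caller;  /* saved frame pointer of the caller */
    void *return_address;         /* address just after the call insn */
};

/* Follow the saved-frame-pointer links starting at `fp`, storing each
   return address in `out`. Stops at a NULL link or after `max`
   entries. Returns the number of addresses stored. */
size_t walk_frame_chain(const struct frame_record *fp, void **out, size_t max)
{
    size_t n = 0;

    while (fp != NULL && n < max) {
        out[n++] = fp->return_address;
        fp = fp->caller;
    }
    return n;
}
```

With an intact chain of three records the walk yields three return addresses. If the innermost record's link is lost, the walk stops after the first frame, which matches the single depth00 frame reported above. An unwinder driven by .eh_frame does not rely on these links at all.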