Bug 217087

Summary: Incorrect __restore_rt() unwind info
Product: [Fedora] Fedora Reporter: Jan Kratochvil <jan.kratochvil>
Component: glibcAssignee: Jakub Jelinek <jakub>
Status: CLOSED CURRENTRELEASE QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhideCC: fweimer
Target Milestone: ---Keywords: Patch
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-11-29 20:08:23 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Fix of __restore_rt for i386+x86_64 - functype + cfi_signal_frame
none
Test .c file invoking a signal handler
none
Patch update with jakub's advices, it even works now none

Description Jan Kratochvil 2006-11-23 23:08:28 UTC
Description of problem:
While debugging unwinding I found out it was failing fror signals on x86_64
because the signal frame was not identified. The backtraced function
__restore_rt() should be identified as FUNC and it should be 'S'-augmented (so
to be 'signal frame'-augmented).

Found for libunwind: http://sourceware.org/bugzilla/show_bug.cgi?id=3529

vepro.redhat.usu RawHide-x86_64:
354: 000000369c2300f0     0 NOTYPE  LOCAL  DEFAULT   11 __restore_rt

Version-Release number of selected component (if applicable):
glibc-2.5.90-6.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Login into: vepro.redhat.usu RawHide-x86_64: glibc-2.5.90-6.x86_64
2. readelf -a /lib64/libc.so.6
3. readelf -a --debug-dump=frames /lib64/libc.so.6
  
Actual results:
354: 000000369c2300f0     0 NOTYPE  LOCAL  DEFAULT   11 __restore_rt

Expected results:
354: 000000369c2300f0     0 FUNC    LOCAL  DEFAULT   11 __restore_rt
                            ^^^^
  Augmentation:          "zRS"
                            ^
00001950 00000014 0000001c FDE cie=00001938 pc=0002e2b0..0002e2b9

Additional info:
I do not have completed a testcase as even if I fixed glibc, there are still
bugs in my other code.
Although i386 is not affected as the signal frame is in `__kernel_sigreturn'
there I believe `__restore_rt' should get fixed even for i386.
I did not test compatibility with existing gdb(1) as warned in the source.

Comment 1 Jan Kratochvil 2006-11-23 23:08:28 UTC
Created attachment 142024 [details]
Fix of __restore_rt for i386+x86_64 - functype + cfi_signal_frame

Comment 2 Jakub Jelinek 2006-11-24 09:53:10 UTC
The patch is wrong unfortunately, for multiple reasons:
1) .cfi_signal_frame is much newer than other .cfi_* directives, so you can't
use that unconditionally.
2) if you provide unwind info for the sigreturn pad, you need to provide
correct unwind info, how to restore all registers, not just the default, see
e.g. linux/arch/i386/kernel/vsyscall-sigreturn.S.
3) the FDE range for it must cover at least a byte before the pad, because
usually the IP on the stack below the signal frame points to the first insn
in the pad and as that typically is not another signal frame, the unwinder
subtracts one.
On i?86 there isn't much point to worry about this, on recent kernels the pad
is in vDSO which has the unwind info.
On x86-64 it doesn't matter much either, libgcc_s has MD_FALLBACK_FRAME_STATE_FOR
for x86-64 which handles it just fine.

Comment 3 Jan Kratochvil 2006-11-24 18:24:44 UTC
Thanks for your insightful comments.
(1) OK, .cfi_signal_frame should be ./configure-d.
(2) Understood, CFI to be fixed (using .cfi_* instead of ".section .eh_frame").
(3) Thanks this problem should not be resolved in the debugger but by the frame.

I just do not agree with MD_FALLBACK_FRAME_STATE_FOR - it is applicable for GCC
exceptions unwinding but not for external debugger ptrace(2)-unwinding the
frames. Currently gdb's amd64_linux_sigtramp_p() is using
strcmp("__restore_rt",) hack which should not be needed with .cfi_signal_frame
and cooperative glibc.


Comment 4 Jan Kratochvil 2006-11-24 19:23:01 UTC
Created attachment 142090 [details]
Test .c file invoking a signal handler

Just a confirmation __restore_rt() is still in use on:
glibc-2.5.90-6.x86_64 (/lib64/libc.so.6)
kernel-xen-2.6.18-1.2849.fc6.x86_64
(non-xen kernel not tested)

run;cont;bt ->
#2  <signal handler called>
(gdb) info frame
Stack level 2, frame at 0x7fff546552c8:
 rip = 0x3db2a300f0 in __restore_rt; saved rip 0x4007fc
 called by frame at 0x7fff54655500, caller of frame at 0x7fff546552c0

gcc -DHAVE_CONFIG_H -I../include -I../include -D_GNU_SOURCE -DDEBUG  -ggdb3
-fexceptions -Wall -Wsign-compare -g -fasynchronous-unwind-tables
-fno-exceptions -MMD  -o test-ptrace-stepper.o -c test-ptrace-stepper.c
gcc -pthread -o test-ptrace-stepper test-ptrace-stepper.o

Comment 5 Jan Kratochvil 2006-11-25 18:10:12 UTC
Created attachment 142117 [details]
Patch update with jakub's advices, it even works now

Posted to <libc-alpha.com> upstream:
http://sourceware.org/ml/libc-alpha/2006-11/msg00063.html
Copy:
Frysk debugger in development would like to properly unwind signal frames
without any hacks as has been done in the gdb case. Provided patch sets proper
CFI unwind information for `__restore_rt'.

`__restore_rt' CFI is fixed only for x86_64 (tested on Fedora Core
kernel-xen-2.6.18-1.2849.fc6.x86_64) as on i386 (kernel-2.6.18-1.2747.el5.i686)

is in use VDSOed `__kernel_sigreturn' instead (with proper CFI already).
Still i386 should get fixed a similiar way but I do not have an easy testcase.

Currently gdb identifies signal frames using strcmp ("__restore_rt", ...)
(`amd64_linux_sigtramp_p') and gdb has also hardcoded arch-dependent unwind
info (register locations) for the signal frames.

I was told gcc is using `MD_FALLBACK_FRAME_STATE_FOR' (for exceptions
unwinding) but it is not suitable for ptrace(2)ing remote debuggers.

Testcase in glibc is not provided as the whole DWARF/CFI unwinding requires
framework IMO out of the glibc testcases' scope.


Testcase provided in libunwind testsuite forked in the Frysk repository:
	CVSROOT=:pserver:anoncvs.com:/cvs/frysk
	CVS Repository=frysk-imports/libunwind
	testcase: tests/run-ptrace-stepper
	Requires patch: http://sourceware.org/bugzilla/show_bug.cgi?id=3590

Comment 6 Jan Kratochvil 2006-11-29 20:08:23 UTC
It should be upstream. I hope it should get imported to the RH snapshot.

http://sourceware.org/ml/libc-alpha/2006-11/msg00081.html
On Wed, 29 Nov 2006 20:54:50 +0100, Ulrich Drepper wrote:
> I've checked the patch in with one further little patch.