+++ This bug was initially created as a clone of Bug #267161 +++

Description of problem:
I have a test program (step-hang) which I will attach. It does a PTRACE_SINGLESTEP to step into a signal handler, then kills the debugged child. Once every 100 or so times it does this, the debugger (not the child) becomes hung in a 100% CPU state and cannot be killed or even SIGSTOPed to make it stop eating the CPU.

Version-Release number of selected component (if applicable):
kernel-2.6.18-48.el5.x86_64

How reproducible:
Reproduced on the main loop run (approx.) # 580000. Currently the testcase runs 1000000 cycles (~5 minutes).

Steps to Reproduce:
1. gcc -o tracer-lockup-on-sighandler-kill tracer-lockup-on-sighandler-kill.c -Wall -ggdb2
2. ./tracer-lockup-on-sighandler-kill
3. echo $?

Actual results:
We have a weiner! The test_signalstep child pid 24289 is apparently hung.
The bug has been reproduced!
Accumulated output from test:
INFO: test_signalstep pid 24290 status: stopped with signum 10
INFO: test_signalstep pid 24290 status: stopped with signum 5
INFO: test_signalstep pid 24290 status: stopped with signum 5
INFO: test_signalstep pid 24290 status: stopped with signum 5
INFO: test_signalstep pid 24290 status: stopped with signum 5
INFO: test_signalstep pid 24290 PC = 0x400bc8
DEF: STEP_INTO_HANDLER=1
INFO: calling kill_kid_dead(24290)
INFO: returned from kill_kid_dead(24290)
1

Expected results:
0
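
For readers without access to the attachment, here is a minimal sketch of one cycle of the sequence described above: the tracer single-steps the child into a signal handler, then kills it. This is not the attached testcase. The function name step_into_handler_and_kill and the helper wait_stop are made up for illustration, and the sketch assumes Linux on an architecture that supports PTRACE_SINGLESTEP (such as x86_64).

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Empty handler: the tracer single-steps the child into this function. */
static void handler(int sig) { (void)sig; }

/* Hypothetical helper: wait for the child to stop and return the stop
 * signal, or -1 if the child did anything other than stop. */
static int wait_stop(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    return WSTOPSIG(status);
}

/* Hypothetical name. Runs one cycle; returns 0 if the child was stepped
 * into its handler, then killed and reaped, and -1 otherwise. */
int step_into_handler_and_kill(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: install the handler, ask to be traced, signal itself. */
        signal(SIGUSR1, handler);
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(1);
        raise(SIGUSR1);
        for (;;)
            pause();
    }

    int ok = 0;

    /* The traced child stops when SIGUSR1 is about to be delivered. */
    if (wait_stop(pid) != SIGUSR1)
        ok = -1;

    /* Deliver SIGUSR1 while single-stepping; the next stop should be a
     * SIGTRAP reported as the child enters the handler. */
    if (ok == 0 &&
        ptrace(PTRACE_SINGLESTEP, pid, NULL, (void *)(long)SIGUSR1) != 0)
        ok = -1;
    if (ok == 0 && wait_stop(pid) != SIGTRAP)
        ok = -1;

    /* Kill the child while it sits in the handler, then reap it. */
    kill(pid, SIGKILL);
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGKILL)
        return -1;
    return ok;
}
```

A reproduction attempt would call this function in a loop from main, much as the description says the real testcase runs many cycles. On an affected kernel the symptom reported above is the tracer spinning at 100% CPU, not an error return, so some external watchdog would be needed to detect the hang.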
Created attachment 208941 [details] Testcase.
I've been able to reproduce this bug on both the -53 and -54 kernels. It reproduces nicely in RHTS.
Roland is/will be working on this issue.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in 2.6.18-62.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Confirmed the test is passing on x86_64 with the -75 kernel. There are also passing results in RHTS on other arches with slightly older kernels.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html