+++ This bug was initially created as a clone of Bug #454404 +++

Description of problem:
While trying to detach a multithreaded program to remain T (stopped) some tasks are left unstopped. Singlethreaded program is left T (stopped) reliably.

Version-Release number of selected component (if applicable):
FAIL RHEL-5 kernel-2.6.18-92.1.6.el5.x86_64
FAIL RHEL-5 kernel-2.6.18-92.1.6.el5.i686
FAIL RHEL-5 kernel-2.6.18-53.el5.s390x
PASS RHEL-4 kernel-smp-2.6.9-67.0.20.EL.x86_64

How reproducible:
With #define THREADS 3 or more in fact always.

Steps to Reproduce:
wget -O detach-stopped.c http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/detach-stopped.c?cvsroot=systemtap;
gcc -o detach-stopped detach-stopped.c -Wall -ggdb2 -pthread -D_GNU_SOURCE;
./detach-stopped; echo $?

Actual results:
1

Expected results:
0

Additional info:
There is a DEBUG for the tasks state dump.
It is a regression against RHEL-4.
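For readers who want to see the "some tasks are left unstopped" symptom without the DEBUG dump, here is a minimal sketch of a per-thread state check. It is not taken from detach-stopped.c; the helper names task_state() and count_unstopped_tasks() are hypothetical, and it assumes a Linux /proc with the usual "State:" line in /proc/PID/task/TID/status.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter ('R', 'S', 'T', ...) of one
   task, read from the "State:" line of /proc/PID/task/TID/status.
   Returns 0 if the file cannot be read or has no such line. */
char task_state(pid_t pid, pid_t tid)
{
    char path[64], line[256];
    char state = 0;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%d/task/%d/status", (int)pid, (int)tid);
    f = fopen(path, "r");
    if (f == NULL)
        return 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "State:", 6) == 0) {
            const char *p = line + 6;
            while (*p != '\0' && isspace((unsigned char)*p))
                p++;            /* skip the whitespace after "State:" */
            state = *p;         /* first non-blank character is the letter */
            break;
        }
    }
    fclose(f);
    return state;
}

/* Hypothetical helper: count the tasks of PID whose state is not 'T'.
   A task that cannot be read (for example, it exited meanwhile) is
   counted as unstopped. Returns -1 if /proc/PID/task cannot be opened. */
int count_unstopped_tasks(pid_t pid)
{
    char path[64];
    DIR *dir;
    struct dirent *de;
    int unstopped = 0;

    assert(pid > 0);
    snprintf(path, sizeof path, "/proc/%d/task", (int)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL) {
        int tid;
        if (sscanf(de->d_name, "%d", &tid) != 1)
            continue;           /* skips "." and ".." */
        if (task_state(pid, (pid_t)tid) != 'T')
            unstopped++;
    }
    closedir(dir);
    return unstopped;
}
```

Called on the tracee's pid after ptrace(PTRACE_DETACH, pid, NULL, SIGSTOP), a result of 0 would correspond to the expected outcome in this report, and a positive result to the failure described above.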
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Unfortunately the previous automated notification about the non-inclusion of this request in Red Hat Enterprise Linux 5.3 used the wrong text template. It should have read: this request has been reviewed by Product Management and is not planned for inclusion in the current minor release of Red Hat Enterprise Linux. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?" or raise an exception.
It is a regression against RHEL-4. It regresses issue 78487.
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
Pushing to RHEL 5.4, as this problem was not fixed for 5.3 release due to lower priority compared to other issues.
Updating PM score.
Jan, Roland,

Do we really want to fix this? This matches upstream.

Otoh, we are going to remove this extra wakeup sooner or later (this is discussed on lkml right now), and rhel6 already differs here.

I never understood rhel5's utrace code in details but at first glance everything is clear and this behaviour is intentional, ptrace_detach() has a huge comment before it clears SIGNAL_STOP_STOPPED.

Confused.
(In reply to comment #16)
>
> Jan, Roland,
>
> Do we really want to fix this? This matches upstream.

Yes, but my initial analysis was wrong.

> Otoh, we are going to remove this extra wakeup sooner or later
> (this is discussed on lkml right now),

yes, this wrong wakeup can abort the group-stop, but this case is unlikely, while the test-case always fails.

> I never understood rhel5's utrace code in details but at first
> glance everything is clear and this behaviour is intentional,
> ptrace_detach() has a huge comment before it clears
> SIGNAL_STOP_STOPPED.

No, I misread detach-stopped.c, there is something else.

Still investigating...
Created attachment 476671
[patch] fix ptrace(PTRACE_DETACH, SIGSTOP)

Seems to fix the problem, but I'll try to think a bit more.

The problem is, ptrace_detach()->ptrace_induce_signal() does utrace_inject_signal(action => UTRACE_ACTION_RESUME) and this means that "add SIGNAL_STOP_DEQUEUED" logic never works.

I think it is safer to change utrace_get_signal() like this patch does, if we want to fix this bug.
[RHEL5 PATCH 1/1] bz456333: ptrace(PTRACE_DETACH, SIGSTOP) does not stop
http://post-office.corp.redhat.com/archives/rhkernel-list/2011-February/msg00225.html
(In reply to comment #19)
> [RHEL5 PATCH 1/1] bz456333: ptrace(PTRACE_DETACH, SIGSTOP) does not stop
> http://post-office.corp.redhat.com/archives/rhkernel-list/2011-February/msg00225.html

It was decided we do not want to fix this:

- it is not a regression
- even today's kernel still behaves this way