Bug 593047 - pthread cleanup handler is not invoked during thread cancellation
Summary: pthread cleanup handler is not invoked during thread cancellation
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: glibc
Version: 5.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Andreas Schwab
QA Contact: qe-baseos-tools
URL:
Whiteboard:
Keywords: ZStream
Depends On:
Blocks: 594617
Reported: 2010-05-17 17:37 UTC by Alan Matsuoka
Modified: 2018-10-27 16:07 UTC (History)
Technical Notes: Under certain conditions, cancellation of a thread did not invoke a cleanup handler. This update adds more complete information to the unwind library for glibc, so that when a thread is canceled, the cleanup handler is invoked before the thread terminates, under all circumstances.
Clone Of:
Last Closed: 2011-01-14 00:04:56 UTC


Attachments
thread_clnl_call.C (2.33 KB, text/plain)
2010-05-17 17:39 UTC, Alan Matsuoka


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2011:0109
Priority: normal
Status: SHIPPED_LIVE
Summary: glibc bug fix and enhancement update
Last Updated: 2011-01-12 17:29:09 UTC

Description Alan Matsuoka 2010-05-17 17:37:22 UTC
Description of problem:
The pthread cleanup handler is not invoked on RHEL5 in certain cases. The issue was observed when a thread that is calling clnt_call() receives a cancellation request.

How reproducible:
Always

Steps to Reproduce:
To reproduce this, we created a sample program that calls clnt_call() (an RPC call) in a loop in one thread while another thread cancels it.

1. On RHEL5 (x86_64), compile the attached program as
"g++ -g --save-temps -Wall -Wno-unused -Wno-long-long -Wwrite-strings -fmessage-length=0 -D__EXTERN_C__ -DLINUX -Dregister= -D_POSIX_PTHREAD_SEMANTICS -m32 -o thread_exp thread_clnl_call.C -lnsl -lpthread  -lcrypt -lrt"

2. Make sure that the NFS daemon is running on the same machine, then execute the program ("thread_exp") there.


Actual results:
Starting child thread.
Canceling child thread.
Child thread terminated.


Expected results:
Starting child thread.
Canceling child thread.
Cleanup handler called.
Child thread terminated.
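
The expected sequence can be illustrated without RPC. What follows is a minimal sketch, not the attached reproducer: it assumes poll() with an infinite timeout as a stand-in for the blocking clnt_call(), and the names cleanup_handler, blocking_call, child and run_cancel_demo are hypothetical. It only shows what a working cancellation looks like; it is not expected to trigger the bug.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Set by the cleanup handler so the caller can check that it ran. */
static volatile int cleanup_ran = 0;

static void cleanup_handler(void *arg)
{
    (void)arg;
    cleanup_ran = 1;
    printf("Cleanup handler called.\n");
}

/* Assumption: stand-in for the blocking RPC call. poll() with no
   descriptors and an infinite timeout blocks until the thread is
   canceled, and poll() is a cancellation point. */
static void blocking_call(void)
{
    poll(NULL, 0, -1);
}

static void *child(void *arg)
{
    (void)arg;
    pthread_cleanup_push(cleanup_handler, NULL);
    for (;;)
        blocking_call();
    pthread_cleanup_pop(0);   /* not reached; must pair with the push */
    return NULL;
}

/* Returns 1 if the cleanup handler ran during cancellation, 0 if it
   did not, and -1 if the thread could not be created. */
int run_cancel_demo(void)
{
    pthread_t tid;
    void *result;

    cleanup_ran = 0;
    printf("Starting child thread.\n");
    if (pthread_create(&tid, NULL, child, NULL) != 0)
        return -1;

    sleep(1);                 /* give the child time to block in poll() */
    printf("Canceling child thread.\n");
    pthread_cancel(tid);
    pthread_join(tid, &result);
    assert(result == PTHREAD_CANCELED);
    printf("Child thread terminated.\n");
    return cleanup_ran;
}
```

Calling run_cancel_demo() from main() on a system where cancellation unwinding works is expected to print the four lines listed under "Expected results" and return 1. A return value of 0 would correspond to the "Actual results" above.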


Additional info:
The reproducer is from
http://sourceware.org/ml/glibc-bugs/2007-03/msg00006.html

This bug is only seen when running in 32-bit mode. The reproducer works in 64-bit mode.

I've been stepping through the code. It gets into the stack unwinding portion of the cancellation code and reaches the part where it is supposed to call the cleanup routine, but for some reason the cleanup routine does not get called.

Comment 1 Alan Matsuoka 2010-05-17 17:39:45 UTC
Created attachment 414623
thread_clnl_call.C

Comment 2 Andreas Schwab 2010-05-18 13:43:58 UTC
Most of the unwinding stuff comes from libgcc.

Comment 3 Jakub Jelinek 2010-05-18 13:56:20 UTC
That's true, but in this case it is far more important whether whatever function blocks in clnt_call and all its callers up to clnt_call itself are compiled with -fexceptions or -fasynchronous-unwind-tables.

On x86_64 the latter is the default, but on i?86 it is not.
In RHEL6 glibc is built with -fasynchronous-unwind-tables in CFLAGS, see #216518,
but in RHEL5 and earlier only selected object files are built that way.  Guess the set might need to be extended...
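
To see why the compile flags matter, it can help to look at how far the libgcc unwinder is able to walk from a given point. The following is a rough sketch under stated assumptions: it uses _Unwind_Backtrace from <unwind.h> to count frames, which is not the forced-unwind path that thread cancellation takes, but both rely on the same per-function unwind tables. The names count_frame, unwind_depth and unwind_depth_nested are hypothetical.

```c
#include <unwind.h>

/* Callback for _Unwind_Backtrace: count each frame the unwinder visits. */
static _Unwind_Reason_Code count_frame(struct _Unwind_Context *ctx, void *arg)
{
    (void)ctx;
    ++*(int *)arg;
    return _URC_NO_REASON;    /* keep walking */
}

/* Number of frames the unwinder can walk, starting from this function. */
__attribute__((noinline)) int unwind_depth(void)
{
    int frames = 0;
    _Unwind_Backtrace(count_frame, &frames);
    return frames;
}

/* Same measurement taken one call level deeper. The volatile local
   keeps the compiler from turning the call into a tail call. */
__attribute__((noinline)) int unwind_depth_nested(void)
{
    volatile int frames = unwind_depth();
    return frames;
}
```

When every function on the stack has unwind tables, unwind_depth_nested() should report more frames than unwind_depth() called from the same place. If a function in the chain was built without unwind info, the walk stops early at that frame, which is the kind of gap that would keep a cancellation unwind from reaching the frame that registered the cleanup handler.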

Comment 4 Andreas Schwab 2010-05-18 14:30:49 UTC
There are a lot of functions in the call chain....

#0  0x00f07402 in __kernel_vsyscall ()
#1  0x0065d133 in poll () from /lib/libc.so.6
#2  0x0068b32f in readtcp () from /lib/libc.so.6
#3  0x0069147d in xdrrec_getbytes () from /lib/libc.so.6
#4  0x0069165b in xdrrec_getlong () from /lib/libc.so.6
#5  0x006903d1 in xdr_u_long_internal () from /lib/libc.so.6
#6  0x0068d5ec in xdr_replymsg_internal () from /lib/libc.so.6
#7  0x0068b0ed in clnttcp_call () from /lib/libc.so.6

Guess the only reliable fix is to add -fasynchronous-unwind-tables throughout.

Comment 7 Jakub Jelinek 2010-05-19 12:25:14 UTC
Note that I'm not 100% sure whether just enabling -fasynchronous-unwind-tables in the spec file for everything is safe. I vaguely remember that Makefile changes were needed to make that happen (some object files intentionally don't want to have unwind info, e.g. to serve as a stop frame for unwinding, or I think -fasynchronous-unwind-tables breaks the ctor/dtor csu splitting).  So perhaps just enabling it in sunrpc/ would be safer for RHEL5.

Comment 13 Martin Prpič 2010-12-02 11:19:02 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Under certain conditions, cancellation of a thread did not invoke a cleanup handler. This update adds more complete information to the unwind library for glibc, so that when a thread is canceled, the cleanup handler is invoked before the thread terminates, under all circumstances.

Comment 15 errata-xmlrpc 2011-01-14 00:04:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0109.html

