Bug 248671 - [RHEL4] pthreads deadlock with setuid
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: glibc
Version: 4.4
Hardware: i686 Linux
Priority: low
Severity: medium
Assigned To: Jeff Law
QA Contact: Brian Brock

Reported: 2007-07-17 23:59 EDT by Mikhail Kruk
Modified: 2012-01-16 12:58 EST (History)
Doc Type: Bug Fix
Last Closed: 2012-01-16 12:58:24 EST

Attachments
sample program (787 bytes, text/plain)
2007-07-17 23:59 EDT, Mikhail Kruk
modified version of reproducer referred to above (1.05 KB, text/plain)
2007-07-19 19:41 EDT, Ernie Petrides


External Trackers
Tracker     ID    Priority  Status  Summary  Last Updated
Sourceware  3270  None      None    None     Never

Description Mikhail Kruk 2007-07-17 23:59:55 EDT
Description of problem:

If a thread calls setuid() and then terminates, and another thread then
calls pthread_join() on it, a deadlock occurs.

Version-Release number of selected component (if applicable):

I'm not sure whether this is a glibc or a kernel problem. I tested these two
combinations:
2.6.9-42.0.10.ELsmp and glibc-2.3.4-2.25
2.6.9-55.ELsmp, glibc-2.3.2-95.30

This also seems to affect Red Hat 5, but I don't have the exact system configs
at hand right now (I can verify if needed).

How reproducible:

I have a small (53-line) demo program.

Steps to Reproduce:
1. gcc -pthread ths.c
2. ./a.out
3. profit!!!
  
Actual results:
launched threads, iter 0

Expected results:
launched threads, iter 0
joined threads
launched threads, iter 1
joined threads
...
many times

Additional info:
I'm not sure why anybody would call setuid() in a thread, but hey -- it's a bug.
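
The attached sample program is not reproduced on this page. As a rough
illustration only, a reproducer of the kind described might look like the
sketch below. The thread count, the setuid(getuid()) argument, and the names
worker() and run_iteration() are assumptions, not taken from the attachment.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define NTHREADS 8 /* assumed; the attachment's thread count is not shown */

/* Thread body: make one setuid() call, then return so the thread exits. */
static void *worker(void *arg)
{
    (void)arg;
    /* Assumption: set the UID to the current real UID, which is permitted
       for any user. The result is ignored on purpose. */
    (void)setuid(getuid());
    return NULL;
}

/* Launch up to NTHREADS workers, then join every thread that was created.
   Returns the number of threads joined. On an affected system the join
   loop may never finish. */
int run_iteration(int iter)
{
    pthread_t tids[NTHREADS];
    int launched = 0, joined = 0;

    while (launched < NTHREADS &&
           pthread_create(&tids[launched], NULL, worker, NULL) == 0)
        launched++;
    printf("launched threads, iter %d\n", iter);

    for (int i = 0; i < launched; i++)
        if (pthread_join(tids[i], NULL) == 0)
            joined++;
    printf("joined threads\n");
    return joined;
}
```

Calling run_iteration() in a loop from main() and building with gcc -pthread
should print the "launched"/"joined" pairs shown under Expected results on a
system without the problem.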
Comment 1 Mikhail Kruk 2007-07-17 23:59:55 EDT
Created attachment 159495
sample program
Comment 2 Ernie Petrides 2007-07-19 19:40:44 EDT
I've reproduced this problem with stock RHEL4.5 and also with a recent
interim build (2.6.9-55.14.EL) of the RHEL4.6 kernel under development
(I tried the latter because a futex fix was applied earlier in U6).

I think the problem lies in the thread clean-up handling in glibc, and
thus I'm reassigning this BZ appropriately.  The setuid() calls in each
thread all succeed (if run as root) or all fail (if run as non-root),
but it seems that one (or sometimes two or three) thread(s) never fully
exit.  They complete execution of the thread function, but a pthread_kill()
call from the parent still finds them (i.e., the call returns 0 instead
of ESRCH), and a subsequent pthread_join() would wait indefinitely.  Note that
pthread_kill() is called with a signal argument of 0, which intentionally
does not kill the thread; it only checks whether the thread still exists.

I suspect that a pthread_kill() racing with a setuid() might be at the heart
of this problem, if only because replacing the setuid() with any of several
other syscalls makes the problem unreproducible.

I will attach a modified version of the reproducer, which contains some added
debugging logic.
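
For readers without access to the attachment, the signal-0 probe described
above can be sketched roughly as follows. The helper name probe_thread() and
its messages are hypothetical and are not taken from the modified reproducer.

```c
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper: probe a thread with signal 0. No signal is
   delivered; only pthread_kill()'s error checking is performed.
   Returns pthread_kill()'s result: 0 if the thread library still
   knows the thread, ESRCH if it does not, or another error number. */
int probe_thread(pthread_t tid)
{
    int rc = pthread_kill(tid, 0);

    if (rc == 0)
        fprintf(stderr, "probe: thread still present\n");
    else if (rc == ESRCH)
        fprintf(stderr, "probe: no such thread (ESRCH)\n");
    else
        fprintf(stderr, "probe: pthread_kill failed, error %d\n", rc);
    return rc;
}
```

Such a probe is only meaningful for a thread ID that has not yet been joined
or detached, and what it reports for a thread that has exited but not been
joined has varied between thread library implementations and versions, so it
is a debugging aid rather than a reliable liveness test.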
Comment 3 Ernie Petrides 2007-07-19 19:41:59 EDT
Created attachment 159623
modified version of reproducer referred to above
Comment 4 Jakub Jelinek 2007-07-20 03:49:18 EDT
Found what sounds like a dup of this, BZ#3270.
Comment 5 Ernie Petrides 2007-07-20 17:09:14 EDT
I forgot to mention that the thread(s) that get stuck (following execution
of their setuid() syscall) are in a futex() syscall for a FUTEX_WAIT op.
They are interruptible, i.e., a signal will effectively kill all threads
of the process.

Jakub, it seems that a couple of digits are missing from the BZ listed
in your prior comment.
Comment 6 Jakub Jelinek 2007-07-20 17:22:06 EDT
No, BZ#3270 in sourceware bugzilla, see External Bugzilla References.
Comment 7 Ernie Petrides 2007-07-20 17:29:11 EDT
Ah, got it, thanks.

In case anyone else has trouble finding the External Bugzilla References
section below :-),  you can use this link:

  http://sources.redhat.com/bugzilla/show_bug.cgi?id=3270
Comment 9 Jeff Law 2012-01-16 12:58:24 EST
We are not planning to fix this problem for Red Hat Enterprise Linux 4.

There have been numerous fixes for setxid & pthread_join in Red Hat Enterprise Linux 5 & 6.  However, #769852 is still open for Red Hat Enterprise Linux 5 (race condition can lead to hang in pthread_join after thread has called setuid).  I expect this will be fixed in Red Hat Enterprise Linux 5.9.
