Bug 248671

Summary: [RHEL4] pthreads deadlock with setuid
Product: Red Hat Enterprise Linux 4
Reporter: Mikhail Kruk <mkruk>
Component: glibc
Assignee: Jeff Law <law>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: low
Version: 4.4
CC: drepper, fweimer, jakub
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-01-16 17:58:24 UTC
Attachments:
  sample program (flags: none)
  modified version of reproducer referred to above (flags: none)

Description Mikhail Kruk 2007-07-18 03:59:55 UTC
Description of problem:

If a thread calls setuid() and then terminates, and another thread then
calls pthread_join() on it, a deadlock occurs.
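
The scenario can be sketched roughly as follows. This is not the attached 53-line program; it is a minimal illustration under stated assumptions: the thread count NUM_THREADS, the helper names worker(), run_iteration() and run_iterations(), and the choice of setuid(getuid()) (which needs no privileges) are all hypothetical. Whether this exact variant hangs on the affected versions has not been verified.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define NUM_THREADS 8  /* hypothetical count; the attachment's value is not given */

/* Each worker makes one setuid() call and then terminates. */
static void *worker(void *arg)
{
    (void)arg;
    /* Assumption: re-apply the current real uid so no privileges are needed. */
    if (setuid(getuid()) != 0)
        perror("setuid");
    return NULL;
}

/* Launches NUM_THREADS workers and joins every thread that was started.
 * Returns 0 on success or the first pthread error code seen. */
static int run_iteration(int iter)
{
    pthread_t tids[NUM_THREADS];
    int started = 0;
    int rc = 0;
    int i;

    for (i = 0; i < NUM_THREADS; i++) {
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        if (rc != 0)
            break;
        started++;
    }
    printf("launched threads, iter %d\n", iter);
    fflush(stdout);  /* keep the line visible even if a join never returns */

    for (i = 0; i < started; i++) {
        int join_rc = pthread_join(tids[i], NULL);
        if (rc == 0)
            rc = join_rc;
    }
    printf("joined threads\n");
    return rc;
}

/* Repeats the launch/join cycle; stops at the first error. */
int run_iterations(int count)
{
    int i;

    for (i = 0; i < count; i++) {
        int rc = run_iteration(i);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

Calling run_iterations() with a large count from main() and linking with -pthread approximates the repeated launch/join cycle; on an affected system the output would be expected to stop after a "launched threads" line.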

Version-Release number of selected component (if applicable):

I am not sure whether this is a glibc or a kernel problem. I tested these two
kernel/glibc combinations:
kernel 2.6.9-42.0.10.ELsmp, glibc-2.3.4-2.25
kernel 2.6.9-55.ELsmp, glibc-2.3.2-95.30

This also seems to affect Red Hat 5, but I don't have the exact system
configurations at hand (I can verify if needed).

How reproducible:

I have a small (53 line) demo program.

Steps to Reproduce:
1. gcc -pthread ths.c
2. ./a.out
3. profit!!!
  
Actual results:
launched threads, iter 0

Expected results:
launched threads, iter 0
joined threads
launched threads, iter 1
joined threads
...
many times

Additional info:
I'm not sure why anybody would call setuid() in a thread, but hey -- it's a bug.

Comment 1 Mikhail Kruk 2007-07-18 03:59:55 UTC
Created attachment 159495 [details]
sample program

Comment 2 Ernie Petrides 2007-07-19 23:40:44 UTC
I've reproduced this problem with stock RHEL4.5 and also with a recent
interim build (2.6.9-55.14.EL) of the RHEL4.6-under-development kernel.
I tested the latter because a futex fix was applied earlier in U6.

I think the problem lies in the thread clean-up handling in glibc, and
thus I'm reassigning this BZ appropriately.  The setuid() calls in each
thread all succeed (if run as root) or all fail (if run as non-root),
but it seems that one (or sometimes two or three) thread(s) never fully
exit.  They complete execution of the thread function, but the pthread_kill()
function from the parent still finds them (i.e., the call returns 0 instead
of ESRCH) and a subsequent pthread_join() would wait indefinitely.  Note that
pthread_kill() is called with a signal argument of 0, which intentionally
performs only the error checking and does not deliver a signal to the thread.
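
For readers unfamiliar with the signal-0 probe, here is a minimal sketch of the technique. It is not taken from the attached reproducer; thread_visible(), blocked_worker() and probe_running_thread() are hypothetical names. The result for a thread that has exited but not yet been joined differs between glibc versions, so treat this as a diagnostic aid rather than a portable liveness test.

```c
#include <errno.h>
#include <pthread.h>
#include <signal.h>

/* Returns 1 if the library still reports the thread, 0 if it reports
 * ESRCH, and -1 for any other error. */
int thread_visible(pthread_t tid)
{
    int rc = pthread_kill(tid, 0);  /* signal 0: error checking only */

    if (rc == 0)
        return 1;
    if (rc == ESRCH)
        return 0;
    return -1;
}

static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

/* Blocks on the gate so the thread is guaranteed to exist while probed. */
static void *blocked_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate);
    pthread_mutex_unlock(&gate);
    return NULL;
}

/* Probes a thread that cannot have exited yet, then lets it finish.
 * Returns the probe result, or -1 if the thread could not be created. */
int probe_running_thread(void)
{
    pthread_t tid;
    int visible;

    pthread_mutex_lock(&gate);
    if (pthread_create(&tid, NULL, blocked_worker, NULL) != 0) {
        pthread_mutex_unlock(&gate);
        return -1;
    }
    visible = thread_visible(tid);  /* worker is still waiting on the gate */
    pthread_mutex_unlock(&gate);
    pthread_join(tid, NULL);
    return visible;
}
```

In the situation described above, the same kind of probe is applied to threads whose thread function has already returned, and the surprising observation is that it keeps returning 0.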

I suspect that a pthread_kill() racing with a setuid() might be at the heart
of this problem, only because changing the setuid() to several other syscalls
makes the problem unreproducible.

I will attach a modified version of the reproducer, which contains some added
debugging logic.


Comment 3 Ernie Petrides 2007-07-19 23:41:59 UTC
Created attachment 159623 [details]
modified version of reproducer referred to above

Comment 4 Jakub Jelinek 2007-07-20 07:49:18 UTC
Found what sounds like a dup of this, BZ#3270.

Comment 5 Ernie Petrides 2007-07-20 21:09:14 UTC
I forgot to mention that the thread(s) that get stuck (following execution
of their setuid() syscall) are in a futex() syscall for a FUTEX_WAIT op.
They are interruptible, i.e., a signal will effectively kill all threads
of the process.

Jakub, it seems that a couple of digits are missing from the BZ listed
in your prior comment.


Comment 6 Jakub Jelinek 2007-07-20 21:22:06 UTC
No, BZ#3270 in sourceware bugzilla, see External Bugzilla References.

Comment 7 Ernie Petrides 2007-07-20 21:29:11 UTC
Ah, got it, thanks.

In case anyone else has trouble finding the External Bugzilla References
section below :-),  you can use this link:

  http://sources.redhat.com/bugzilla/show_bug.cgi?id=3270

Comment 9 Jeff Law 2012-01-16 17:58:24 UTC
We are not planning to fix this problem for Red Hat Enterprise Linux 4.

There have been numerous fixes for setxid & pthread_join in Red Hat Enterprise Linux 5 & 6.  However, #769852 is still open for Red Hat Enterprise Linux 5 (race condition can lead to hang in pthread_join after thread has called setuid).  I expect this will be fixed in Red Hat Enterprise Linux 5.9.