Bug 151502
Summary: | pthread_cancel crashes (segv) on invalid thread ID | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 3 | Reporter: | Patrick Quaid <pquaid> |
Component: | glibc | Assignee: | Jakub Jelinek <jakub> |
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 3.0 | CC: | fweimer, jizhq76, M8R-7fin56, stsp2 |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-03-20 15:14:45 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Patrick Quaid
2005-03-18 18:38:31 UTC
See POSIX: http://www.opengroup.org/onlinepubs/009695399/functions/pthread_cancel.html ERRORS The pthread_cancel() function may fail if: [ESRCH] No thread could be found corresponding to that specified by the given thread ID. The important word here is `may'. In NPTL where thread ID is a pointer to thread's structure rather than a small index into an array of threads, making sure ESRCH is returned for all invalid values would mean quite a big overhead just to make broken programs happy. As the standard doesn't require this, NPTL just doesn't do that checking. It is similar to passing invalid FILE * argument to stdio functions, there the program will crash rather than give you an error. I don't seem to recall any descriptions in POSIX that explicitly describe a segfault as an allowable implementation, so I guess your argument is that a segfault is implicitly acceptable behavior for *any* POSIX function in place of *any* of the defined errors. OK, that's put a little strongly. It's certainly a quality of implementation issue, at the very least. AIX and Solaris, not to mention all earlier versions of Red Hat, seem to run the test program without surprises. The difference between this case and the stdio case is that pthread_t is entirely opaque. FILE is opaque too, but (crucially) FILE* is not. It's easy to initialize a FILE* to NULL and know whether or not it's valid; I know of no analogous way to use pthread_t portably. Passing garbage to pthread_cancel() is not exactly central to our design, but we ran across this innocently enough, in a case somewhat like the example program in which we had not yet initialized a thread before shutting down due to unrelated errors. In the current implementation, passing in 0 or other values outside the address space cause a segfault. But even passing in a "once valid" pthread_t can cause a segfault if the thread is long dead and joined: the original thread structure is allocated through malloc(), which calls mmap(), and then some time after the thread is dead, the memory is returned to the system with munmap(). Is that still possible? Is it still a good implementation? And if you start and kill off a thread, and then later that memory is reused for some other purpose, it certainly seems to me that you'll write to that memory area as long as it still passes that minimal "invalid" test noted above. Is it not valid to call pthread_cancel on a thread that's no longer running? How would you suggest that programs avoid that race? Is it still a good implementation? If a thread ends and a new thread is created, might the new thread reuse the same thread structure? In that case, I guess you would inadvertently cancel the wrong thread, again in the case in which the cancel raced with the exit. I agree that from a standard lawyer's point of view, the implementation may be compliant. But the implementation is still lousy. According the description of the man-page of pthread_kill(): #http://www.opengroup.org/onlinepubs/009695399/functions/pthread_kill.html ------------------------------------------------------------------------------- The pthread_kill() function shall fail if: [ESRCH] No thread could be found corresponding to that specified by the given thread ID. [EINVAL] The value of the sig argument is an invalid or unsupported signal number. The pthread_kill() function shall not return an error code of [EINTR]. ------------------------------------------------------------------------------- if the thread-id passed to pthread_kill() is invalid, pthread_kill() will crash as same as pthread_cancel() [segment fault]. Is there anyone can give some advise? thx a lot. Frankly, this interpretation by Jakub Jelinek is absolutely ludicrous. It's an excuse for lazy programming, founded on a (convenient?) inability to read the specification correctly! Jakub, which part of this statement from the specification don't you understand: "Upon successful completion, the function shall return a value of zero. Otherwise, the function shall return an error number." This leaves NO room for your interpretation, Jakub! The function is given two choices: it SHALL RETURN zero, or it SHALL return an error number. Unfortunately for your twisted logic, Jakub, crashing means the function doesn't return anything (because it never returns)! And (coming before the section you quote, which was "The pthread_cancel() function may fail if"), this previous statement makes clear the meaning of "may fail if". It is limiting the cases that may fail, and the error codes that may be returned, not giving you license to do whatever you feel like. If you're intellectually honest, Jakub, I suggest you re-open this bug. It is a very reasonable case that code wants to cancel a thread if it hasn't already exited, and your current interpretation leaves no safe way to accomplish this. There is a proposal pending to resolve this: http://austingroupbugs.net/view.php?id=599 For PTHREAD_NULL these function will (likely) return ESRCH just as you desire. |