Bug 128082

Summary: semop() problem occurring with multi-threaded application
Product: Red Hat Enterprise Linux 3 Reporter: Alexandre Oliva <aoliva>
Component: kernel Assignee: Ernie Petrides <petrides>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: linux-sid, lwang, nhorman, peterm, petrides, riel, tao, tburke
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
URL: http://www.ussg.iu.edu/hypermail/linux/kernel/0111.0/1462.html
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-09-09 00:27:59 UTC Type: ---
Bug Blocks: 133089    
Attachments:
Description Flags
Test program that exposes the problem, copied from the given URL none

Description Alexandre Oliva 2004-07-17 07:44:06 UTC
Semaphore posts are only supposed to be undone when the entire
process exits, not when the posting thread exits.  I'll attach a
program that demonstrates the problem.
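
For illustration, the expected behaviour can be sketched roughly as
below.  This is not the attached test program: the names
post_survives_thread_exit, poster and union semun_arg are hypothetical,
and the sketch assumes Linux with POSIX threads.  It uses IPC_NOWAIT
so that an undone post shows up as EAGAIN rather than a hang, and it
makes no attempt to close the window between pthread_join() returning
and the kernel finishing its exit processing for the thread.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* Hypothetical stand-in for "union semun", which the caller must
 * supply on Linux for the fourth argument of semctl(). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Poster thread: post once with SEM_UNDO, then exit.
 * Returns NULL on success, a non-NULL pointer on failure. */
static void *poster(void *arg)
{
    int semid = *(int *)arg;
    struct sembuf post = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };

    return semop(semid, &post, 1) == -1 ? arg : NULL;
}

/* Returns 1 if the post is still there after the poster thread has
 * exited, 0 if it was undone, and -1 if the setup itself failed. */
int post_survives_thread_exit(void)
{
    int result = -1;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun_arg init = { .val = 0 };
    pthread_t tid;
    void *failed = NULL;

    if (semctl(semid, 0, SETVAL, init) == 0 &&
        pthread_create(&tid, NULL, poster, &semid) == 0 &&
        pthread_join(tid, &failed) == 0 && failed == NULL) {
        /* Waiter: non-blocking decrement in the surviving thread. */
        struct sembuf wait_op = { .sem_num = 0, .sem_op = -1,
                                  .sem_flg = IPC_NOWAIT };
        if (semop(semid, &wait_op, 1) == 0)
            result = 1;
        else if (errno == EAGAIN)
            result = 0;
    }

    semctl(semid, 0, IPC_RMID);   /* remove the semaphore set */
    return result;
}
```

A return value of 1 corresponds to the waiter completing; 0 corresponds
to the waiter that would otherwise block forever.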

FWIW, the test program works on FC2, since kernel 2.6 has a variant of
the semundo patch in the URL above.  The patch that made it to 2.6
uses the CLONE_SYSVSEM flag to tell the kernel to use the same
semaphore group for the cloned process, and NPTL passes this flag to
clone.  Linuxthreads doesn't, but if I change clone there to pass the
same flag, it works with Linuxthreads as well.
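
The effect of the flag can be sketched with a direct clone() call, as
a thread library might make one.  This is a minimal sketch, assuming a
Linux kernel that implements CLONE_SYSVSEM and the glibc clone()
wrapper; the names post_survives_child_exit, child_post, union
semun_arg and CHILD_STACK_SIZE are hypothetical and come from neither
library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for clone() and CLONE_SYSVSEM */
#endif
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for the sketch */

/* Hypothetical stand-in for "union semun" (caller-defined on Linux). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Cloned task: post once with SEM_UNDO, then exit (status 0 on success). */
static int child_post(void *arg)
{
    int semid = *(int *)arg;
    struct sembuf post = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };

    return semop(semid, &post, 1) == -1 ? 1 : 0;
}

/* extra_flags is CLONE_SYSVSEM or 0.  Returns 1 if the post is still
 * there after the cloned task has exited, 0 if it was undone, and -1
 * if the setup itself failed. */
int post_survives_child_exit(int extra_flags)
{
    int result = -1;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun_arg init = { .val = 0 };
    char *stack = malloc(CHILD_STACK_SIZE);

    if (stack != NULL && semctl(semid, 0, SETVAL, init) == 0) {
        /* The stack grows down on common architectures, so pass its top. */
        pid_t pid = clone(child_post, stack + CHILD_STACK_SIZE,
                          CLONE_VM | extra_flags | SIGCHLD, &semid);
        int status;

        if (pid != -1 && waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            /* Non-blocking wait in the task that is still running. */
            struct sembuf wait_op = { .sem_num = 0, .sem_op = -1,
                                      .sem_flg = IPC_NOWAIT };
            if (semop(semid, &wait_op, 1) == 0)
                result = 1;
            else if (errno == EAGAIN)
                result = 0;
        }
    }

    free(stack);
    semctl(semid, 0, IPC_RMID);   /* remove the semaphore set */
    return result;
}
```

On a kernel that honours the flag, post_survives_child_exit(CLONE_SYSVSEM)
is expected to return 1, while post_survives_child_exit(0) is expected
to return 0, because the cloned task then has its own undo list and
that list is applied when the task exits.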

On RHEL3, the flag is passed to the clone syscall, but it has no
effect, because the kernel doesn't have the semundo patch.  This might
be because the patch actually changes a pointer in task_struct,
breaking the ABI.  Any chance we could backport such a patch anyway?

Comment 2 Alexandre Oliva 2004-07-17 07:53:12 UTC
Created attachment 101994 [details]
Test program that exposes the problem, copied from the given URL

The test program's output should look like:

Waiter, pid = 11490
Poster, pid = 11490, posting
Poster posted
Poster exiting
Waiter waiting, pid = 11490
Waiter done waiting

The incorrect output on RHEL3U2 with NPTL is:

Waiter, pid = 712
Poster, pid = 712, posting
Poster posted
Poster exiting
Waiter waiting, pid = 712

With Linuxthreads, the only difference is that different threads print
different pids.

Comment 3 Ernie Petrides 2004-07-22 22:51:16 UTC
Other bugzilla reports on this problem are 104575 and 107784.  I'm
reassigning this to myself in case we might have an opportunity to
fix this with a future relaxation of our Kernel Module Interface
compatibility policy.  (Correcting this problem requires a change
to the "task_struct", which would break KMI compat in a major way.)


Comment 4 Ernie Petrides 2004-07-22 22:53:48 UTC
*** Bug 104575 has been marked as a duplicate of this bug. ***

Comment 5 Ernie Petrides 2004-07-22 22:54:30 UTC
*** Bug 107784 has been marked as a duplicate of this bug. ***

Comment 6 Ernie Petrides 2005-09-09 00:27:59 UTC
This will never be fixed in RHEL3 due to KMI (kABI) compatibility constraints.