Bug 141282 - nptl futex_wait fix
nptl futex_wait fix
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel (Show other bugs)
3.0
All Linux
medium Severity medium
: ---
: ---
Assigned To: Ingo Molnar
Brian Brock
http://post-office.corp.redhat.com/ar...
:
Depends On:
Blocks: 132991
  Show dependency treegraph
 
Reported: 2004-11-29 21:00 EST by Marty Wesley
Modified: 2010-10-21 22:43 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-05-18 09:28:44 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Bugzilla 166292 None None None Never

  None (edit)
Description Marty Wesley 2004-11-29 21:00:33 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5)
Gecko/20041111 Firefox/1.0

Description of problem:
Post from Jakub about this upstream patch:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm5/broken-out/futex_wait-fix.patch

Both 2.4.x and 2.6.x futex.c is broken (racy).
Unfortunately Jamie Lokier does not acknowledge that (yet), though
he promissed he'll reply next day and has not done that in more than a
week
already.
For 2.4.x I believe it is easily fixable, like with the patch below
(untested though).  That is because we hold a spinlock accross the uaccess
anyway (by pinning the page first, then taking the lock, uaccessing,
etc.).
For 2.6.x the fix is harder, but I have posted some ideas to lkml.

Here is a mail I sent in September:

On Tue, Sep 07, 2004 at 08:04:55AM +0200, Ingo Molnar wrote:
> 
> Seto san, please find Jakub's reply below. He too agrees with your
> analysis and patch. Please send it to Andrew for upstream inclusion, i
> think it should get into 2.6.9.

Well, I think it would be cheaper to just move the __queue_me call down,
as in (untested RHEL3 patch below).
The 2.6.9 patch will likely be quite different though.

--- linux-2.4.21-20.EL/kernel/futex.c	2004-08-18 20:29:54.000000000 -0400
+++ linux-2.4.21-20.EL/kernel/futex.c	2004-09-07 03:33:34.035447554 -0400
@@ -346,23 +346,26 @@ static inline int futex_wait(unsigned lo
 		unlock_futex_mm();
 		return -EFAULT;
 	}
-	__queue_me(&q, page, uaddr, offset, -1, NULL);
 
 	/*
 	 * Page is pinned, but may no longer be in this address space.
 	 * It cannot schedule, so we access it with the spinlock held.
 	 */
-	if (!access_ok(VERIFY_READ, uaddr, 4))
-		goto out_fault;
+	if (!access_ok(VERIFY_READ, uaddr, 4)) {
+		ret = -EFAULT;
+		goto out_unlock;
+	}
 	kaddr = kmap_atomic(page, KM_USER0);
 	curval = *(int*)(kaddr + offset);
 	kunmap_atomic(kaddr, KM_USER0);
 
 	if (curval != val) {
-		unlock_futex_mm();
 		ret = -EWOULDBLOCK;
-		goto out;
+		goto out_unlock;
 	}
+
+	__queue_me(&q, page, uaddr, offset, -1, NULL);
+
 	/*
 	 * The get_user() above might fault and schedule so we
 	 * cannot just set TASK_INTERRUPTIBLE state when queueing
@@ -372,10 +375,9 @@ static inline int futex_wait(unsigned lo
 	 */
 	add_wait_queue(&q.waiters, &wait);
 	set_current_state(TASK_INTERRUPTIBLE);
-	if (!list_empty(&q.list)) {
-		unlock_futex_mm();
-		time = schedule_timeout(time);
-	}
+	BUG_ON (list_empty(&q.list));
+	unlock_futex_mm();
+	time = schedule_timeout(time);
 	set_current_state(TASK_RUNNING);
 	/*
 	 * NOTE: we don't remove ourselves from the waitqueue because
@@ -395,10 +397,10 @@ out:
 
 	return ret;
 
-out_fault:
+out_unlock:
 	unlock_futex_mm();
-	ret = -EFAULT;
-	goto out;
+	put_page(q.page);
+	return ret;
 }
 
 long do_futex(unsigned long uaddr, int op, int val, unsigned long
timeout,


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
 

Additional info:
Comment 2 Ernie Petrides 2005-01-29 00:54:48 EST
A fix for this problem has just been committed to the RHEL3 U5
patch pool this evening (in kernel version 2.4.21-27.10.EL).
Comment 6 Tim Powers 2005-05-18 09:28:44 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-294.html
Comment 7 mike 2005-09-12 12:41:07 EDT
pls see Bug 166292
patch does not seem to fix issues, (in 2.4.21-32.0.1.ELsmp)

Note You need to log in before you can comment on or make changes to this bug.