Bug 563073

Summary: kernel: race in ptrace
Product: [Other] Security Response
Reporter: Eugene Teo (Security Response) <eteo>
Component: vulnerability
Assignee: Red Hat Product Security <security-response-team>
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Version: unspecified
CC: arozansk, bhu, cebbert, davej, dhoward, jolsa, jpirko, kmcmartin, lgoncalv, lwang, onestero, plyons, roland, vgoyal, vmayatsk, williams
Keywords: Security
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-02-15 06:05:17 UTC
Bug Depends On: 563074, 563075, 563076, 563077, 563078, 563079

Description Eugene Teo (Security Response) 2010-02-09 05:34:26 UTC
Description of problem:
A race in ptrace was pointed out to us by a fellow Google engineer, Tavis Ormandy.  The race involves interaction between a tracer, a tracee and an antagonist.  The tracer is tracing the tracee with PTRACE_SYSCALL and waits on the tracee.  In the meantime, an antagonist blasts the tracee with SIGCONTs.

The observed issue is that sometimes, when the tracer attempts to continue the tracee with PTRACE_SYSCALL, it gets a return value of -ESRCH, indicating that the tracee is already running (or not being traced).  It turns out that a SIGCONT wakes up the tracee in kernel mode, and for a moment the tracee's state is TASK_RUNNING.  Then in ptrace_stop we hit the condition where the tracee is found to be running (and thus not traced).  If the syscall is repeated, the second attempt usually succeeds (because by that time, the tracee has been put into TASK_TRACED).
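
The observation that a repeated call usually succeeds can be sketched from the tracer's side as a bounded retry.  This is only an illustration, not part of the proposed fix: resume_fn, resume_retrying and resume_flaky are hypothetical names, and the retry limit is an arbitrary assumption.  Note that in userspace ptrace() returns -1 and sets errno to ESRCH, and that ESRCH can also mean the tracee is really gone, so a retry cannot tell the two cases apart.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical callback: one resume attempt; returns -1 and sets errno on failure. */
typedef long (*resume_fn)(pid_t pid, void *ctx);

/* Real resume operation: let the tracee run until its next syscall stop. */
long resume_ptrace_syscall(pid_t pid, void *ctx)
{
    (void)ctx;
    return ptrace(PTRACE_SYSCALL, pid, NULL, NULL);
}

/* Stand-in for experiments without a tracee: fails with ESRCH *ctx times, then succeeds. */
long resume_flaky(pid_t pid, void *ctx)
{
    int *failures_left = ctx;

    (void)pid;
    if (*failures_left > 0) {
        (*failures_left)--;
        errno = ESRCH;
        return -1;
    }
    return 0;
}

/*
 * Try to resume the tracee up to max_tries times, retrying only on ESRCH.
 * Returns 0 on success, or -1 with errno set by the last failed attempt.
 */
int resume_retrying(pid_t pid, resume_fn resume, void *ctx, int max_tries)
{
    assert(resume != NULL && max_tries > 0);

    for (int attempt = 0; attempt < max_tries; attempt++) {
        if (resume(pid, ctx) == 0)
            return 0;
        if (errno != ESRCH)
            return -1;      /* some other error: retrying will not help */
        sched_yield();      /* give the tracee a chance to reach TASK_TRACED */
    }
    errno = ESRCH;
    return -1;
}
```

A tracer would call resume_retrying(pid, resume_ptrace_syscall, NULL, 3) in place of a bare ptrace(PTRACE_SYSCALL, ...) call.  This only papers over the race from userspace; it does not close it.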

Below is a quick and dirty fix for the one instance that I did figure out.  Note that this doesn't completely close the race on 2.6.33-rc6, although on 2.6.26 it appears to be sufficient.  I suspect there are other code paths with similar issues:

    Fix a race in ptrace.
    
    Race description:
    
    The traced process is running for a small duration
    of time between when it is sent a SIGCONT and when it realizes that it
    needs to be asleep in order to get traced.  If during this time the 
    tracer calls ptrace with PTRACE_SYSCALL, it receives an errno value of
    -ESRCH.
    
    Solution:
    
    We add a new bit to the ptrace field of task_struct.  We call this 
    PT_WAKING. When the process is being awoken for a SIGCONT signal, we set 
    this bit before state changes to TASK_RUNNING.  When the process is about
    to go to sleep, we reset this bit after we change the state to TASK_TRACED.

Signed-off-by: Salman Qazi <sqazi>

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 56f2d63..6c6771a 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -67,8 +67,9 @@
 #define PT_TRACE_EXEC	0x00000080
 #define PT_TRACE_VFORK_DONE	0x00000100
 #define PT_TRACE_EXIT	0x00000200
+#define PT_WAKING	0x00000400
 
-#define PT_TRACE_MASK	0x000003f4
+#define PT_TRACE_MASK	0x000007f4
 
 /* single stepping state bits (used on ARM and PA-RISC) */
 #define PT_SINGLESTEP_BIT	31
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 23bd09c..32157f8 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -104,7 +104,8 @@ int ptrace_check_attach(struct task_struct *child, int kill)
 		spin_lock_irq(&child->sighand->siglock);
 		if (task_is_stopped(child))
 			child->state = TASK_TRACED;
-		else if (!task_is_traced(child) && !kill)
+		else if (!task_is_traced(child) && !kill &&
+				(!(child->ptrace & PT_WAKING)))
 			ret = -ESRCH;
 		spin_unlock_irq(&child->sighand->siglock);
 	}
diff --git a/kernel/signal.c b/kernel/signal.c
index 934ae5e..095507e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -697,6 +697,10 @@ static int prepare_signal(int sig, struct task_struct *p, int from_ancestor_ns)
 		 * and wake all threads.
 		 */
 		rm_from_queue(SIG_KERNEL_STOP_MASK, &signal->shared_pending);
+		if (p->ptrace & PT_PTRACED) {
+			p->ptrace |= PT_WAKING;
+			mb();
+		}
 		t = p;
 		do {
 			unsigned int state;
@@ -1626,6 +1630,10 @@ static void ptrace_stop(int exit_code, int clear_code, siginfo_t *info)
 
 	/* Let the debugger run.  */
 	__set_current_state(TASK_TRACED);
+	if (current->ptrace & PT_PTRACED) {
+		mb();
+		current->ptrace &= ~PT_WAKING;
+	}
 	spin_unlock_irq(&current->sighand->siglock);
 	read_lock(&tasklist_lock);
 	if (may_ptrace_stop()) {
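
The effect of the PT_WAKING bit on the patched check can be illustrated with a small userspace model.  This is a sketch under stated assumptions, not kernel code: struct model_task, enum model_state and the model_* functions are hypothetical, locking and memory barriers are left out, and only the hunks shown above are mirrored.  PT_WAKING and both mask values are copied from the patch; PT_PTRACED is taken to be 0x00000001 as in include/linux/ptrace.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define PT_PTRACED        0x00000001u  /* assumed from include/linux/ptrace.h */
#define PT_WAKING         0x00000400u  /* new bit added by the patch */
#define PT_TRACE_MASK_OLD 0x000003f4u
#define PT_TRACE_MASK_NEW 0x000007f4u

/* The new mask is the old mask plus the PT_WAKING bit. */
static_assert((PT_TRACE_MASK_OLD | PT_WAKING) == PT_TRACE_MASK_NEW,
              "mask must grow by exactly PT_WAKING");

/* Hypothetical, simplified stand-ins for task state and task_struct. */
enum model_state { MODEL_RUNNING, MODEL_STOPPED, MODEL_TRACED };

struct model_task {
    enum model_state state;
    unsigned int ptrace;
};

/* Mirrors the patched branch in ptrace_check_attach(). */
int model_check_attach(struct model_task *child, bool kill)
{
    int ret = 0;

    if (child->state == MODEL_STOPPED)
        child->state = MODEL_TRACED;
    else if (child->state != MODEL_TRACED && !kill &&
             !(child->ptrace & PT_WAKING))
        ret = -ESRCH;
    return ret;
}

/* Mirrors the prepare_signal() hunk: set the bit before the task runs. */
void model_sigcont(struct model_task *t)
{
    if (t->ptrace & PT_PTRACED)
        t->ptrace |= PT_WAKING;
    t->state = MODEL_RUNNING;   /* the wakeup as described in the report */
}

/* Mirrors the ptrace_stop() hunk: clear the bit once back in the traced state. */
void model_ptrace_stop(struct model_task *t)
{
    t->state = MODEL_TRACED;
    if (t->ptrace & PT_PTRACED)
        t->ptrace &= ~PT_WAKING;
}
```

In this model, a traced task that has just been hit by model_sigcont() passes model_check_attach() because PT_WAKING is set, whereas a running task without the bit still yields -ESRCH.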

Reference:
http://lkml.org/lkml/2010/2/8/327