File: linux-2.6.9-sys_times-Fix-system-unresponsiveness-during-many-c.patch diff -up linux-2.6.9/kernel/sys.c.orig linux-2.6.9/kernel/sys.c --- linux-2.6.9/kernel/sys.c.orig 2008-04-04 10:42:38.000000000 +0200 +++ linux-2.6.9/kernel/sys.c 2008-05-29 10:56:42.000000000 +0200 @@ -975,7 +975,8 @@ asmlinkage long sys_times(struct tms __u struct task_struct *t; unsigned long utime, stime, cutime, cstime; - read_lock(&tasklist_lock); + spin_lock_irq(&tsk->proc_lock); + spin_lock_irq(&tsk->sighand->siglock); utime = tsk->signal->utime; stime = tsk->signal->stime; t = tsk; @@ -994,11 +995,10 @@ asmlinkage long sys_times(struct tms __u * To make sure we always see that pair updated atomically, * we take the siglock around fetching them. */ - spin_lock_irq(&tsk->sighand->siglock); cutime = tsk->signal->cutime; cstime = tsk->signal->cstime; spin_unlock_irq(&tsk->sighand->siglock); + spin_unlock_irq(&tsk->proc_lock); - read_unlock(&tasklist_lock); tmp.tms_utime = jiffies_to_clock_t(utime); tmp.tms_stime = jiffies_to_clock_t(stime); Peter Zijlstra noticed incorrect order of locks acquiring: "This one can actually cause trouble, you already enabled IRQs one line up, so _IF_ an interrupt happens while we still hold the proc_lock, and that interrupt also wants to acquire the lock, we deadlock!" Other notice was: "Hmm, other code doesn't seem to require proc_lock to be irq safe - so why was this done?"
This ia follow-on clean up patch for bug 435280, posted by Vitaly and integrated/merged in on -71.EL
Created attachment 308287 [details] The patch itself
Committed in 72.EL . RPMS are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0665.html