Description of problem:
An SMP system may freeze, typically under frequent fork/exit load. This
is a known problem which was already fixed in 2.5.32; the patch will be
attached. It happens because the kernel acquires the runqueue lock and
tasklist_lock in two different orders: one CPU may try to lock a
runqueue after taking tasklist_lock, while another tries to take
tasklist_lock after a runqueue lock.

Version-Release number of selected component (if applicable):
2.4.21-3.EL

How reproducible:
Difficult to reproduce.

Steps to Reproduce:
1. Run the following program on a machine with many CPUs.
- - - - - - - - - - - - - - - - - - - -
#include <stdlib.h>     /* exit(), system() */
#include <sys/wait.h>   /* wait() */
#include <unistd.h>     /* fork() */

#define NPROC 128

int main(int argc, char** argv)
{
        int i, status;

        /* Fan out to NPROC processes, each of which then forks and
         * reaps short-lived children forever. */
        for (i = 1; i < NPROC; i++)
                fork();
        while (1) {
                if (fork() == 0) {
                        system("exit"); /* cheap shell exec/exit */
                        exit(0);
                } else {
                        wait(&status);
                }
        }
}
- - - - - - - - - - - - - - - - - - - -

Actual results:
The system freezes (deadlock).

Expected results:
The system keeps running.

Additional info:
The IA-64 Linux kernel may call wrap_mmu_context() during
context_switch() to find an unused context number. wrap_mmu_context()
holds tasklist_lock while searching through the task list. On the other
hand, some exit-related functions grab tasklist_lock to remove a task
from the list and then try to take the runqueue lock to wake up the
parent. Because context_switch() is called with the runqueue lock held,
there are two different orders in which tasklist_lock and the runqueue
lock are acquired, and this can deadlock. Example:

CPU#0: schedule() -> spin_lock_irq(&rq->lock) -> context_switch()
       -> wrap_mmu_context() -> read_lock(&tasklist_lock)
CPU#1: sys_wait4() -> write_lock(&tasklist_lock) -> do_notify_parent()
       -> wake_up_parent() -> try_to_wake_up()
       -> spin_lock_irq(&parent_rq->lock)

The problem and fix were discussed on the linux-kernel list in July
2002:
http://marc.theaimsgroup.com/?l=linux-kernel&m=102629373819157&w=2
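For illustration (not part of the original report): a minimal
user-space sketch of the same ABBA lock-order inversion, with two
pthread mutexes standing in for the runqueue lock and tasklist_lock.
The names and the sleep are hypothetical; they only mirror the two
kernel paths above.
- - - - - - - - - - - - - - - - - - - -
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t tasklist_lock = PTHREAD_MUTEX_INITIALIZER;

/* "CPU#0" path: runqueue lock first, then tasklist_lock
 * (schedule() -> context_switch() -> wrap_mmu_context()). */
static void *cpu0(void *arg)
{
        pthread_mutex_lock(&rq_lock);
        usleep(1000);                 /* widen the race window */
        pthread_mutex_lock(&tasklist_lock);
        puts("cpu0: got both locks"); /* never reached once deadlocked */
        pthread_mutex_unlock(&tasklist_lock);
        pthread_mutex_unlock(&rq_lock);
        return NULL;
}

/* "CPU#1" path: tasklist_lock first, then runqueue lock
 * (sys_wait4() -> ... -> try_to_wake_up()). */
static void *cpu1(void *arg)
{
        pthread_mutex_lock(&tasklist_lock);
        usleep(1000);
        pthread_mutex_lock(&rq_lock);
        puts("cpu1: got both locks");
        pthread_mutex_unlock(&rq_lock);
        pthread_mutex_unlock(&tasklist_lock);
        return NULL;
}

int main(void)
{
        pthread_t t0, t1;

        pthread_create(&t0, NULL, cpu0, NULL);
        pthread_create(&t1, NULL, cpu1, NULL);
        pthread_join(t0, NULL);       /* typically hangs here */
        pthread_join(t1, NULL);
        return 0;
}
- - - - - - - - - - - - - - - - - - - -
Each thread holds one lock and waits forever for the other, so the
joins never return. A consistent lock order on both paths, or dropping
rq->lock before taking tasklist_lock (which is what the attached patch
arranges), removes the cycle.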
Created attachment 94811 [details]
O(1) scheduler patch for ia64

The cause of the problem is that context_switch() is done with the
runqueue lock held. The lock is held there only to prevent the running
task from being stolen by other CPUs, and the O(1) scheduler already
introduced switch_lock in task_struct for exactly that purpose. The
patch uses switch_lock to keep tasks from being stolen during a
context switch, so the runqueue lock can be dropped.

In detail, the patch implements arch-specific macros for ia64 (a sketch
follows this comment):
- prepare_arch_switch()
- finish_arch_switch()
- task_running()

task_running() formerly just compared runqueue->curr with the task.
With this patch, on IA-64, task_running() also checks whether the
task's switch_lock is locked. On other architectures the behaviour is
unchanged.

prepare_arch_switch() and finish_arch_switch() are also redefined for
IA-64. By default prepare_arch_switch() does nothing; on IA-64 it is
changed to lock switch_lock and then unlock runqueue->lock.
finish_arch_switch() unlocks runqueue->lock on architectures other
than IA-64; on IA-64 it is changed to unlock switch_lock instead.
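A minimal sketch of what those three macros could look like,
reconstructed from the description above and the 2.5.32-era O(1)
scheduler; this is not the literal RHEL3 patch text:
- - - - - - - - - - - - - - - - - - - -
#ifdef CONFIG_IA64

/* A task counts as running if it is rq->curr OR still mid-switch
 * (its switch_lock is held), so no other CPU can steal it. */
#define task_running(rq, p) \
        ((rq)->curr == (p) || spin_is_locked(&(p)->switch_lock))

/* Take the per-task switch_lock, then drop the runqueue lock, so
 * wrap_mmu_context() can take tasklist_lock without rq->lock held. */
#define prepare_arch_switch(rq, next)           \
do {                                            \
        spin_lock(&(next)->switch_lock);        \
        spin_unlock(&(rq)->lock);               \
} while (0)

/* The switch is complete; let other CPUs pick up prev. */
#define finish_arch_switch(rq, prev) \
        spin_unlock_irq(&(prev)->switch_lock)

#else /* all other architectures: behaviour unchanged */

#define task_running(rq, p)             ((rq)->curr == (p))
#define prepare_arch_switch(rq, next)   do { } while (0)
#define finish_arch_switch(rq, prev)    spin_unlock_irq(&(rq)->lock)

#endif
- - - - - - - - - - - - - - - - - - - -
With this scheme the runqueue lock is already released by the time
wrap_mmu_context() takes tasklist_lock, so the two locks are never held
in the inverted order and the deadlock cannot form.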
Though this bug was closed as CURRENTRELEASE, 2.4.21-20.EL (RHEL3 U3) does not include the patch above.
The patch in comment #1 has just been committed to the RHEL3 U4 patch pool this evening (in kernel version 2.4.21-20.8.EL).
An erratum has been issued which should resolve the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-550.html