Bug 105909 - O(1) scheduler deadlock
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: ia64 Linux
Priority: medium
Severity: high
Assigned To: Larry Woodman
QA Contact: Brian Brock
Reported: 2003-09-29 09:13 EDT by Jun'ichi NOMURA
Modified: 2007-11-30 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-12-20 15:54:44 EST

Attachments
O(1) scheduler patch for ia64 (590 bytes, patch)
2003-09-29 09:16 EDT, Jun'ichi NOMURA

Description Jun'ichi NOMURA 2003-09-29 09:13:21 EDT
Description of problem:

An SMP system may freeze, typically under frequent fork/exit activity.

This is a known problem that was already fixed in 2.5.32.
A patch will be attached.

This happens because the kernel acquires the runqueue lock and
tasklist_lock in two different orders.
One CPU may try to take the runqueue lock while holding tasklist_lock,
while another tries to take tasklist_lock while holding the runqueue lock.


Version-Release number of selected component (if applicable):

2.4.21-3.EL


How reproducible:

Difficult to reproduce.

Steps to Reproduce:
1. Run the following program on a machine with many CPUs.
- - - - - - - - - - - - - - - - - - - - 
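#include <stdlib.h>     /* system(), exit() */
#include <sys/wait.h>   /* wait() */
#include <unistd.h>     /* fork() */

#define NPROC 128

int
main(int argc, char** argv)
{
        int i, status;

        /*
         * Every process, children included, keeps running this loop, so
         * the process count grows on each pass until fork() starts failing.
         */
        for(i=1; i<NPROC; i++) {
                fork();
        }
        /* Each process then forks and reaps a short-lived child forever. */
        while (1) {
                if (fork()==0) {
                       system("exit");
                       exit(0);
                }
                else
                       wait(&status);
        }
}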
- - - - - - - - - - - - - - - - - - - - 

Actual results:


Expected results:


Additional info:

The IA-64 Linux kernel may call wrap_mmu_context during context_switch
to find an unused context number.
wrap_mmu_context holds tasklist_lock while searching through the task list.

On the other hand, some exit-related functions grab tasklist_lock
to remove a task from the list and then try to take the runqueue lock to
wake up the parent.

Because context_switch is called with the runqueue lock held, tasklist_lock
and the runqueue lock are acquired in two different orders.
This can cause a deadlock.

Example:
CPU#0:
schedule()
   -> spin_lock_irq(&rq->lock)
   -> context_switch()
      -> wrap_mmu_context()
         -> read_lock(&tasklist_lock)

CPU#1:
sys_wait4()
   -> write_lock(&tasklist_lock)
   -> do_notify_parent()
      -> wake_up_parent()
         -> try_to_wake_up()
            -> spin_lock_irq(&parent_rq->lock)

The problem and its fix were discussed on the linux-kernel list in July 2002.
http://marc.theaimsgroup.com/?l=linux-kernel&m=102629373819157&w=2
Comment 1 Jun'ichi NOMURA 2003-09-29 09:16:10 EDT
Created attachment 94811
O(1) scheduler patch for ia64

The cause of the problem is that context_switch runs with the runqueue
lock held. The lock is held only to prevent the running task from being
stolen by other CPUs.

The O(1) scheduler already provides switch_lock in task_struct for this purpose.

The patch uses switch_lock to prevent tasks from being stolen during a
context switch.

In detail, the patch implements arch-specific macros for ia64:
   - prepare_arch_switch()
   - finish_arch_switch()
   - task_running()

task_running() formerly just compared runqueue->curr with the task.
With this patch, on IA-64, task_running() also checks whether switch_lock
is locked. On other architectures, the behaviour is unchanged.

prepare_arch_switch() and finish_arch_switch() are also redefined for IA-64.
By default, prepare_arch_switch() does nothing.
On IA-64, prepare_arch_switch() is changed to lock switch_lock and unlock
runqueue->lock.
finish_arch_switch() unlocks runqueue->lock on architectures other than IA-64.
On IA-64, finish_arch_switch() is changed to unlock switch_lock.
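
The difference between the default hooks and the IA-64 ones can be sketched in user space as follows. This is not the attached patch: the struct layouts, the `generic_*` and `ia64_*` function names, and the use of plain flags in place of spinlocks are all hypothetical. It also assumes that prepare_arch_switch() locks the switch_lock of the incoming task, that finish_arch_switch() is the IA-64 hook that releases the switch_lock of the outgoing task, and that a running task's switch_lock stays locked until it has been switched out; the description above does not spell these details out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel types; a "lock" is just a flag. */
struct task { bool switch_lock; };
struct runqueue { bool lock; struct task *curr; };

/* Default hooks: rq->lock stays held across the whole switch. */
static void generic_prepare_switch(struct runqueue *rq, struct task *next)
{
    (void)rq;
    (void)next;                 /* does nothing */
}

static void generic_finish_switch(struct runqueue *rq, struct task *prev)
{
    (void)prev;
    rq->lock = false;           /* runqueue lock released only now */
}

static bool generic_task_running(struct runqueue *rq, struct task *p)
{
    return rq->curr == p;
}

/* IA-64 hooks as described: trade rq->lock for switch_lock. */
static void ia64_prepare_switch(struct runqueue *rq, struct task *next)
{
    assert(rq->lock);
    next->switch_lock = true;   /* assumption: incoming task's lock */
    rq->lock = false;           /* dropped before the switch itself */
}

static void ia64_finish_switch(struct runqueue *rq, struct task *prev)
{
    (void)rq;
    prev->switch_lock = false;  /* assumption: outgoing task's lock */
}

static bool ia64_task_running(struct runqueue *rq, struct task *p)
{
    return rq->curr == p || p->switch_lock;
}

/* Default path: rq->lock is still held in the middle of the switch. */
bool generic_switch_keeps_rq_lock(void)
{
    struct task prev = { false }, next = { false };
    struct runqueue rq = { true, &prev };

    rq.curr = &next;
    generic_prepare_switch(&rq, &next);
    bool held_mid_switch = rq.lock;
    generic_finish_switch(&rq, &prev);
    return held_mid_switch && !rq.lock && !generic_task_running(&rq, &prev);
}

/* IA-64 path: rq->lock is free mid-switch, yet prev still counts as running. */
bool ia64_switch_drops_rq_lock(void)
{
    struct task prev = { true };   /* locked when prev was switched in */
    struct task next = { false };
    struct runqueue rq = { true, &prev };

    rq.curr = &next;
    ia64_prepare_switch(&rq, &next);
    bool safe_mid_switch = !rq.lock && ia64_task_running(&rq, &prev);
    ia64_finish_switch(&rq, &prev);
    return safe_mid_switch && !ia64_task_running(&rq, &prev);
}
```

The point of the sketch is the middle of each switch: in the default path the runqueue lock is still held, whereas in the IA-64 path it is already free, and the outgoing task is protected only by task_running() seeing its switch_lock.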
Comment 2 Jun'ichi NOMURA 2004-09-06 03:05:29 EDT
Though this bug was closed as CURRENTRELEASE,
2.4.21-20.EL (RHEL3 U3) does not include the patch above.
Comment 4 Ernie Petrides 2004-09-20 02:50:57 EDT
The patch in comment #1 has just been committed to the RHEL3 U4
patch pool this evening (in kernel version 2.4.21-20.8.EL).
Comment 5 John Flanagan 2004-12-20 15:54:44 EST
An errata has been issued which should help the problem 
described in this bug report. This report is therefore being 
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files, 
please follow the link below. You may reopen this bug report 
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2004-550.html
