Bug 114987

Summary: Very poor task switching on average load
Product: [Retired] Red Hat Linux
Reporter: Nicolas Barry <boozai>
Component: kernel
Assignee: Ingo Molnar <mingo>
Status: CLOSED NEXTRELEASE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 9
CC: riel
Hardware: athlon   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-08-18 11:41:47 UTC
Attachments:
Source code of test program (flags: none)
Watchdog script: detects lags of more than 2 seconds (flags: none)

Description Nicolas Barry 2004-02-05 05:57:09 UTC
Description of problem:
We have a set of multithreaded processes running on Red Hat 9 machines.
When there is some load, it seems that only a few threads inside
certain processes get scheduled fairly. Other threads, in particular
watchdog threads that sleep most of the time, are delayed long enough
to time out the entire system.
This problem does not occur on a stock 2.4.x kernel: we tried the
latest development 2.4 kernel and could not reproduce the problem.

I tested this problem on a variety of systems: a single-processor
Athlon, and single- and dual-processor Pentium 3 machines.

Version-Release number of selected component (if applicable):
2.4.20-27.9 or any 2.6.1 through 2.6.2 kernels

How reproducible:
Very easily using the test program combined with the watchdog script.

Steps to Reproduce:
0. Compile cputest with gcc -pthread -D_REENTRANT -lm cputest.c -o cputest
1. Start the watchdog script in a shell
2. Start the test program: ./cputest 20 10000 1000 512 
3. Watch the output of the watchdog script
  
Actual results:
An output that looks like
>>>>>>> delta = 3 Tue Jan 27 15:31:22 PST 2004
>>>>>>> delta = 4 Tue Jan 27 15:31:27 PST 2004
>>>>>>> delta = 4 Tue Jan 27 15:31:34 PST 2004
>>>>>>> delta = 6 Tue Jan 27 15:31:54 PST 2004
>>>>>>> delta = 6 Tue Jan 27 15:32:04 PST 2004
>>>>>>> delta = 20 Tue Jan 27 15:32:13 PST 2004
meaning, in this case, that the watchdog script "froze" for 20
seconds, with the freeze ending at 15:32:13.

Expected results:
There should be no freezes like this, regardless of the load, as the
watchdog script needs only a minimal number of cycles to execute every
second.
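
The attached watchdog script is not reproduced in this report. As a rough illustration only, a script of this kind might look like the sketch below. It assumes the script sleeps one second per iteration, compares wall-clock timestamps, and prints a line in the format shown under "Actual results" whenever the gap exceeds 2 seconds. The names check_lag, watchdog and THRESHOLD are hypothetical and are not taken from the attachment; `date +%s` is assumed to be available (it is a GNU date extension).

```shell
#!/bin/sh
# Hypothetical watchdog sketch; NOT the script attached to this bug.
# Assumption: a gap of more than THRESHOLD seconds between wakeups is a "lag".
THRESHOLD=${THRESHOLD:-2}

# Print a report line if the gap between two epoch timestamps exceeds THRESHOLD.
check_lag() {
    prev=$1
    now=$2
    delta=$((now - prev))
    if [ "$delta" -gt "$THRESHOLD" ]; then
        echo ">>>>>>> delta = $delta $(date)"
    fi
}

# Sleep one second per iteration and check how much wall-clock time passed.
watchdog() {
    prev=$(date +%s)    # seconds since the epoch (GNU date extension)
    while :; do
        sleep 1
        now=$(date +%s)
        check_lag "$prev" "$now"
        prev=$now
    done
}

# Start the loop only when invoked with "run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    watchdog
fi
```

Saved under a hypothetical name such as watchdog.sh, it would be started with `sh watchdog.sh run` in one shell before launching cputest in another. On an unloaded machine it should print nothing, since each iteration normally takes about one second.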

Additional info:

Comment 1 Nicolas Barry 2004-02-05 05:58:48 UTC
Created attachment 97484
Source code of test program

Comment 2 Dave Jones 2004-02-05 18:41:47 UTC
Can you attach the watchdog script too please?


Comment 3 Nicolas Barry 2004-02-05 20:05:27 UTC
Created attachment 97498
Watchdog script : detects lags of more than 2 seconds

Forgot to post the watchdog script earlier