Description of problem:
We have a set of multithreaded processes running on Red Hat Linux 9 machines. Under load, only a few threads inside certain processes appear to be scheduled fairly. Certain threads, in particular watchdog threads that sleep most of the time, are not scheduled in time and time out the entire system. This problem does not occur on a stock 2.4.x kernel: we tried the latest development 2.4 kernel and couldn't reproduce it. I tested this on a variety of systems: a single-processor Athlon, and single- and dual-processor Pentium 3 machines.

Version-Release number of selected component (if applicable):
2.4.20-27.9 or any 2.6.1 through 2.6.2 kernels

How reproducible:
Very easily, using the test program combined with the watchdog script.

Steps to Reproduce:
0. Compile cputest with gcc -pthread -D_REENTRANT -lm cputest.c -o cputest
1. Start the watchdog script in a shell
2. Start the test program: ./cputest 20 10000 1000 512
3. Watch the output of the watchdog script

Actual results:
An output that looks like
>>>>>>> delta = 3 Tue Jan 27 15:31:22 PST 2004
>>>>>>> delta = 4 Tue Jan 27 15:31:27 PST 2004
>>>>>>> delta = 4 Tue Jan 27 15:31:34 PST 2004
>>>>>>> delta = 6 Tue Jan 27 15:31:54 PST 2004
>>>>>>> delta = 6 Tue Jan 27 15:32:04 PST 2004
>>>>>>> delta = 20 Tue Jan 27 15:32:13 PST 2004
meaning, in that case, that the watchdog script "froze" for 20 seconds, ending at 15:32:13.

Expected results:
There should be no freezes like this, regardless of the load, since the watchdog script needs only a minimal number of cycles to execute every second.

Additional info:
Created attachment 97484
Source code of test program
Can you attach the watchdog script too please?
Created attachment 97498
Watchdog script: detects lags of more than 2 seconds

Forgot to post the watchdog script earlier.
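For readers without access to the attachment, here is a rough sketch of what a watchdog of this kind might look like. It is not the attached script: the function names, the THRESHOLD variable, and the optional iteration limit are all assumptions made for illustration, and it relies on `date +%s`, which GNU date provides but POSIX does not guarantee. The idea is only that the loop sleeps one second at a time and reports whenever the wakeup arrives more than 2 seconds after the previous one.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; NOT the script from attachment 97498.
# All names below are illustrative assumptions.

THRESHOLD=2   # report lags of more than this many seconds

# Print the number of seconds between two epoch timestamps.
delta_between() {
    echo $(( $2 - $1 ))
}

# Succeed when the given delta exceeds THRESHOLD.
is_lag() {
    [ "$1" -gt "$THRESHOLD" ]
}

# Sleep one second at a time and report every late wakeup.
# An optional first argument bounds the number of iterations
# (0 or omitted means run until interrupted).
watchdog_loop() {
    max=${1:-0}
    count=0
    last=$(date +%s)          # assumes GNU date
    while [ "$max" -eq 0 ] || [ "$count" -lt "$max" ]; do
        sleep 1
        now=$(date +%s)
        delta=$(delta_between "$last" "$now")
        if is_lag "$delta"; then
            echo ">>>>>>> delta = $delta $(date)"
        fi
        last=$now
        count=$(( count + 1 ))
    done
}
```

Calling `watchdog_loop` with no argument at the end of the script would run it until interrupted. On an idle machine it should print nothing, since each iteration needs only a sleep and two date calls; any reported delta therefore reflects time during which the script was not scheduled.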