From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.2) Gecko/20021120 Netscape/7.01

Description of problem:
We always run low-priority jobs in the background. The scheduler gets confused and ends up running jobs with nice +19 at almost top priority.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Launch 2 jobs with default priority.
2. Launch 4 jobs with nice +19.

Actual Results:
213 processes: 203 sleeping, 8 running, 2 zombie, 0 stopped
CPU0 states: 91.2% user, 8.4% system, 2.2% nice, 0.0% idle
CPU1 states: 99.0% user, 0.1% system, 99.4% nice, 0.0% idle
Mem: 513100K av, 504868K used, 8232K free, 0K shrd, 28092K buff
Swap: 2096472K av, 186560K used, 1909912K free, 223376K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
26495 darwin    25   0 48988  47M   436 R    47.1  9.5   8:30 darwin
26464 darwin    25   0 50524  49M   436 R    46.9  9.8   8:51 darwin
26614 gonnet    39  19  2416 2416   300 R N  33.4  0.4   4:08 Switch4.LI
26735 gonnet    39  19  2396 2396   300 R N  33.4  0.4   2:02 Switch4.LI
24154 gonnet    39  19  1548 1548   304 R N  33.2  0.3  21:50 Switch4.LI
25323 gonnet    39  19  2400 2400   304 R N   2.7  0.4   9:52 Switch4.LI
...

Notice that the first two jobs have nice 0 and between them get 100% of one CPU, while the other four jobs all have nice 19 and get 100% of the other CPU.

Expected Results:
Jobs should run at their assigned priority.

Additional info:
Kernel 2.4.18-19.7.xsmp (athlon) on a Tyan Tiger MPX dual Athlon motherboard.
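For anyone trying to reproduce without our binaries: the background jobs are pure CPU burners, so a trivial stand-in like the sketch below (a hypothetical reproducer, not the actual darwin/Switch4.LI code) started twice at the default priority and four times under "nice -n 19" should exercise the same path; the NI and %CPU columns in top then show the imbalance.

/* burn.c - minimal CPU-bound stand-in for the background jobs
 * (hypothetical reproducer, not the real darwin/Switch4.LI programs).
 * Build: gcc -O2 -o burn burn.c
 * Run:   ./burn &              (default priority, two instances)
 *        nice -n 19 ./burn &   (niced, four instances)
 */
int main(void)
{
    volatile unsigned long x = 0;

    for (;;)        /* spin forever, purely CPU bound, never sleeps */
        x++;

    return 0;       /* never reached */
}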
We are seeing something similar.

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM    TIME COMMAND
 2175 maw       39  19 47864  46M  2276 R N  99.2  4.6    6:51 cloudy.exe
15198 derosa    25   0  7140 7140  3532 R    50.2  0.6  116:51 xspec
 1748 swa       25   0 21304  14M  1288 R    49.7  1.4    8:05 cosmomc

Here two nice 0 jobs are sharing one CPU while a nice 19 job has the other CPU to itself. Surely this is a scheduling bug?

Kernel version is 2.4.18-24.7.xsmp for Athlon on Red Hat 7.3, Tyan Tiger MP with 2x Athlon MP processors.
I did an upgrade to 2.4.18-27.7.xsmp; the problem remains:

11:42am up 2 days, 2:01, 33 users, load average: 3.01, 3.29, 3.16
203 processes: 198 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 96.2% user, 3.3% system, 11.3% nice, 0.0% idle
CPU1 states: 95.0% user, 4.2% system, 94.1% nice, 0.0% idle
Mem: 2064336K av, 1933252K used, 131084K free, 0K shrd, 110096K buff
Swap: 2096472K av, 0K used, 2096472K free, 1236936K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
10364 gonnet    39  19  2428 2428   304 R N  95.0  0.1   1:29 Switch4.LI
10405 gonnet    25   0  322M 322M   596 R    84.3 15.9   2:18 mapleTTY
10685 gonnet    39  19  2416 2416   296 R N  12.2  0.1   0:02 Switch4.LI
 1917 root       5 -10  308M  51M  5116 S <   4.1  2.5 154:25 X
20918 gonnet    15   0 16604  16M 14804 R     2.1  0.8   0:14 kdeinit
10678 gonnet    15   0  1212 1212   916 R     1.1  0.0   0:00 top
20857 gonnet    15   0 13768  13M 13012 S     0.1  0.6   4:15 kdeinit
20891 gonnet    15   0 18060  17M 15264 S     0.1  0.8   1:44 kdeinit
20902 gonnet    15   0 16652  16M 14804 S     0.1  0.8   0:15 kdeinit
    1 root      15   0   480  480   420 S     0.0  0.0   0:07 init

You can see that the top process is at nice 19, while the one at nice 0 does not get as much CPU. Under normal circumstances, the nice 0 job should get 100% of a CPU and the two nice 19 jobs 50% each (a small sanity check is sketched below). Could someone look into this, please? Thanks.
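To rule out a mis-set priority on our side rather than a scheduler problem, something like the small helper below (a hypothetical check, not part of the original report) confirms from userspace that the PIDs above really carry the nice values top shows.

/* checknice.c - print the nice value of each PID given on the command
 * line (hypothetical triage helper, not shipped with this report).
 * Build: gcc -O2 -o checknice checknice.c
 * Usage: ./checknice 10364 10405 10685
 */
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/time.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++) {
        int pid = atoi(argv[i]);
        int prio;

        errno = 0;      /* getpriority() can legitimately return -1 */
        prio = getpriority(PRIO_PROCESS, pid);
        if (prio == -1 && errno != 0)
            perror("getpriority");
        else
            printf("pid %d: nice %d\n", pid, prio);
    }
    return 0;
}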
The problem is gone in 2.4.20-13.7smp; I'm closing this bug. Thanks!