From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312

Description of problem:
Using the rawhide kernel above (and with the errata RedHat 7.3 kernel 2.4.18-27.7.x), CPU-bound nice +19 jobs are given too much cpu with respect to CPU-bound nice 0 jobs. For instance on a 433MHz PII (also tried 1400MHz Athlon & P4 2700MHz)

  5:29pm  up 35 min,  3 users,  load average: 2.14, 1.80, 0.99
69 processes: 65 sleeping, 4 running, 0 zombie, 0 stopped
CPU states: 90.6% user,  0.1% system,  9.1% nice,  0.0% idle
Mem:   384772K av,   76876K used,  307896K free,       0K shrd,    7428K buff
Swap: 1028152K av,       0K used, 1028152K free                   25948K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1497 jss       25   0  1000 1000   380 R    90.4  0.2   9:49 cpufloattest
 1635 jss       39  19   956  956   340 R N   9.1  0.2   0:51 cpufloattest

So 1/10 of the power goes to the nice 19 job, even though they are both cpu bound. There's no way of giving the nice 19 job lower priority (AFAIK).

On a Compaq Tru64 4.1B system:

load averages:  2.30,  2.23,  2.19    17:31:14
48 processes: 3 running, 14 sleeping, 31 idle
CPU states: 99.5% user,  0.0% nice,  0.4% system,  0.0% idle
Memory: Real: 77M/235M act/tot  Virtual: 9M/1223M use/tot  Free: 118M

   PID USERNAME PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
211277 jss       59    0 2120K  188K run     6:23 99.70% a.out
203876 jmb65     63   19   12M 9322K run   316:20  0.00% tree

The nice 0 job gets 99.7% of the CPU - a far better deal.

Can the scheduler be fixed to lower the priority of batch nice +19 jobs? Otherwise a patch to implement something like the SCHED_BATCH patch would be useful. (Also nice 10 jobs get 1/3 of the cpu - I'd suggest that's too much).

Version-Release number of selected component (if applicable):
kernel-2.4.20-8.1

How reproducible:
Always

Steps to Reproduce:
1. Run nice 0 foreground job
2. Run nice 19 background job
3. 19 job gets 6-10% of the CPU

Additional info:
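For anyone who wants to repeat the measurement without the original cpufloattest program (its source is not shown in this report), the steps above can be approximated with a rough sketch like the one below. It is a stand-in, not the reporter's test: the names `burn_cpu`, `spawn_burner`, `cpu_shares` and `run_test` are made up for illustration, and it assumes a Linux system with Python 3, since it pins both jobs to one CPU with `os.sched_setaffinity` so that they actually compete.

```python
import os
import time


def burn_cpu(seconds):
    """Busy-loop on floating-point work for `seconds` of wall-clock time."""
    deadline = time.monotonic() + seconds
    x = 0.0
    while time.monotonic() < deadline:
        x = x * 1.0000001 + 1.0


def spawn_burner(nice_increment, seconds, cpu):
    """Fork a CPU-bound child pinned to `cpu` with the given nice increment."""
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.sched_setaffinity(0, {cpu})  # Linux-only: force both jobs onto one CPU
            os.nice(nice_increment)         # 0 leaves the priority unchanged
            burn_cpu(seconds)
        finally:
            os._exit(0)
    return pid


def cpu_shares(cpu_times):
    """Convert {label: cpu_seconds} into {label: percent of the total}."""
    total = sum(cpu_times.values())
    if total == 0:
        return {label: 0.0 for label in cpu_times}
    return {label: 100.0 * t / total for label, t in cpu_times.items()}


def run_test(seconds=5.0):
    """Run a nice 0 and a nice 19 job side by side and return their CPU shares."""
    cpu = min(os.sched_getaffinity(0))
    pids = {spawn_burner(n, seconds, cpu): n for n in (0, 19)}
    cpu_times = {}
    for pid, nice_value in pids.items():
        # wait4 returns the child's resource usage (user + system CPU time)
        _, _, usage = os.wait4(pid, 0)
        cpu_times[nice_value] = usage.ru_utime + usage.ru_stime
    return cpu_shares(cpu_times)


if __name__ == "__main__":
    for nice_value, share in sorted(run_test().items()):
        print(f"nice {nice_value:2d}: {share:5.1f}% of the CPU time used by both jobs")
```

The share reported for the nice 19 job is the figure to compare against the 6-10% quoted in step 3. The numbers will depend on the kernel and scheduler in use, so no expected output is given here.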
I tend to agree - processes niced to 19 are the ones that say "if you can run, fine..." I run several things at 19 (time updates, random signature generation, etc.) that do not need any sort of priority (certainly not 9%). This should probably involve Ingo Molnar (author of the O(1) scheduler for the kernel, who also works for Red Hat). Upstream? It does raise a good point for the central kernel maintainers - comparing scheduling behaviour against commercial UNIX kernels would be an interesting way to see how Linux fares.
This situation is getting worse. Using 2.4.20-13.7 on 7.3 gives 15% of the CPU to the nice 19 process on a P4. It would be great if RH were to include Ingo Molnar's SCHED_BATCH feature, so that background jobs run only when no other jobs are running.
I've contacted Ingo Molnar and he says he'll look into the nice 19 issue.