| Summary: | uneven cpuset scheduling |
|---|---|
| Product: | Red Hat Enterprise Linux 5 |
| Component: | kernel |
| Version: | 5.6 |
| Hardware: | x86_64 |
| OS: | Linux |
| Status: | CLOSED NOTABUG |
| Severity: | high |
| Priority: | high |
| Reporter: | Travis Gummels <tgummels> |
| Assignee: | Peter Zijlstra <pzijlstr> |
| QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| CC: | james.brown, lwang, mgrondona, mingo |
| Target Milestone: | rc |
| Target Release: | --- |
| Doc Type: | Bug Fix |
| Last Closed: | 2011-08-23 16:09:34 UTC |
=== In Red Hat Customer Portal Case 00412338 ===

--- Comment by Woodard, Ben on 1/31/2011 1:12 PM ---

The problem doesn't appear with RHEL6:

```
top - 13:12:00 up 1:04, 3 users, load average: 9.76, 5.52, 2.29
Tasks: 458 total, 11 running, 447 sleeping, 0 stopped, 0 zombie
Cpu0  : 88.7%us, 10.9%sy, 0.0%ni,  0.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1  :  0.4%us,  0.4%sy, 0.0%ni, 99.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  20459104k total,  1469176k used, 18989928k free,    24920k buffers
Swap: 22691832k total,        0k used, 22691832k free,   715064k cached

  PID USER  PR NI  VIRT  RES SHR S  %CPU %MEM   TIME+ COMMAND
 6843 root  20  0  3828  352 276 R 100.0  0.0 4:21.00 busy
 6846 root  20  0  3828  352 276 R 100.0  0.0 4:21.00 busy
 6847 root  20  0  3828  352 276 R 100.0  0.0 3:27.40 busy
 6850 root  20  0  3828  352 276 R 100.0  0.0 4:21.00 busy
 6851 root  20  0  3828  348 276 R 100.0  0.0 4:21.00 busy
 6852 root  20  0  3828  348 276 R 100.0  0.0 4:20.96 busy
 6842 root  20  0  3828  352 276 R  99.8  0.0 4:20.99 busy
 6844 root  20  0  3828  352 276 R  99.8  0.0 3:27.39 busy
 6853 root  20  0  3828  352 276 R  99.8  0.0 4:20.96 busy
 6860 root  20  0  430m 129m 12m R  99.8  0.6 0:28.77 yum
```

=== In Red Hat Customer Portal Case 00412338 ===

--- Comment by Woodard, Ben on 1/28/2011 3:37 PM ---

This is with 2.6.18-238.el5. A way to view this is to split out the cores in top; then you see:

```
Tasks: 401 total, 10 running, 391 sleeping, 0 stopped, 0 zombie
Cpu0  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 :  0.3%us,  0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
```

Note how core 6 is unused.

=== In Red Hat Customer Portal Case 00412338 ===

--- Comment by Woodard, Ben on 1/28/2011 5:09 PM ---

This is with the backport of the patch from
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=908a7c1b9b80d06708177432020c80d147754691;hp=cd79007634854f9e936e2369890f2512f94b8759
applied to the kernel:

```
top - 17:06:40 up 7 min, 2 users, load average: 7.30, 2.62, 0.95
Tasks: 401 total, 10 running, 391 sleeping, 0 stopped, 0 zombie
Cpu0  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2  :  0.7%us,  0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 :  0.7%us,  0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu16 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  20536192k total,   610488k used, 19925704k free,    31744k buffers
Swap: 22577144k total,        0k used, 22577144k free,   378156k cached

  PID USER  PR NI  VIRT  RES SHR S  %CPU %MEM   TIME+ COMMAND
 6312 root  25  0  3664  324 256 R 100.1  0.0 1:40.78 busy
 6313 root  25  0  3664  324 256 R 100.1  0.0 1:40.78 busy
 6316 root  25  0  3664  324 256 R 100.1  0.0 1:40.78 busy
 6318 root  25  0  3664  320 256 R 100.1  0.0 1:40.78 busy
 6321 root  25  0  3664  324 256 R 100.1  0.0 1:40.77 busy
 6324 root  25  0  3664  320 256 R 100.1  0.0 1:40.77 busy
 6323 root  25  0  3664  320 256 R  99.8  0.0 1:40.76 busy
 6322 root  25  0  3664  324 256 R  50.2  0.0 0:50.38 busy
 6319 root  25  0  3664  324 256 R  49.9  0.0 0:50.39 busy
 6341 root  15  0 13016 1540 944 R   0.0  0.0 0:00.23 top
```

Note how one of the processors is unused, and at the end of the process
list there are two busy processes getting 50% CPU time instead of 100%.

Here is a detailed description of the case we are hitting on
our quad-core, quad-socket Opterons when we run with a cpuset or CPU
affinity set to CPUs 3-11. Note that these CPUs extend across three
NUMA nodes (CPUs 0-3, 4-7, and 8-11), and that of the first node only
CPU3 is in the cpus_allowed of the running tasks.
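Since the imbalance shows up with plain CPU affinity as well, a minimal C
reproducer can skip cpusets entirely. This is a hypothetical sketch (the
actual testing used cpusets and the pdsh script attached below): it pins
the process to CPUs 3-11 with sched_setaffinity() and forks nine spinners,
which inherit the affinity mask.

```c
/* busy-affinity.c -- hedged sketch of an affinity-based reproducer.
 * Build: gcc -o busy-affinity busy-affinity.c
 * Watch the per-cpu distribution in top; stop with Ctrl-C.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	cpu_set_t mask;
	int cpu, i;

	CPU_ZERO(&mask);
	for (cpu = 3; cpu <= 11; cpu++)		/* CPUs 3-11, as in the report */
		CPU_SET(cpu, &mask);

	if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
		perror("sched_setaffinity");
		return 1;
	}

	for (i = 0; i < 9; i++)			/* NCPUS=9 busy loops */
		if (fork() == 0)
			for (;;)		/* spin; child inherits the mask */
				;

	for (i = 0; i < 9; i++)			/* children run until interrupted */
		wait(NULL);
	return 0;
}
```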
This is my current understanding, and it may be anywhere from slightly
flawed to way off base; apologies for any mistakes.

We end up in a state where the scheduler is running 3 tasks on each of
the groups 0-3, 4-7, and 8-11. In this specific case, CPU4 and CPU8
are idle, while CPU3 has 3 tasks running on it, each getting 33% CPU.
Looking at find_busiest_group(), when called for sched domain 0-15 by
CPU4, we have the following values in the do/while loop for each group:

```
group 4-7 (this_group)  nr_running=3  avg_load=384  group->cpu_power=128
group 8-11              nr_running=3  avg_load=384  group->cpu_power=128
group 12-15             nr_running=0  avg_load=0    group->cpu_power=128
group 0-3               nr_running=3  avg_load=384  group->cpu_power=128
```

(Group 12-15 is idle, since none of its CPUs are in the tasks' cpus_allowed.)
Note that for this case, since the avg_load of all three busy groups
is the same, find_busiest_group() will pick 8-11 as the busiest group,
since it is the first group for which both halves of this condition hold:

```c
(avg_load > max_load && sum_nr_running > group_capacity)
```
This is because sched_mc_power_savings is not set, so group_capacity == 1,
meaning that the scheduler will try to spread tasks around groups instead
of filling one group to its real capacity before migrating tasks to the
next group.
After picking group 8-11 as busiest, however, the scheduler rightly does
nothing, since the load on 8-11 is the same as this group's load
(3 tasks total):

```c
if (!busiest || this_load >= max_load || busiest_nr_running == 0)
	goto out_balanced;
```
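To make the walk-through concrete, here is a small userspace replay of
that decision using the numbers above. It is a sketch, not kernel code:
the variable names are borrowed from the quoted source, and
SCHED_LOAD_SCALE == 128 (one nice-0 task's weight) is an assumption about
this kernel vintage.

```c
/* replay.c -- userspace sketch of the find_busiest_group() decision. */
#include <stdio.h>

#define SCHED_LOAD_SCALE 128UL	/* assumed: weight of one nice-0 task */

struct group {
	const char *name;
	unsigned long nr_running, avg_load, cpu_power;
};

int main(void)
{
	/* groups in the order the do/while loop visits them from CPU4 */
	struct group groups[] = {
		{ "4-7 (this)", 3, 384, 128 },	/* avg_load = 3*128*128/128 */
		{ "8-11",       3, 384, 128 },
		{ "12-15",      0,   0, 128 },
		{ "0-3",        3, 384, 128 },	/* all 3 tasks stacked on CPU3 */
	};
	unsigned long this_load = 0, max_load = 0, busiest_nr_running = 0;
	const char *busiest = NULL;
	int i;

	for (i = 0; i < 4; i++) {
		/* sched_mc_power_savings off: capacity = 128/128 = 1 */
		unsigned long group_capacity =
			groups[i].cpu_power / SCHED_LOAD_SCALE;

		if (i == 0) {
			this_load = groups[i].avg_load;
		} else if (groups[i].avg_load > max_load &&
			   groups[i].nr_running > group_capacity) {
			max_load = groups[i].avg_load;
			busiest = groups[i].name;
			busiest_nr_running = groups[i].nr_running;
		}
	}

	/* 8-11 wins only because it is visited first while max_load == 0 */
	printf("busiest = %s, max_load = %lu\n", busiest, max_load);

	if (!busiest || this_load >= max_load || busiest_nr_running == 0)
		printf("out_balanced: 384 >= 384, nothing is migrated\n");
	return 0;
}
```

Running it prints `busiest = 8-11` followed by the out_balanced bail-out,
matching the behaviour described above.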
There is a patch that went in upstream that ostensibly tries to rectify
the situation where there is one unbalanced group due to the cpus_allowed
mask:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.37.y.git;a=commit;h=908a7c1b9b80d06708177432020c80d147754691
This patch detects a group that is likely imbalanced by checking the
difference between the most loaded and least loaded CPU in that group.
If the difference is greater than SCHED_LOAD_SCALE (one task's load),
the group likely cannot balance itself, so the patch sets a group_imb
flag that is used in addition to the sum_nr_running > group_capacity
test to find the busiest group, as seen in this hunk:
```diff
@@ -2519,11 +2530,12 @@ find_busiest_group(struct sched_domain *
 			this_nr_running = sum_nr_running;
 			this_load_per_task = sum_weighted_load;
 		} else if (avg_load > max_load &&
-			   sum_nr_running > group_capacity) {
+			   (sum_nr_running > group_capacity || __group_imb)) {
 			max_load = avg_load;
 			busiest = group;
 			busiest_nr_running = sum_nr_running;
 			busiest_load_per_task = sum_weighted_load;
+			group_imb = __group_imb;
 		}
 #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
```
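For context, the detection half of that commit (not shown in the hunk)
works roughly as follows. This is a condensed paraphrase of the upstream
patch, not compilable on its own:

```c
/* Condensed paraphrase of the detection logic added by commit 908a7c1b:
 * while summing a remote group's load, also track the per-cpu extremes.
 */
max_cpu_load = 0;
min_cpu_load = ~0UL;
for_each_cpu_mask(i, group->cpumask) {
	load = source_load(i, load_idx);
	if (load > max_cpu_load)
		max_cpu_load = load;
	if (min_cpu_load > load)
		min_cpu_load = load;
	/* ... existing load accumulation continues here ... */
}

/* More than one task's worth of spread inside the group suggests it
 * cannot balance itself (e.g. cpus_allowed pins tasks to one CPU).
 * In the 0-3 group here: 384 - 0 > 128, so __group_imb is set.
 */
if ((max_cpu_load - min_cpu_load) > SCHED_LOAD_SCALE)
	__group_imb = 1;
```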
However, this patch only works when sched_mc_power_savings is set,
because otherwise the group_capacity for all groups is 1, and
sum_nr_running is > group_capacity for all 3 loaded groups, so
group 8-11 is again detected as the "busiest" group.
In fact, even if group 0-3 _is_ detected as the busiest group,
find_busiest_group() will still return out_balanced for this case,
since in the conditional:

```c
if (!busiest || this_load >= max_load || busiest_nr_running == 0)
	goto out_balanced;
```

this_load >= max_load is always true, since the load of all three
busy groups is the same (3 tasks).
What it seems like we need is a way to artificially increase the "load"
of the 0-3 group with CPU3 oversubscribed.
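Purely as an illustration of that idea (a hypothetical sketch, not code
from any shipped kernel), the spread already computed for __group_imb
could be folded back into the group's reported load:

```c
/* HYPOTHETICAL -- not from any shipped kernel.  Inflate the reported
 * load of a group flagged imbalanced by its internal spread, so a
 * group with tasks stacked on one CPU outbids groups that are merely
 * full.
 *
 * With the numbers above, group 0-3 (per-cpu loads 384,0,0,0) would
 * report avg_load = 384 + (384 - 0) = 768 and win the busiest
 * comparison; this_load (384) >= max_load (768) then fails, so the
 * balancer would finally try to pull tasks toward the idle CPUs.
 */
if (__group_imb)
	avg_load += max_cpu_load - min_cpu_load;
```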
Created attachment 478143 [details]
package needed for the reproducer
Created attachment 478144 [details]
second package needed for the reproducer
Created attachment 478145 [details]
reproducer script
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7, and Red Hat does not plan to fix this issue in the currently developed update. Contact your manager or support representative in case you need to escalate this bug.

Okay, per comment #26, closing this issue as won't fix.
Description of problem:

LLNL is using cpusets to confine a process to a specific set of CPUs. For some numbers of CPUs per set the behaviour is as expected, with one process per CPU. For other numbers of CPUs per set they are seeing 1 or 2 of the CPUs idling while a few others are oversubscribed. This issue is negatively impacting LLNL production.

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux Server release 5.6 (Tikanga), kernel 2.6.18-238.el5 on x86_64

How reproducible:

1) Install pdsh, required for the reproducer script below.
2) Start the reproducer script.
3) Monitor with top.

Reproducer script (/tmp/cpuset-test.sh on hype149):

```bash
#!/bin/bash
# Tweak CPUS and NCPUS to change outcome.
CPUS=3-11
NCPUS=9
CPUSETDIR=/dev/cpuset
TESTID=cpuset-test-$$
CPUSET=${CPUSETDIR}/${TESTID}

cleanup () {
    [ -d $CPUSET ] || return
    #
    # Jump back out of cpuset
    #
    echo $$ > /dev/cpuset/tasks
    rmdir $CPUSET
}

die () { echo "cpuset-test: $@" >&2; cleanup; exit 1; }

mkdir $CPUSET || die "Failed to create cpuset at $CPUSET"
echo $CPUS > $CPUSET/cpus || die "Failed to populate cpuset"
cat /dev/cpuset/mems > $CPUSET/mems
echo $$ > $CPUSET/tasks || die "Failed to add myself to $TESTID"

#
# Compile a busy loop:
#
echo "int main (int ac, char **av) { while (1) {}; }" > /tmp/busy.c
gcc -o /tmp/busy /tmp/busy.c
[ -f /tmp/busy ] || die "failed to create busy loop program"

#
# Execute NCPUS busy loops in parallel:
#
pdsh -f$NCPUS -w "[$CPUS]" -Rexec /tmp/busy

cleanup
# End Reproducer Script
```

Actual results:

For some numbers of CPUs per set, the processes are not evenly distributed:

```
top - 17:06:40 up 7 min, 2 users, load average: 7.30, 2.62, 0.95
Tasks: 401 total, 10 running, 391 sleeping, 0 stopped, 0 zombie
Cpu0  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2  :  0.7%us,  0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8  :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9  :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 :100.0%us,  0.0%sy, 0.0%ni,  0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 :  0.7%us,  0.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu16 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 :  0.0%us,  0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  20536192k total,   610488k used, 19925704k free,    31744k buffers
Swap: 22577144k total,        0k used, 22577144k free,   378156k cached

  PID USER  PR NI  VIRT  RES SHR S  %CPU %MEM   TIME+ COMMAND
 6312 root  25  0  3664  324 256 R 100.1  0.0 1:40.78 busy
 6313 root  25  0  3664  324 256 R 100.1  0.0 1:40.78 busy
 6316 root  25  0  3664  324 256 R 100.1  0.0 1:40.78 busy
 6318 root  25  0  3664  320 256 R 100.1  0.0 1:40.78 busy
 6321 root  25  0  3664  324 256 R 100.1  0.0 1:40.77 busy
 6324 root  25  0  3664  320 256 R 100.1  0.0 1:40.77 busy
 6323 root  25  0  3664  320 256 R  99.8  0.0 1:40.76 busy
 6322 root  25  0  3664  324 256 R  50.2  0.0 0:50.38 busy
 6319 root  25  0  3664  324 256 R  49.9  0.0 0:50.39 busy
 6341 root  15  0 13016 1540 944 R   0.0  0.0 0:00.23 top
```

Expected results:

Processes are evenly distributed across the CPUs in the cpuset.

Additional info: