Bug 752715 - unbalance CPU load in CPU cgroup
Summary: unbalance CPU load in CPU cgroup
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Linda Wang
QA Contact: Kernel General QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-10 08:23 UTC by colyli
Modified: 2016-08-26 01:19 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-26 01:19:58 UTC
Target Upstream Version:



Description colyli 2011-11-10 08:23:19 UTC
Description of problem:
While testing the CPU cgroup, we observed that load is not balanced across multiple cores.

Version-Release number of selected component (if applicable):
RHEL6.1 with kernel-2.6.32-131.12.1

How reproducible:
Code 1: a CPU busy loop, bbb.c:
/* Spin forever to keep one CPU fully busy. */
int main(void)
{
    while (1)
    {
    }
    return 0; /* not reached */
}
Code 2: a shell script, test.sh:
#!/bin/sh
# Create 32 cgroups with equal cpu.shares and run one busy loop in each.
count=0
pids=""
while [ "$count" -lt 32 ]
do
        mkdir -p "/cgroup/$count"
        echo 1024 > "/cgroup/$count/cpu.shares"
        ./bbb &
        pid=$!
        echo "$pid" > "/cgroup/$count/tasks"
        pids="$pids $pid"
        count=$((count + 1))
done
# \$pid is escaped so that it is expanded when show.sh runs, not here.
echo "for pid in $pids; do grep sum_exec_runtime /proc/\$pid/sched; done" > show.sh
watch -n1 sh show.sh


Steps to Reproduce:
1. mount -t cgroup -o cpu cpuctl /cgroup
2. Run the script test.sh.
3. Observe the processes with "top -d 1".
  
Actual results:
Some processes occupy 99% CPU time while others get only 33%. We even observe 2 or 3 processes occupying more than 90% CPU time all the time.
In the long term, CPU time averages out across all groups, but processes are migrated between different CPU cores to achieve this, which causes response delays in my test environment.
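
One way to put numbers on the imbalance and migration described above is to sample /proc/PID/sched twice and compare. The sketch below is not part of the original test case: sched_field and sample are hypothetical helper names, and it assumes that se.sum_exec_runtime is reported in milliseconds and that se.nr_migrations is present in /proc/PID/sched on the kernel under test (it may depend on the kernel configuration).

```shell
#!/bin/sh
# sched_field FILE FIELD: print the value of a "name : value" line
# from a /proc/PID/sched-style file.
sched_field() {
    awk -v f="$2" '$1 == f { print $NF; exit }' "$1"
}

# sample PID SECONDS: print PID, the share of one CPU it used during the
# interval, and how many times it was migrated during the interval.
# Assumption: se.sum_exec_runtime is in milliseconds.
sample() {
    s_pid=$1
    s_secs=$2
    r0=$(sched_field "/proc/$s_pid/sched" se.sum_exec_runtime)
    m0=$(sched_field "/proc/$s_pid/sched" se.nr_migrations)
    sleep "$s_secs"
    r1=$(sched_field "/proc/$s_pid/sched" se.sum_exec_runtime)
    m1=$(sched_field "/proc/$s_pid/sched" se.nr_migrations)
    awk -v p="$s_pid" -v r0="$r0" -v r1="$r1" -v m0="$m0" -v m1="$m1" \
        -v s="$s_secs" 'BEGIN {
            printf "%s cpu=%.1f%% migrations=%d\n",
                   p, (r1 - r0) / (s * 1000) * 100, m1 - m0
        }'
}

# Sample every PID given on the command line for one second each.
for p in "$@"; do
    sample "$p" 1
done
```

Running it with the PIDs started by test.sh as arguments gives one line per process, which makes it easier to compare the shares than reading raw sum_exec_runtime values.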

Expected results:
All groups should share CPU time equally. For example, when running 32 while(1) busy-loop processes on a 16-core machine, each process should get ~50% CPU time, without excessive process migration between cores.
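
The ~50% figure follows from dividing the cores evenly among equally weighted, always-runnable tasks. A minimal sketch of that arithmetic (expected_share is a hypothetical helper, not part of the test case, and it ignores scheduler overhead and any other load on the machine):

```shell
#!/bin/sh
# expected_share CORES TASKS: ideal per-task CPU percentage when TASKS
# always-runnable tasks with equal cpu.shares compete for CORES cores.
expected_share() {
    cores=$1
    tasks=$2
    if [ "$tasks" -le "$cores" ]; then
        echo 100    # every task can have a core to itself
    else
        echo $((100 * cores / tasks))   # integer percentage
    fi
}

expected_share 16 32    # prints 50
```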


Additional info:
1. On non-NUMA machines I do not observe this issue. It is easily reproduced on NUMA x86_64 machines.
2. I also ran the test case on the vanilla 3.1.0-rc9 kernel; the problem is *FIXED* in the upstream kernel.
3. We are now trying to identify which patch fixes the problem.
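
Since the issue appears only on NUMA machines, it may help to record how many NUMA nodes the test machine exposes. A rough sketch, assuming the kernel lists nodes as node<N> directories under /sys/devices/system/node (numa_node_count is a hypothetical helper name):

```shell
#!/bin/sh
# numa_node_count [DIR]: count the node<N> directories under DIR
# (default: /sys/devices/system/node). More than one node indicates
# a NUMA machine.
numa_node_count() {
    base=${1:-/sys/devices/system/node}
    n=0
    for d in "$base"/node[0-9]*; do
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

numa_node_count
```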

Comment 2 RHEL Program Management 2011-11-14 06:47:57 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

