Red Hat Bugzilla – Bug 465366
add multi-core support to cpufreq driver
Last modified: 2009-05-18 15:05:58 EDT
Description of problem:
cpufreq driver needs multi-core support

Version-Release number of selected component (if applicable):
<= 4.7

How reproducible:
Run cpuspeed (userspace daemon); the CPUs do not scale properly.

Steps to Reproduce:
1. select 'userspace' governor
2. start cpuspeed daemon
3. load machine, remove load, observe cpu freqs

Actual results:
cpu freq gets stuck at a fixed freq for some processors with no load

Expected results:
cpu freq should scale up and down for all processors

Additional info:
affected hardware: IBM X3n50

This BZ provides the needed support for affected_cpus (/sys/devices/system/cpu/cpu*/cpufreq/affected_cpus), which the cpuspeed daemon needs to scale all cores properly. The userspace portion of this is Bug 451119.
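For readers who want to inspect the affected_cpus files mentioned above, here is a minimal Python sketch. It assumes the files hold space-separated CPU numbers, as the outputs later in this report show. The helper names parse_cpu_list and read_affected_cpus are hypothetical and are not part of cpuspeed or the kernel.

```python
from pathlib import Path
import re

# Standard sysfs location quoted in this report.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse the space-separated CPU numbers found in an affected_cpus file."""
    return [int(token) for token in text.split()]


def read_affected_cpus(root=SYSFS_CPU):
    """Return {cpu_number: [affected cpu numbers]}, sorted by CPU number.

    Hypothetical helper: CPUs without a cpufreq directory are skipped.
    """
    result = {}
    for path in root.glob("cpu[0-9]*/cpufreq/affected_cpus"):
        match = re.fullmatch(r"cpu(\d+)", path.parent.parent.name)
        if match:
            result[int(match.group(1))] = parse_cpu_list(path.read_text())
    return dict(sorted(result.items()))


if __name__ == "__main__":
    for cpu, group in read_affected_cpus().items():
        print(f"cpu{cpu}: {group}")
```

Sorting by CPU number avoids the lexicographic ordering (0, 10, 11, ...) that a shell glob such as cpu* produces on machines with ten or more CPUs.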
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Committed in 78.17.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
*** Bug 268441 has been marked as a duplicate of this bug. ***
When trying to verify this bug, we found a problem: https://bugzilla.redhat.com/show_bug.cgi?id=497490
Apart from the above problem, which appears to be in the userspace tool, there appears to be another problem in the affected_cpus patch. An AMD machine has 4 cores per processor, but each affected_cpus file lists only one CPU.

# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : Quad-Core AMD Opteron(tm) Processor 8356
stepping : 3
cpu MHz : 1400.000
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse 3dnowprefetch osvw
bogomips : 4598.26
TLB size : 1104 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate <NULL> <-- strange NULL here.
...

# cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
0
10
11
12
13
14
15
16
17
18
19
1
20
21
22
23
24
25
26
27
28
29
2
30
31
3
4
5
6
7
8
9

# lsmod
...
powernow_k8 22753 1
...

# uname -ra
Linux sun-x4600m2-01.rhts.bos.redhat.com 2.6.9-88.ELlargesmp #1 SMP Mon Apr 13 19:33:40 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

I'll set this to ASSIGNED so that developers can look at the above two issues.
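The comparison made above (cpuinfo reports "cpu cores : 4" while each affected_cpus file lists a single CPU) can be sketched as a small Python check. This is only an illustration: the function names are hypothetical, and a single-entry affected_cpus list is not necessarily wrong, since that depends on how the hardware and driver coordinate frequency changes.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict per processor record."""
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor record.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def smaller_than_core_count(records, affected):
    """Return (cpu, cores, group) where the affected_cpus group is
    shorter than the 'cpu cores' value reported for that CPU.

    `affected` maps a CPU number to its affected_cpus list.
    """
    flagged = []
    for record in records:
        cpu = int(record["processor"])
        cores = int(record.get("cpu cores", "1"))
        group = affected.get(cpu, [])
        if len(group) < cores:
            flagged.append((cpu, cores, group))
    return flagged
```

For example, passing the AMD data above (every CPU listing only itself, with "cpu cores : 4") would flag every CPU, while the Intel layout reported elsewhere in this bug would flag none.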
Created attachment 341336 [details] /proc/cpuinfo
Created attachment 341337 [details] /var/log/dmesg
sun-x4600m2-01.rhts.bos.redhat.com was working until it was upgraded from Dual-Core to Quad-Core CPUs. For more details about how it was tested before, please see https://bugzilla.redhat.com/show_bug.cgi?id=469647#c7
It looks like affected_cpus never works on AMD Quad-Core systems; I have tested another two machines without luck.

amd-ma78gm-01.rhts.bos.redhat.com
model name : AMD Phenom(tm) 9750 Quad-Core Processor

hp-xw9400-01.rhts.bos.redhat.com
model name : Quad-Core AMD Opteron(tm) Processor 2382
An Intel Quad-Core system is working without problems.

# uname -ra
Linux ibm-defiant.rhts.bos.redhat.com 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:33:05 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

# cat /sys/devices/system/cpu/*/cpufreq/affected_cpus
0 1 2 3
0 1 2 3
0 1 2 3
0 1 2 3

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
stepping : 7
cpu MHz : 2331.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cx16 xtpr lahf_lm
bogomips : 5323.70
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
...
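To make the difference between the two machines easier to see, the affected_cpus lists can be collapsed into distinct groups. The sketch below is an assumption-laden illustration: frequency_domains is a hypothetical helper, and its input is a dict from CPU number to that CPU's affected_cpus list.

```python
def frequency_domains(affected):
    """Collapse per-CPU affected_cpus lists into distinct sorted groups.

    Empty lists are ignored; groups are ordered by their lowest CPU number.
    """
    domains = {frozenset(group) for group in affected.values() if group}
    return sorted((sorted(domain) for domain in domains), key=lambda d: d[0])


# Layout reported for the Intel machine: every file lists 0 1 2 3.
intel = {cpu: [0, 1, 2, 3] for cpu in range(4)}
# Layout reported for the AMD machine: every file lists only its own CPU.
amd = {cpu: [cpu] for cpu in range(32)}

print(len(frequency_domains(intel)), "group(s) on the Intel layout")
print(len(frequency_domains(amd)), "group(s) on the AMD layout")
```

With these inputs the Intel layout yields one group of four CPUs and the AMD layout yields 32 single-CPU groups.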
(In reply to comment #17)
> It looks like affected_cpus never work on AMD Quad-Core systems that I have
> tested another two machines without luck.
>
> amd-ma78gm-01.rhts.bos.redhat.com
> model name : AMD Phenom(tm) 9750 Quad-Core Processor
>
> hp-xw9400-01.rhts.bos.redhat.com
> model name : Quad-Core AMD Opteron(tm) Processor 2382

Hrm... Just looked at a quad-core amd box here which is running Fedora 10 (2.6.27.21-based kernel). affected_cpus for each core doesn't list anything but itself there either...
Thanks Jarod! In this situation, I think I should open new bugs for this issue and move this one back to ON_QA, because at least the backport work seems correct.

Bug 497938 - [RHEL4] Affected_cpus Not Working for AMD Quad-Core Systems
Bug 497939 - [RHEL5] Affected_cpus Not Working for AMD Quad-Core Systems
On ibm-defiant.rhts.bos.redhat.com, there is a quad-core Intel CPU:

...
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Xeon(R) CPU X5355 @ 2.66GHz
stepping : 7
cpu MHz : 1998.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl est tm2 xtpr
bogomips : 5320.14

[root@ibm-defiant ~]# tail /sys/devices/system/cpu/cpu0/cpufreq/*
==> /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq <==
1998000

==> /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq <==
2664000

==> /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq <==
1998000

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies <==
2664000 2331000 2331000 1998000

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors <==
powersave userspace performance

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq <==
1998000

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver <==
centrino

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor <==
userspace

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq <==
2664000

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq <==
1998000

==> /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed <==
1998000

With the old kernel 2.6.9-78.ELsmp and kernel-utils-2.4-14.1.117, the reproduction procedure is as follows.

First, all CPUs are idle:

[root@ibm-defiant ~]# mpstat -P ALL
Linux 2.6.9-78.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

06:26:14 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
06:26:14 AM  all    0.11    0.38    0.24    0.25    0.00    0.00   99.01   1020.51
06:26:14 AM    0    0.23    0.22    0.27    0.16    0.00    0.00   99.13    256.53
06:26:14 AM    1    0.10    0.34    0.23    0.17    0.00    0.00   99.16    258.52
06:26:14 AM    2    0.05    0.53    0.23    0.36    0.00    0.00   98.82    253.04
06:26:14 AM    3    0.06    0.44    0.22    0.31    0.00    0.00   98.96    252.42

and the frequencies are at the minimum:

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
1998000
1998000
1998000
1998000

Then add some load by running two instances of perl -e '$a = 1_000_000_000_000; while ($a--) {};' on the machine:

[root@ibm-defiant ~]# mpstat -P ALL 1
Linux 2.6.9-78.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

07:00:24 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
07:00:25 AM  all   50.00    0.00    0.00    0.00    0.00    0.00   50.00   1011.00
07:00:25 AM    0    0.00    0.00    0.00    0.00    0.00    0.00  100.00      9.00
07:00:25 AM    1  100.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
07:00:25 AM    2  100.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00
07:00:25 AM    3    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1001.00

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
2664000
2664000
2664000
2664000

All CPUs are running at the maximum.

After killing one 'perl':

[root@ibm-defiant ~]# mpstat -P ALL 1
Linux 2.6.9-78.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

07:03:14 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
07:03:15 AM  all   25.38    0.00    0.00    0.00    0.00    0.00   74.62   1074.23
07:03:15 AM    0    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1055.67
07:03:15 AM    1    0.00    0.00    0.00    0.00    0.00    0.00   98.97     18.56
07:03:15 AM    2  104.12    0.00    0.00    0.00    0.00    0.00    0.00      0.00
07:03:15 AM    3    0.00    0.00    0.00    0.00    0.00    0.00  103.09      0.00

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
2331000
2331000
2331000
2331000

All CPUs are running at a middle speed.
With 2.6.9-89.ELsmp and kernel-utils-2.4-17.el4:

CPUs run at the minimum speed when they are idle:

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
1998000
1998000
1998000
1998000

[root@ibm-defiant ~]# mpstat -P ALL 1
Linux 2.6.9-89.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

07:17:54 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
07:17:55 AM  all    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1019.80
07:17:55 AM    0    0.00    0.00    0.00    0.00    0.00    0.00  100.00   1000.99
07:17:55 AM    1    0.00    0.00    0.00    0.00    0.00    0.00   99.01     17.82
07:17:55 AM    2    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
07:17:55 AM    3    0.00    0.00    0.00    0.00    0.00    0.00   99.01      0.00

Add some load by running two 'perl':

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
2664000
2664000
2664000
2664000

[root@ibm-defiant ~]# mpstat -P ALL 1
Linux 2.6.9-89.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

07:19:09 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
07:19:10 AM  all   50.00    0.00    0.00    0.00    0.00    0.00   50.00   1000.00
07:19:10 AM    0  100.00    0.00    0.00    0.00    0.00    0.00    0.00      6.93
07:19:10 AM    1    0.00    0.00    0.00    0.00    0.00    0.00   99.01      0.00
07:19:10 AM    2   99.01    0.00    0.00    0.00    0.00    0.00    0.00      0.00
07:19:10 AM    3    0.00    0.00    0.00    0.00    0.00    0.00   99.01    992.08

All run at the maximum speed.

Kill one 'perl':

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
2664000
2664000
2664000
2664000

[root@ibm-defiant ~]# mpstat -P ALL 1
Linux 2.6.9-89.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

07:22:20 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
07:22:21 AM  all   25.25    0.00    0.00    0.00    0.00    0.00   74.75   1051.02
07:22:21 AM    0    0.00    0.00    0.00    0.00    0.00    0.00  100.00     10.20
07:22:21 AM    1    0.00    0.00    0.00    0.00    0.00    0.00  100.00     18.37
07:22:21 AM    2  102.04    0.00    0.00    0.00    0.00    0.00    0.00    416.33
07:22:21 AM    3    0.00    0.00    0.00    0.00    0.00    0.00  102.04    605.10

Still running at the maximum speed.
And the contents of 'affected_cpus' are correct:

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/affected_cpus
0 1 2 3
0 1 2 3
0 1 2 3
0 1 2 3

Kill all 'perl':

[root@ibm-defiant ~]# cat /sys/devices/system/cpu/*/cpufreq/scaling_cur_freq
1998000
1998000
1998000
1998000

[root@ibm-defiant ~]# mpstat -P ALL 1
Linux 2.6.9-89.ELsmp (ibm-defiant.rhts.bos.redhat.com) 04/28/2009

07:24:28 AM  CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
07:24:29 AM  all    0.00    0.00    0.00    0.00    0.00    0.00  100.00    999.01
07:24:29 AM    0    0.00    0.00    0.00    0.00    0.00    0.00  100.00      6.93
07:24:29 AM    1    0.00    0.00    0.00    0.00    0.00    0.00  100.00      0.00
07:24:29 AM    2    0.00    0.00    0.00    0.00    0.00    0.00   99.01      0.00
07:24:29 AM    3    0.00    0.00    0.00    0.00    0.00    0.00   99.01    991.09

All speeds drop to the minimum.
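The manual check in this comment (compare scaling_cur_freq on every CPU against the minimum and maximum) could be scripted roughly as follows. This is a sketch only: classify and stuck_when_idle are hypothetical names, and the frequencies are the kHz values shown in the outputs above.

```python
def classify(freq_khz, min_khz, max_khz):
    """Label a scaling_cur_freq value relative to the min/max limits."""
    if freq_khz <= min_khz:
        return "min"
    if freq_khz >= max_khz:
        return "max"
    return "intermediate"


def stuck_when_idle(cur_freqs, min_khz):
    """Return the CPUs still above the minimum frequency.

    Only meaningful when the machine is known to be idle, which is the
    condition described under 'Actual results' in this bug.
    """
    return sorted(cpu for cpu, freq in cur_freqs.items() if freq > min_khz)


# Values taken from the ibm-defiant outputs in this comment.
MIN_KHZ, MAX_KHZ = 1998000, 2664000
idle = {0: 1998000, 1: 1998000, 2: 1998000, 3: 1998000}
print(stuck_when_idle(idle, MIN_KHZ))
print(classify(2331000, MIN_KHZ, MAX_KHZ))
```

An empty list from stuck_when_idle on an idle machine corresponds to the "all speeds drop to the minimum" result reported above.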
Patch is in -89.EL kernel.
Will change the status to VERIFIED.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html