Bug 497938 - [RHEL4] Strange cpufreq Entries in sysfs on AMD Quad-Core Systems
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.8
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Bhavna Sarathy
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks: 497939
Reported: 2009-04-27 20:29 EDT by CAI Qian
Modified: 2013-01-10 02:59 EST

Doc Type: Bug Fix
Clones: 497939
Last Closed: 2011-01-26 23:01:06 EST

Attachments
Patch to fix this issue by removing code that reads the MSRs (11.72 KB, patch)
2010-05-26 15:21 EDT, Mark Langsdorf
Patch to fix this issue by removing code that reads the MSRs (10.73 KB, patch)
2010-08-31 13:25 EDT, Mark Langsdorf

Description CAI Qian 2009-04-27 20:29:21 EDT
Description of problem:
An AMD system has quad-core processors, but each core's affected_cpus file lists only one CPU.

# cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 2
model name      : Quad-Core AMD Opteron(tm) Processor 8356
stepping        : 3
cpu MHz         : 1400.000
cache size      : 512 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse 3dnowprefetch osvw
bogomips        : 4598.26
TLB size        : 1104 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate <NULL>   <-- strange NULL here.
...

# cat /sys/devices/system/cpu/cpu*/cpufreq/affected_cpus
0
10
11
12
13
14
15
16
17
18
19
1
20
21
22
23
24
25
26
27
28
29
2
30
31
3
4
5
6
7
8
9

# lsmod
...
powernow_k8            22753  1
...

# uname -ra
Linux sun-x4600m2-01.rhts.bos.redhat.com 2.6.9-88.ELlargesmp #1 SMP Mon Apr 13 19:33:40 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

sun-x4600m2-01.rhts.bos.redhat.com worked correctly until it was upgraded
from Dual-Core to Quad-Core CPUs. For details of how it was tested before,
see
https://bugzilla.redhat.com/show_bug.cgi?id=469647#c7

It looks like affected_cpus never works on AMD Quad-Core systems: I tested
two other machines, without luck.

amd-ma78gm-01.rhts.bos.redhat.com
model name      : AMD Phenom(tm) 9750 Quad-Core Processor

hp-xw9400-01.rhts.bos.redhat.com
model name      : Quad-Core AMD Opteron(tm) Processor 2382

An Intel Quad-Core system works without problems.

# uname -ra
Linux ibm-defiant.rhts.bos.redhat.com 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:33:05 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

#  cat /sys/devices/system/cpu/*/cpufreq/affected_cpus
0 1 2 3
0 1 2 3
0 1 2 3
0 1 2 3

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model  : 15
model name : Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz
stepping : 7
cpu MHz  : 2331.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id  : 0
cpu cores : 4
fpu  : yes
fpu_exception : yes
cpuid level : 10
wp  : yes
flags  : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm pni monitor ds_cpl est tm2 cx16 xtpr lahf_lm
bogomips : 5323.70
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
...

   ------- Comment #19 From Jarod Wilson (jwilson@redhat.com) 2009-04-27 11:15:49 EDT -------
(In reply to comment #17)
> It looks like affected_cpus never work on AMD Quad-Core systems that I have
> tested another two machines without luck.
>
> amd-ma78gm-01.rhts.bos.redhat.com
> model name      : AMD Phenom(tm) 9750 Quad-Core Processor
>
> hp-xw9400-01.rhts.bos.redhat.com
> model name      : Quad-Core AMD Opteron(tm) Processor 2382

Hrm... Just looked at a quad-core amd box here which is running Fedora 10
(2.6.27.21-based kernel). affected_cpus for each core doesn't list anything but
itself there either...


Version-Release number of selected component (if applicable):
kernel-2.6.9-89.EL

How reproducible:
always

Steps to Reproduce:
1. look at /sys/devices/system/cpu/*/cpufreq/affected_cpus
  
Actual results:
Each file lists only one CPU core.

Expected results:
Should list 4 CPU cores in the same physical processor.

Additional info:
Dmesg and cpuinfo files on the system can be found at,

https://bugzilla.redhat.com/show_bug.cgi?id=465366#c15
https://bugzilla.redhat.com/show_bug.cgi?id=465366#c14
Comment 1 Matthew Garrett 2009-04-28 09:22:22 EDT
Are we sure that the individual cores in the quad core machines aren't scalable? affected_cpus appears correct on the dual core system I tested earlier.
Comment 2 Matthew Garrett 2009-04-28 09:32:38 EDT
On sun-x4600m2-01 I appear to be able to set the frequencies of the cores individually, which suggests that it's correct for affected_cpus to only contain one core. I think we can close this bug.
Comment 3 Jarod Wilson 2009-04-28 10:07:55 EDT
Indeed, it does seem the latest generation of Opteron procs can have their cores scaled individually. Didn't know that.

http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_15224,00.html
Comment 4 Bhavna Sarathy 2009-04-28 10:11:53 EDT
affected_cpus is defined as all the CPUs that are changed when any CPU in
that group is changed. Greyhound cores have independent frequency control,
so each core should have its own number and not affect any other cores.

Not a bug.
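
As a rough illustration of that definition, the following C sketch counts how many CPUs a single affected_cpus line names. The function name count_affected_cpus is hypothetical, and the space-separated format is assumed from the Intel output in the description.

```c
#include <stdlib.h>

/*
 * Count the CPU numbers in one affected_cpus line.
 * Assumes whitespace-separated decimal CPU numbers, e.g. "0 1 2 3".
 * count_affected_cpus is a hypothetical helper, not a kernel function.
 */
int count_affected_cpus(const char *line)
{
    int count = 0;
    const char *p = line;
    char *end;

    for (;;) {
        (void)strtoul(p, &end, 10);
        if (end == p)   /* no further number could be parsed */
            break;
        count++;
        p = end;
    }
    return count;
}
```

With independent per-core control each file holds a single CPU number, so the count is 1. A line such as "0 1 2 3" from the Intel system gives 4.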
Comment 5 CAI Qian 2009-04-28 10:23:07 EDT
One strange thing on this system is that some cores even have different available frequencies. Is that normal?

# cat /sys/devices/system/cpu/*/cpufreq/scaling_available_frequencies
2300000 2000000 1700000 1400000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000
2300000 2000000 1700000 1400000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000
2300000 2000000 1700000 1400000 1200000

but they are the same type of CPUs as far as I can tell.

# grep  8356 /proc/cpuinfo
model name      : Quad-Core AMD Opteron(tm) Processor 8356
...

# grep  8356 /proc/cpuinfo | wc -l
32
Comment 6 CAI Qian 2009-04-28 10:30:10 EDT
Another strange thing is that we can change the frequency to some values that should not be allowed.

# cat /sys/devices/system/cpu/cpu28/cpufreq/scaling_cur_freq
1150000

# cat /sys/devices/system/cpu/cpu28/cpufreq/scaling_available_frequencies
2300000 2000000 1700000 1400000 1200000
Comment 7 CAI Qian 2009-04-28 10:44:42 EDT
The following output might be easier to follow.
# cat */scaling_available_frequencies
cpu0: 2300000 2000000 1700000 1400000
cpu1: 2300000 2000000 1700000 1400000 1200000
cpu2: 2300000 2000000 1700000 1400000 1200000
cpu3: 2300000 2000000 1700000 1400000 1200000
cpu4: 2300000 2000000 1700000 1400000 1200000
cpu5: 2300000 2000000 1700000 1400000 1200000
cpu6: 2300000 2000000 1700000 1400000 1200000
cpu7: 2300000 2000000 1700000 1400000 1200000
cpu8: 2300000 2000000 1700000 1400000 1200000
cpu9: 2300000 2000000 1700000 1400000 1200000
cpu10: 2300000 2000000 1700000 1400000 1200000
cpu11: 2300000 2000000 1700000 1400000 1200000
cpu12: 2300000 2000000 1700000 1400000 1200000
cpu13: 2300000 2000000 1700000 1400000 1200000
cpu14: 2300000 2000000 1700000 1400000 1200000
cpu15: 2300000 2000000 1700000 1400000 1200000
cpu16: 2300000 2000000 1700000 1400000 1200000
cpu17: 2300000 2000000 1700000 1400000 1200000
cpu18: 2300000 2000000 1700000 1400000 1200000
cpu19: 2300000 2000000 1700000 1400000 1200000
cpu20: 2300000 2000000 1700000 1400000 1200000
cpu21: 2300000 2000000 1700000 1400000 1200000
cpu22: 2300000 2000000 1700000 1400000 1200000
cpu23: 2300000 2000000 1700000 1400000 1200000
cpu24: 2300000 2000000 1700000 1400000 1200000
cpu25: 2300000 2000000 1700000 1400000 1200000
cpu26: 2300000 2000000 1700000 1400000 1200000
cpu27: 2300000 2000000 1700000 1400000 1200000
cpu28: 2300000 2000000 1700000 1400000 1200000
cpu29: 2300000 2000000 1700000 1400000
cpu30: 2300000 2000000 1700000 1400000
cpu31: 2300000 2000000 1700000 1400000
Comment 8 CAI Qian 2009-04-29 02:40:23 EDT
Do you think the above 2 problems,
- different available frequencies for cores in the same package - comment #7
- can change the frequency to something that should not be allowed - comment #6
might be a bug in hardware/BIOS?
Comment 9 Bhavna Sarathy 2009-04-29 09:40:31 EDT
It's definitely looking that way. I tested on my quad-core system with 8 cores and didn't see the issues you list. I certainly don't have access to a 32-core system. Please look into upgrading the BIOS on your system and retest.
Comment 10 CAI Qian 2009-05-07 03:55:58 EDT
Bhavna, Andrew Crosson has helped upgrade the system to,

BIOS 126 (0ABIT126) and the ILOM  is updated to firmware 3.0.3.31, Build 42822.

However, the problem is still here.

- different available frequencies for cores in the same package - comment #7
- can change the frequency to something that should not be allowed - comment #6
Comment 11 CAI Qian 2009-05-07 03:59:46 EDT
The affected machine is in RHTS, so you can reserve it from there. Let me know if you need any assistance.

sun-x4600m2-01.rhts.bos.redhat.com
Comment 12 Bhavna Sarathy 2009-05-12 15:51:56 EDT
(In reply to comment #6)
> Another strange thing is that we can change the frequency to some values should
> not be allowed.
> # cat /sys/devices/system/cpu/cpu28/cpufreq/scaling_cur_freq
> 1150000
> # cat /sys/devices/system/cpu/cpu28/cpufreq/scaling_available_frequencies
> 2300000 2000000 1700000 1400000 1200000  

Can you clarify your statement? I'm not sure if you mean you can change the frequencies and it shouldn't be allowed or something else.
Comment 13 CAI Qian 2009-05-12 20:47:53 EDT
Bhavna, sorry, I did not make myself clear. What I meant was: why was one of the CPUs running at 1150000, which is not one of the values in scaling_available_frequencies, as you can see above?
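
The mismatch described here can be checked mechanically. This is a minimal C sketch, assuming the whitespace-separated kHz list format shown in comment #6. freq_is_listed is a hypothetical helper, not part of any driver.

```c
#include <stdlib.h>

/*
 * Return 1 if cur_khz appears in a whitespace-separated list of kHz
 * values (the format of scaling_available_frequencies), 0 otherwise.
 * freq_is_listed is a hypothetical name used only for illustration.
 */
int freq_is_listed(const char *available, unsigned long cur_khz)
{
    const char *p = available;
    char *end;

    for (;;) {
        unsigned long khz = strtoul(p, &end, 10);
        if (end == p)   /* no more numbers to parse */
            return 0;
        if (khz == cur_khz)
            return 1;
        p = end;
    }
}
```

For cpu28 in comment #6, the list is "2300000 2000000 1700000 1400000 1200000" and scaling_cur_freq is 1150000, so the check returns 0.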
Comment 14 Subhendu Ghosh 2009-05-13 01:00:58 EDT
Is this still an issue?
Comment 15 CAI Qian 2009-05-13 01:14:09 EDT
The problems that remain even after the firmware upgrade are:

- different available frequencies for cores in the same package - comment #7
- a core can run at a frequency that is not listed - comment #6
Comment 16 Zhang Kexin 2009-06-09 06:44:49 EDT
Hi Bhavna,

I looked into the function fill_powernow_table_pstate(), and found that it gets the info from rdmsr(MSR_PSTATE_DEF_BASE + index, lo, hi), then does some computation to get the frequency.

static int fill_powernow_table_pstate(struct powernow_k8_data *data, struct cpufreq_frequency_table *powernow_table)
{
-------------cut-------------------------
                rdmsr(MSR_PSTATE_DEF_BASE + index, lo, hi);
                if (!(hi & HW_PSTATE_VALID_MASK)) {
                        printk("==invalid pstate %d, ignoring\n", index);
                        powernow_table[i].frequency = CPUFREQ_ENTRY_INVALID;
                        continue;
                }
                
                fid = lo & HW_PSTATE_FID_MASK;
                did = (lo & HW_PSTATE_DID_MASK) >> HW_PSTATE_DID_SHIFT;

                powernow_table[i].index = index | (fid << HW_FID_INDEX_SHIFT) | (did << HW_DID_INDEX_SHIFT);

                powernow_table[i].frequency = find_khz_freq_from_fiddid(fid, did);
---------------------cut-----------------------
}

I don't know why, but when index is 4, the "lo" value read by rdmsr is not always the same: on 28 of the 32 CPUs it is 0x8, but on the other 4 it is 0x7. As a result, the frequency computed by find_khz_freq_from_fiddid does not match the one obtained through ACPI. This is why "cat */scaling_available_frequencies" shows inconsistent results.

In the RHEL5.4 kernel, by contrast, fill_powernow_table_pstate was changed substantially, and the frequency is no longer derived from rdmsr (rdmsr is still invoked, but its result is not used to get the frequency).

Bhavna, is it possible that the CPU MSRs are not working properly? Do we need to replace the CPU?
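
The decoding step in the excerpt can be sketched in standalone C. Everything here is an assumption rather than the RHEL4 source: the mask and shift values are taken from the upstream powernow-k8 header, the formula is the family 10h rule (core MHz = 100 * (FID + 0x10) >> DID), and pstate_lo_to_khz is a hypothetical name. The sketch also reads the 0x8 and 0x7 values above as the FID field with a DID of 1, which the comment does not state.

```c
#include <stdint.h>

/* Assumed field layout of the low word of a P-state definition MSR. */
#define HW_PSTATE_FID_MASK  0x0000003fu
#define HW_PSTATE_DID_MASK  0x000001c0u
#define HW_PSTATE_DID_SHIFT 6

/*
 * Convert the low 32 bits of a P-state MSR to a frequency in kHz.
 * Assumed family 10h formula: MHz = 100 * (fid + 0x10) >> did.
 * pstate_lo_to_khz is a hypothetical helper for illustration only.
 */
uint32_t pstate_lo_to_khz(uint32_t lo)
{
    uint32_t fid = lo & HW_PSTATE_FID_MASK;
    uint32_t did = (lo & HW_PSTATE_DID_MASK) >> HW_PSTATE_DID_SHIFT;

    return 1000u * ((100u * (fid + 0x10u)) >> did);
}
```

Under these assumptions, a FID of 0x8 with DID 1 (lo = 0x48) decodes to 1200000 kHz, and a FID of 0x7 with DID 1 (lo = 0x47) decodes to 1150000 kHz. That would be consistent with the unlisted 1150000 value seen in comment #6.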
Comment 17 Russell Doty 2009-06-09 15:19:53 EDT
Is this occurring on RHEL 5, or just on RHEL 4?
Comment 18 Zhang Kexin 2009-06-09 22:13:04 EDT
On the latest RHEL5.4 (kernel 2.6.18-152.el5), the frequencies for all CPUs are the same:

[root@sun-x4600m2-01 ~]# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_available_frequencies 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000 
2300000 2000000 1700000 1400000 1200000
Comment 19 RHEL Product and Program Management 2010-04-19 11:44:08 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 20 Mark Langsdorf 2010-05-26 15:21:02 EDT
Created attachment 416988 [details]
Patch to fix this issue by removing code that reads the MSRs

This driver hasn't been updated in a while, so there was a fair bit of old code in it that needed to be removed.

This patch fixes the reported problem by removing some MSR reads against some non-architectural MSRs that would change between processor revisions.  Instead, it relies on the ACPI _PSS objects to get frequency information.
Comment 21 Mark Langsdorf 2010-08-31 13:25:10 EDT
Created attachment 442240 [details]
Patch to fix this issue by removing code that reads the MSRs

Fixed some formatting errors in the patch.
