Bug 472824
Summary: | sysfs doesn't export CPU cache info for some CPUs | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Chris Snook <csnook> |
Component: | kernel | Assignee: | Red Hat Kernel Manager <kernel-mgr> |
Status: | CLOSED WONTFIX | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 5.3 | CC: | csnook, philip |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-06-02 13:07:12 UTC | Type: | --- |
Description
Chris Snook
2008-11-24 21:38:06 UTC
Just saw this again inside a KVM guest running on a Core 2 Duo E6850. Baremetal sees the sysfs file, both in F10 and RHEL 5.3 beta.

Investigation into brc#511278
=============================

The /sys/devices/system/cpu/cpuX/cache/ sysfs directories are created in arch/i386/kernel/cpu/intel_cacheinfo.c (for both i386 and x86_64).

The addition of the cache information for this processor is either failing because (num_cache_leaves == 0), or failing somewhere in cache_add_dev(), since it can be assumed that once register_hotcpu_notifier() is called, cacheinfo_cpu_callback() is correctly called as appropriate.

The case that (num_cache_leaves == 0) is only possible if the CPUID[1] instruction is failing on the processor. This is the most likely problem, since there are other reports[2] of odd behaviour of the CPUID instruction on Xeon processors. This can be checked for by using the cpuid program in userspace, and noting its output. No other functions or esoteric instructions are called to initialise num_cache_leaves:

    do {
            ++i;
            /* Do cpuid(4) loop to find out num_cache_leaves */
            cpuid_count(4, i, &eax, &ebx, &ecx, &edx);
            cache_eax.full = eax;
    } while (cache_eax.split.type != CACHE_TYPE_NULL);

It is possible that adding the cache information is failing in cache_add_dev(). The only processor-dependent failure point here is the call to cpuid4_cache_sysfs_init(), which results in a call to detect_cache_attributes(). Here, either the set_cpus_allowed() call is failing (unlikely), or cpuid4_cache_lookup() is failing.

This brings us round to the same conclusion: that the CPUID instruction, as issued by Linux, doesn't work on this particular model of processor.
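For anyone without the cpuid program to hand, the enumeration loop quoted above can be approximated from userspace. The following is only a sketch, not the kernel code: it assumes GCC or Clang on x86 and uses the compiler's <cpuid.h> helpers (__get_cpuid_max, __cpuid_count). The function names and the MAX_LEAVES cap are made up for illustration. A return value of 0 from count_cache_leaves() would correspond to the (num_cache_leaves == 0) case.

```c
/* Userspace sketch of the cpuid(4) enumeration loop.
 * Assumptions: GCC or Clang, x86 target. All names below are illustrative
 * and do not exist in the kernel source. */
#include <stdio.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* compiler-provided, not a kernel header */
#endif

#define CACHE_TYPE_NULL 0
#define MAX_LEAVES 32           /* hypothetical cap so the loop always ends */

/* The cache type is held in EAX bits 4:0 of each cpuid(4) subleaf. */
static unsigned int cache_type(unsigned int eax)
{
    return eax & 0x1f;
}

/* Size = ways * partitions * line size * sets.
 * Each field is stored minus one in EBX (and ECX for the set count). */
static unsigned long long cache_size_bytes(unsigned int ebx, unsigned int ecx)
{
    unsigned long long line  = (ebx & 0xfff) + 1;          /* bits 11:0  */
    unsigned long long parts = ((ebx >> 12) & 0x3ff) + 1;  /* bits 21:12 */
    unsigned long long ways  = ((ebx >> 22) & 0x3ff) + 1;  /* bits 31:22 */
    unsigned long long sets  = (unsigned long long)ecx + 1;

    return ways * parts * line * sets;
}

/* Returns the number of cache leaves reported by cpuid(4),
 * or -1 when not built for x86. */
static int count_cache_leaves(int verbose)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    int i;

    /* Leaf 4 is only meaningful if the CPU reports it as supported. */
    if (__get_cpuid_max(0, NULL) < 4)
        return 0;

    for (i = 0; i < MAX_LEAVES; i++) {
        /* Same query as the kernel loop: eax = 4, ecx = subleaf index. */
        __cpuid_count(4, i, eax, ebx, ecx, edx);
        if (cache_type(eax) == CACHE_TYPE_NULL)
            break;
        if (verbose)
            printf("leaf %d: type %u, level %u, %llu bytes\n",
                   i, cache_type(eax), (eax >> 5) & 0x7,
                   cache_size_bytes(ebx, ecx));
    }
    return i;
#else
    (void)verbose;
    return -1;
#endif
}
```

Calling count_cache_leaves(1) on both the bare-metal host and the guest and comparing the output should show whether leaf 4 is reporting differently in the two environments.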
Digging into the exact CPUID call in cpuid_count() in include/asm-i386/processor.h, we have the following assembly:

    __asm__("cpuid"
            : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)
            : "0" (op), "c" (count));

The calls in intel_cacheinfo.c are the only ones in the kernel which set ecx to a non-zero value before issuing the CPUID instruction. Perhaps this is what's causing the problem? The information displayed in /proc/cpuinfo is returned by a call to CPUID with ecx=0, and that appears to work fine.

More analysis would be possible if the cpuid program could be run on the system and its output provided.

There is no indication that the problem's caused by the mismatch between clflush size and cache alignment.

[1]: http://en.wikipedia.org/wiki/CPUID
[2]: http://bugzilla.kernel.org/show_bug.cgi?id=11074

This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days