Description of problem:
On RHEL5-series releases, on all Intel platforms with a CPUID level greater than 4, CPUID leaf 4 returns all zeros when the CPU information is read through the cpuid driver. The Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, states that the output of CPUID leaf 4 depends on the initial value of ECX, but the RHEL5.2 x86_64 cpuid driver does not implement this. (The RHEL5.2 i386 cpuid driver does not have this bug.)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Download the attached test case, test_cpuid4.c.
2. gcc -o test_cpuid4 test_cpuid4.c
3. ./test_cpuid4 0
4. It always reports all zeros.
5. Another way to reproduce the bug is to run Dave's x86info: http://www.codemonkey.org.uk/projects/x86info/
6. tar; make; ./x86info
7. It reports the wrong number of cores per package. It always reports "Number of cores per physical package=1".

Actual results:
All zeros are returned in registers eax, ebx, ecx and edx.

Expected results:
The CPU cache information is returned in registers eax, ebx, ecx and edx.

Additional info:
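For readers without access to the attachment, here is a minimal sketch of how a user-space program can ask the cpuid driver for a leaf together with a subleaf. It is not the attached test_cpuid4.c. It assumes the upstream convention documented for /dev/cpu/N/cpuid, where the low 32 bits of the file position are used as EAX and the high 32 bits as ECX; whether a given RHEL5 kernel honours the ECX half is exactly what this bug is about. The helper names cpuid_offset() and read_cpuid() are hypothetical.

```c
#define _FILE_OFFSET_BITS 64   /* keep off_t 64-bit on 32-bit builds */
#define _XOPEN_SOURCE 700      /* for pread() under strict -std= modes */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed driver convention: file position = (ECX << 32) | EAX. */
static uint64_t cpuid_offset(uint32_t eax, uint32_t ecx)
{
    return ((uint64_t)ecx << 32) | eax;
}

/*
 * Read one CPUID result (EAX, EBX, ECX, EDX, 16 bytes in total) for the
 * given CPU through the cpuid character device.
 * Returns 0 on success and -1 on any failure.
 */
static int read_cpuid(int cpu, uint32_t eax, uint32_t ecx, uint32_t regs[4])
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/dev/cpu/%d/cpuid", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* module not loaded, no such CPU, or no permission */

    n = pread(fd, regs, 16, (off_t)cpuid_offset(eax, ecx));
    close(fd);
    return n == 16 ? 0 : -1;
}
```

Calling read_cpuid(0, 4, 0, regs) and then read_cpuid(0, 4, 1, regs) on a fixed kernel should describe two different caches; on an affected kernel, per this report, leaf 4 comes back as all zeros.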
Created attachment 311558 [details] Test case test_cpuid4.c, which gets CPUID.4 information through the cpuid driver
Created attachment 311559 [details] The patch to fix cpuid.4 bug.
The patch in Comment #2 simply turns the cpuid() call into the cpuid_count() call, which makes the two functions identical. Since we already have this functionality, it makes more sense to change the code to call cpuid_count() instead of cpuid(). Further investigation reveals quite a few upstream updates to arch/i386/kernel/cpuid.c that are not in RHEL5. It seems likely that this is the code that should be fixed instead. That being said, can we confirm that a newer upstream kernel (such as 2.6.25) works properly on this hardware? Can this be tested? Ideally this is fixed upstream and can just be backported. In the worst case we can just call cpuid_count() instead of cpuid().
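To make the cpuid() versus cpuid_count() distinction concrete, here is a user-space analogue, not kernel code. It is a sketch that assumes a GCC- or clang-style <cpuid.h> and an x86 build; cpuid_subleaf() is a hypothetical name. The point is only that the subleaf argument is loaded into ECX before the CPUID instruction executes, which is the extra input cpuid_count() carries.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* compiler-provided CPUID helpers (GCC/clang) */
#endif

/*
 * Execute CPUID with EAX = leaf and ECX = subleaf, for basic leaves only.
 * Returns 0 and fills regs[] = {EAX, EBX, ECX, EDX} on success.
 * Returns -1 if the leaf is above the CPU's maximum basic leaf, or if this
 * is not an x86 build.
 */
static int cpuid_subleaf(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
#if defined(__x86_64__) || defined(__i386__)
    if (__get_cpuid_max(0, 0) < leaf)
        return -1;
    /* __cpuid_count sets ECX explicitly; a plain CPUID wrapper that ignores
     * the caller's ECX cannot select a subleaf. */
    __cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
    return 0;
#else
    (void)leaf; (void)subleaf; (void)regs;
    return -1;
#endif
}
```

On hardware that supports leaf 4, cpuid_subleaf(4, 0, regs) and cpuid_subleaf(4, 1, regs) normally return different register contents, which is the behaviour the driver needs to expose.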
Yes. This bug exists only on RHEL5-series x86_64. Upstream has since merged the x86_64 and i386 code, so the bug does not exist in the upstream kernel. Upstream commit 2347d933b158932cf2b8aeebae3e5cc16b200bd1 fixes the bug, but backporting it takes more effort.
Created attachment 313758 [details] test patch Test patch (backport from upstream). Can you please test this patch on the affected hardware and let me know if it helps any? Thanks.
After applying the patch, the following warning appears in dmesg:

BUG: warning at arch/x86_64/kernel/smp.c:379/smp_call_function_single() (Tainted: G )
Call Trace:
[<ffffffff8004c9fb>] smp_call_function_single+0x4f/0x10e
[<ffffffff8002203b>] __up_read+0x19/0x7f
[<ffffffff800668a2>] do_page_fault+0x4fe/0x830
[<ffffffff8005009f>] cpuid_read+0x80/0xbd
[<ffffffff8000b3d2>] vfs_read+0xcb/0x171
[<ffffffff800117bf>] sys_read+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
The root cause of this warning is that the backported patch calls smp_call_function_single(), and the 2.6.18 kernel's implementation of smp_call_function_single() differs from the one in the upstream 2.6.26 kernel. The attached cpuid.patch fixes this.
Created attachment 314897 [details] cpuid.patch, which fixes the warning caused by the backport patch RHEL5-cpuid-update.patch
Changing the severity to high, for the following reason: the cpuid driver cannot support CPUID.4, CPUID.0xB, CPUID.0xD and similar leaves, because these leaves depend on the ECX input in addition to EAX. Without the patch, the cpuid driver returns wrong values for some important CPU information, such as the maximum number of addressable IDs in a CPU package, cache type, cache level, x2APIC ID and processor topology relationships. I have validated the patch and found no issues.
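As an illustration of what is lost when leaf 4 reads back as zeros, the following sketch decodes the leaf 4 registers using the field layout given in the Intel SDM (deterministic cache parameters). The struct and function names are hypothetical, and the sample register values mentioned afterwards are made up for illustration, not taken from the affected hardware.

```c
#include <stdint.h>

/* Decoded view of one CPUID.4 subleaf (names are illustrative only). */
struct cache_info {
    unsigned type;               /* 0 = no more caches, 1 = data, 2 = instruction, 3 = unified */
    unsigned level;              /* cache level, starting at 1 */
    unsigned cores_per_package;  /* maximum addressable core IDs in the package */
    uint64_t size_bytes;         /* ways * partitions * line size * sets */
};

static struct cache_info decode_cpuid4(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    struct cache_info ci;
    uint64_t line_size  = (ebx & 0xFFF) + 1;           /* EBX[11:0]  + 1 */
    uint64_t partitions = ((ebx >> 12) & 0x3FF) + 1;   /* EBX[21:12] + 1 */
    uint64_t ways       = ((ebx >> 22) & 0x3FF) + 1;   /* EBX[31:22] + 1 */
    uint64_t sets       = (uint64_t)ecx + 1;           /* ECX        + 1 */

    ci.type = eax & 0x1F;                              /* EAX[4:0] */
    ci.level = (eax >> 5) & 0x7;                       /* EAX[7:5] */
    ci.cores_per_package = ((eax >> 26) & 0x3F) + 1;   /* EAX[31:26] + 1 */
    /* A type of 0 means "no cache described", so report a size of 0. */
    ci.size_bytes = ci.type ? ways * partitions * line_size * sets : 0;
    return ci;
}
```

With the illustrative inputs EAX=0x1C004121, EBX=0x01C0003F, ECX=0x3F, this decodes to a level 1 data cache of 32768 bytes in a package with up to 8 core IDs. With all-zero inputs the type is 0 and the core count field decodes to 1, which would be consistent with the "Number of cores per physical package=1" output reported above.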
Created attachment 315667 [details] rework of initial test patch Here is a rework which combines the last 2 patches. Can this just be tested on the affected hardware for sanity sake? If results look good I will get the patch posted ASAP so it can be included in RHEL5.
Yes. I tested it on a different machine, and it fixes the bug there as well.
Here is the updated test case for it. Run it with the command "test_cpuid [cpu] [eax] [ecx]".
Created attachment 315732 [details] test_cpuid.c test case.
Attachment 315667 [details]: RHEL5-cpuid-update.patch calls smp_call_function_single(), which exists only on x86_64 and is not defined on i386, so the i386 kernel fails to compile. I have reworked the patch and validated it on both x86_64 and i386 platforms. The patch RHEL5_cpuid_new.patch fixes all of these issues.
Created attachment 316164 [details] Rework patch RHEL5_cpuid_new.patch
Hi bmaly, This patch has passed testing on different platforms for both x86 and x86_64. Please integrate it into RHEL5.3.
bmaly, What's the status for this bug?
Has the patch been posted (POST)?
Brian, Any status on this? We at Intel would love to know. Thanks, John
(In reply to comment #10)
> Created an attachment (id=315667) [details]
> rework of initial test patch
>
> Here is a rework which combines the last 2 patches. Can this just be tested on
> the affected hardware for sanity sake?
>
> If results look good I will get the patch posted ASAP so it can be included in
> RHEL5.

Bmaly, We have tested the patch and the results are good. Will you put this patch into RHEL5.3? Thanks.
I will push the patch into RHEL5 (z-stream and 5.4). Since this isn't upstream, it should go into RHEL at the very beginning of the development cycle so that we have enough testing coverage. Pushing this in at the end of the cycle (i.e. the last possible moment) greatly increases the risk of regressions.
Updating PM score.
Brian, Are you going to post Youquan's patch? If so, please mark the bug POST after posting. If not, please assign the bug to me. I will help with review, testing, and posting. Thanks, Luming
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in kernel-2.6.18-141.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~ RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner! If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager. Thanks!
Verified on RHEL 5.4 alpha: it is fixed.
~~ Attention - RHEL 5.4 Beta Released! ~~ RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner! If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity. Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1243.html