Bug 472523 - AMD: Panic if cpu_khz is incorrect
Summary: AMD: Panic if cpu_khz is incorrect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 483701 485920
Reported: 2008-11-21 14:22 UTC by Prarit Bhargava
Modified: 2009-09-03 13:46 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 08:33:53 UTC
Target Upstream Version:
Embargoed:


Attachments
RHEL5 fix for this issue (990 bytes, patch)
2008-11-24 13:50 UTC, Prarit Bhargava


Links
System: Red Hat Product Errata
ID: RHSA-2009:1243
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.4 kernel security and bug fix update
Last Updated: 2009-09-01 08:53:34 UTC

Description Prarit Bhargava 2008-11-21 14:22:19 UTC
Description of problem:

After code inspection it was discovered that new(ish) AMD processors could boot with an incorrect value for cpu_khz.  This in turn leads to an incorrect value for tsc_khz which then leads to significant problems on the system.

Version-Release number of selected component (if applicable): -124.el5


How reproducible: > 1% of the time


Additional info: The code in question was modified in bug 467782.  With the new code, if a perfctr cannot be reserved, the code simply uses PERFCTR3, even if it is busy.

If it is busy, the result for cpu_khz is questionable.

In this case we should simply panic() and output a message to the user to reboot because of a HW error.

I have pushed a patch upstream http://marc.info/?l=linux-kernel&m=122651496115998&w=2
which outputs a printk warning to the user.

In the Enterprise space, however, I think we should panic.
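The two policies discussed above (warn and carry on with a busy PERFCTR3, or treat it as fatal) can be sketched as a small user-space model. This is only an illustration, not the attached patch or the kernel code: `NUM_PERFCTRS`, `enum calib_result`, `find_free_perfctr()` and `pick_perfctr()` are hypothetical names, and the assumption of four counters (PERFCTR0..PERFCTR3) is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_PERFCTRS 4   /* assumption: four counters, PERFCTR0..PERFCTR3 */

/* Hypothetical outcome codes; not taken from the kernel source. */
enum calib_result {
    CALIB_OK,          /* a free counter was reserved */
    CALIB_UNRELIABLE,  /* fell back to a busy PERFCTR3, warning only */
    CALIB_FATAL        /* no free counter; the caller should panic */
};

/* Return the index of the first free counter, or -1 if all are busy. */
static int find_free_perfctr(const bool busy[NUM_PERFCTRS])
{
    for (int i = 0; i < NUM_PERFCTRS; i++)
        if (!busy[i])
            return i;
    return -1;
}

/*
 * Choose the counter used to calibrate cpu_khz.
 * panic_on_busy = false models the "printk warning" behaviour,
 * panic_on_busy = true models the "panic" behaviour proposed here.
 */
static enum calib_result pick_perfctr(const bool busy[NUM_PERFCTRS],
                                      bool panic_on_busy, int *ctr)
{
    assert(ctr != NULL);

    int idx = find_free_perfctr(busy);
    if (idx >= 0) {
        *ctr = idx;
        return CALIB_OK;
    }
    if (panic_on_busy) {
        *ctr = -1;               /* no counter is handed out */
        return CALIB_FATAL;
    }
    *ctr = NUM_PERFCTRS - 1;     /* busy PERFCTR3: result is questionable */
    fprintf(stderr, "warning: all perfctrs busy, cpu_khz may be wrong\n");
    return CALIB_UNRELIABLE;
}
```

In this model a caller would stop the boot on `CALIB_FATAL` and would treat any cpu_khz derived after `CALIB_UNRELIABLE` as suspect.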

Comment 1 Prarit Bhargava 2008-11-24 13:50:05 UTC
Created attachment 324472
RHEL5 fix for this issue

Comment 2 RHEL Program Management 2009-02-11 10:10:09 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 RHEL Program Management 2009-02-16 15:06:03 UTC
Updating PM score.

Comment 5 Don Zickus 2009-04-06 21:16:54 UTC
in kernel-2.6.18-138.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 10 Caspar Zhang 2009-08-06 05:18:15 UTC
I tested the old kernel by recording the bogomips value from cpuinfo and then restarting the machine. Over 314 reboots, every bogomips value was between 4400 and 4500 except one (4332).

I then tested the new kernel (160.el5) over 334 reboots, and no abnormal bogomips value appeared. I'll keep the machine running to try to produce an incorrect value.

I'll leave this bug ON_QA and do a code review of the patch.
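For anyone repeating the reboot test, the check on the recorded values could be automated roughly as follows. This is a sketch under assumptions, not the script used for the test above: `parse_bogomips()` and `count_outliers()` are hypothetical helpers, the `bogomips : <value>` line format is what x86 kernels print in /proc/cpuinfo, and the 4400 to 4500 window is the range observed on this particular machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse one line of /proc/cpuinfo. Returns true and stores the value
 * if the line has the form "bogomips : <number>", false otherwise.
 */
static bool parse_bogomips(const char *line, double *value)
{
    /* Whitespace in the format matches any run of blanks or tabs. */
    return sscanf(line, "bogomips : %lf", value) == 1;
}

/* Count the recorded values that fall outside the range [lo, hi]. */
static size_t count_outliers(const double *values, size_t n,
                             double lo, double hi)
{
    size_t outliers = 0;

    for (size_t i = 0; i < n; i++)
        if (values[i] < lo || values[i] > hi)
            outliers++;
    return outliers;
}
```

Feeding one value per reboot into `count_outliers()` with the 4400 and 4500 bounds would flag a run like the 4332 seen on the old kernel.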

Comment 12 errata-xmlrpc 2009-09-02 08:33:53 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html

