Bug 173506 - Incl. workaround from 2.6.14rc2 for buggy TLB flush problem on SMP Opteron
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Jim Paradis
QA Contact: Brian Brock
Blocks: 170416
 
Reported: 2005-11-17 13:15 EST by Lon Hohberger
Modified: 2013-08-05 21:17 EDT
CC List: 5 users

Doc Type: Bug Fix
Last Closed: 2006-01-18 15:56:17 EST

Attachments
Shell script to auto-apply the workaround for testing (1.82 KB, text/plain)
2005-11-18 15:09 EST, Lon Hohberger

Description Lon Hohberger 2005-11-17 13:15:48 EST
Description of problem:

"Bad pmd" kernel messages plus random application segfaults.  A patch (incl.
comments about what it does) is here:

http://lkml.org/lkml/2005/9/20/207

Version-Release number of selected component (if applicable): RHEL4-U2

How reproducible: Random

Actual results: Random application segfaults on some Opteron SMP machines

Additional info: The workaround is included in 2.6.14rc2.  We may be able to
apply it at boot time from userland; Dave Jones has a program that applies the
workaround from userland if the MSR kernel module is loaded.
Comment 3 Lon Hohberger 2005-11-18 15:09:40 EST
Created attachment 121253
Shell script to auto-apply the workaround for testing

This is not a long-term solution.  This is based on the errata122.c which Dave
sent me.  It creates MSR devices, compiles the inline-attached code (which has
a little more error checking, and does the equivalent of a test-and-set on the
MSR data), and applies it to each CPU found.  It's crude.  I tested it on 2P
RHEL3U6 (after 'modprobe msr') and 4P RHEL4U2.

[root@bigisis ~]# ./amd-tlb-workaround.sh
AMD TLB filter flush workaround (auto-apply errata 122 workaround)
Checking Processor ID: 0
Applying workaround to /dev/msr0
Checking Processor ID: 1
Applying workaround to /dev/msr1
Checking Processor ID: 2
Applying workaround to /dev/msr2
Checking Processor ID: 3
Applying workaround to /dev/msr3
[root@bigisis ~]# ./amd-tlb-workaround.sh
AMD TLB filter flush workaround (auto-apply errata 122 workaround)
Checking Processor ID: 0
Workaround already applied to /dev/msr0
Checking Processor ID: 1
Workaround already applied to /dev/msr1
Checking Processor ID: 2
Workaround already applied to /dev/msr2
Checking Processor ID: 3
Workaround already applied to /dev/msr3
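
For readers without access to the attachment, the test-and-set step described above can be sketched roughly as follows. This is not the attached script or errata122.c. It is a minimal Python illustration that assumes the workaround matches the upstream kernel fix, which sets HWCR.FFDIS (bit 6 of MSR C001_0015) to disable the TLB flush filter. The function names are hypothetical, and the /dev/cpu/*/msr path is an assumption based on the stock msr driver layout (the script above creates its own /dev/msrN nodes instead).

```python
import glob
import os
import struct

MSR_K8_HWCR = 0xC0010015  # AMD K8 hardware configuration register
FFDIS = 1 << 6            # HWCR.FFDIS: disables the TLB flush filter


def with_ffdis(value):
    """Return (new_value, changed) for a HWCR value with FFDIS set."""
    new_value = value | FFDIS
    return new_value, new_value != value


def apply_to_fd(fd):
    """Test-and-set FFDIS through an open MSR device descriptor.

    The msr driver uses the file offset as the MSR number and
    transfers 8 bytes per access. Returns True if a write was needed.
    """
    (value,) = struct.unpack("<Q", os.pread(fd, 8, MSR_K8_HWCR))
    new_value, changed = with_ffdis(value)
    if changed:
        os.pwrite(fd, struct.pack("<Q", new_value), MSR_K8_HWCR)
    return changed


def apply_all(pattern="/dev/cpu/*/msr"):
    """Apply the workaround to every MSR device matching pattern (assumed path)."""
    for path in sorted(glob.glob(pattern)):
        fd = os.open(path, os.O_RDWR)
        try:
            changed = apply_to_fd(fd)
        finally:
            os.close(fd)
        if changed:
            print("Applying workaround to " + path)
        else:
            print("Workaround already applied to " + path)
```

Calling apply_all() as root after 'modprobe msr' would be the equivalent of one run of the script. Because the bit is only written when it is not already set, a second run reports every CPU as already patched, as in the output above.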
Comment 9 Lon Hohberger 2006-01-18 15:56:17 EST
Closing.  After reproducing the failure, this turned out not to be the cause.

Setting to WONTFIX, because the fix isn't necessary.