Bug 173506 - Incl. workaround from 2.6.14rc2 for buggy TLB flush problem on SMP Opteron
Summary: Incl. workaround from 2.6.14rc2 for buggy TLB flush problem on SMP Opteron
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Jim Paradis
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 170416
Reported: 2005-11-17 18:15 UTC by Lon Hohberger
Modified: 2013-08-06 01:17 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-01-18 20:56:17 UTC
Target Upstream Version:
Embargoed:


Attachments
Shell script to auto-apply the workaround for testing (1.82 KB, text/plain)
2005-11-18 20:09 UTC, Lon Hohberger

Description Lon Hohberger 2005-11-17 18:15:48 UTC
Description of problem:

The kernel logs "bad pmd" messages and applications segfault at random.  The
patch (including comments about what it does) is here:

http://lkml.org/lkml/2005/9/20/207

Version-Release number of selected component (if applicable): RHEL4-U2

How reproducible: Random

Actual results: Random application segfaults on some Opteron SMP machines

Additional info: The workaround is included in 2.6.14rc2.  We may be able to
apply it at boot time from userland; Dave Jones has a program which can apply
the workaround in userland if the MSR kernel module is loaded.
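
For illustration only, here is a rough sketch of what applying the workaround from userland could look like. This is not Dave Jones's program. It assumes that the upstream change sets bit 6 (flush filter disable) in the K8 HWCR register, MSR 0xC0010015, and that the msr module exposes each CPU's registers as a device file where the register number is the file offset. The function names and the device path passed to apply_workaround() are hypothetical; check the register and bit against the linked patch before relying on them.

```python
import os
import struct

# Assumptions, not taken from this report: register number and bit
# position as I understand the 2.6.14rc2 change.
MSR_K8_HWCR = 0xC0010015   # AMD K8 hardware configuration register
HWCR_FFDIS = 1 << 6        # TLB flush filter disable bit


def with_flush_filter_disabled(hwcr: int) -> int:
    """Return the HWCR value with the flush-filter-disable bit set."""
    return hwcr | HWCR_FFDIS


def read_msr(fd: int, reg: int) -> int:
    """Read one 64-bit MSR; the register number is used as the offset."""
    data = os.pread(fd, 8, reg)
    if len(data) != 8:
        raise OSError("short read from MSR device")
    return struct.unpack("<Q", data)[0]


def write_msr(fd: int, reg: int, value: int) -> None:
    """Write one 64-bit MSR at the offset given by the register number."""
    if os.pwrite(fd, struct.pack("<Q", value), reg) != 8:
        raise OSError("short write to MSR device")


def apply_workaround(msr_path: str) -> None:
    """Read-modify-write HWCR on the CPU behind msr_path (needs root)."""
    fd = os.open(msr_path, os.O_RDWR)
    try:
        hwcr = read_msr(fd, MSR_K8_HWCR)
        write_msr(fd, MSR_K8_HWCR, with_flush_filter_disabled(hwcr))
    finally:
        os.close(fd)
```

Something like apply_workaround("/dev/cpu/0/msr") would then have to be run once per CPU, as root, after 'modprobe msr'. The setting does not survive a reboot, which is why this would need to happen at boot time.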

Comment 3 Lon Hohberger 2005-11-18 20:09:40 UTC
Created attachment 121253
Shell script to auto-apply the workaround for testing

This is not a long-term solution.  This is based on the errata122.c which Dave
sent me.  It creates MSR devices, compiles the inline-attached code (which has
a little more error checking, and does the equivalent of a test-and-set on the
MSR data), and applies it to each CPU found.  It's crude.  I tested it on 2P
RHEL3U6 (after 'modprobe msr') and 4P RHEL4U2.

[root@bigisis ~]# ./amd-tlb-workaround.sh
AMD TLB filter flush workaround (auto-apply errata 122 workaround)
Checking Processor ID: 0
Applying workaround to /dev/msr0
Checking Processor ID: 1
Applying workaround to /dev/msr1
Checking Processor ID: 2
Applying workaround to /dev/msr2
Checking Processor ID: 3
Applying workaround to /dev/msr3
[root@bigisis ~]# ./amd-tlb-workaround.sh
AMD TLB filter flush workaround (auto-apply errata 122 workaround)
Checking Processor ID: 0
Workaround already applied to /dev/msr0
Checking Processor ID: 1
Workaround already applied to /dev/msr1
Checking Processor ID: 2
Workaround already applied to /dev/msr2
Checking Processor ID: 3
Workaround already applied to /dev/msr3
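
The test-and-set behaviour in the two runs above can be sketched roughly as follows. This is not the attached script. It is a minimal illustration that assumes the workaround is a single bit (bit 6, flush filter disable) in one MSR per CPU. The read_hwcr and write_hwcr callables are hypothetical stand-ins for whatever does the real device I/O, which lets the logic be tried against an in-memory dict without root or the msr module.

```python
from typing import Callable, Iterable, List

# Assumption: the workaround bit is bit 6 (flush filter disable).
HWCR_FFDIS = 1 << 6


def apply_to_all(
    cpus: Iterable[int],
    read_hwcr: Callable[[int], int],
    write_hwcr: Callable[[int, int], None],
) -> List[str]:
    """Set the workaround bit on each CPU, skipping CPUs that have it."""
    log = []
    for cpu in cpus:
        log.append(f"Checking Processor ID: {cpu}")
        value = read_hwcr(cpu)
        if value & HWCR_FFDIS:
            # Test: bit already set, so leave the register alone.
            log.append(f"Workaround already applied to /dev/msr{cpu}")
            continue
        # Set: write back the old value with only the one bit added.
        write_hwcr(cpu, value | HWCR_FFDIS)
        log.append(f"Applying workaround to /dev/msr{cpu}")
    return log


if __name__ == "__main__":
    # Fake 4P machine: a dict mapping CPU number to register value.
    fake_regs = {cpu: 0 for cpu in range(4)}
    for _ in range(2):
        lines = apply_to_all(
            sorted(fake_regs), fake_regs.__getitem__, fake_regs.__setitem__
        )
        print("\n".join(lines))
```

Checking the bit before writing makes a second run harmless, so the script can be called from an init script without tracking whether it has already run.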

Comment 9 Lon Hohberger 2006-01-18 20:56:17 UTC
Closing.  After reproducing the failure, we found that this is not the cause.

Setting to WONTFIX, because the fix isn't necessary.

