Red Hat Bugzilla – Bug 202130
Tyan system hangs and keyboard lights blink even when not out of resources
Last modified: 2007-11-30 17:07:27 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1)
Description of problem:
Processor/s: Dual Processor - Dual AMD Opteron 285 2.6 GHz 64-Bit w/ Dual Core Technology
Motherboard: Tyan® Thunder K8WE Motherboard w/ SLI Support
8 GB RAM, 8 GB swap
The machine seems to run fine when not loaded, but when
performing a CFD calculation that uses about 4 GB of RAM
it locks up at random times: it stops responding, and the
caps lock and scroll lock lights flash on and off together
about once every second.
The system runs longer with two iterations than with four.
Disabling dual core was tried, with no improvement.
Sometimes when the system locks up the keyboard lights do not flash.
Dual-boot Windows and Linux system...
Version-Release number of selected component (if applicable):
Steps to Reproduce:
2. Run job(s) up to about 4.6 GB of RAM usage.
3. Computer hangs, with the keyboard lights sometimes blinking and sometimes not.
Actual Results:
Computer hangs, with the keyboard lights sometimes blinking and sometimes not.
Expected Results:
Should be able to run through the job(s) properly.
Additional info:
Runs a job fine on the same system under Windows.
*** Bug 202131 has been marked as a duplicate of this bug. ***
Can you please post any messages from /var/log/messages that appear at the time
of the crash? When the machine locks up, can you press Alt-SysRq-T? This will
hopefully dump the state of all the processes. Thanks.
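Alt-SysRq-T only produces a task dump if the magic SysRq key is enabled before the hang occurs. The following is a minimal Python sketch for checking that setting ahead of time. The function names are hypothetical, and the bitmask interpretation follows the upstream kernel documentation for later kernels; the assumption is that an older RHEL4 kernel may simply treat the value as 0 (off) or 1 (on), which this check also handles.

```python
# Bit documented upstream as enabling "debugging dumps of processes etc.".
# Assumption: older kernels may only use 0 (off) or 1 (everything on).
SYSRQ_ENABLE_DUMP = 0x8


def sysrq_allows_task_dump(value_text):
    """Decide from the text of /proc/sys/kernel/sysrq whether Alt-SysRq-T can work."""
    value = int(value_text.strip())
    if value == 1:
        # Exactly 1 means every SysRq function is enabled.
        return True
    return bool(value & SYSRQ_ENABLE_DUMP)


def read_sysrq_setting(path="/proc/sys/kernel/sysrq"):
    """Read the raw setting; the path exists only on Linux systems."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        setting = read_sysrq_setting()
    except OSError as exc:
        print("could not read sysrq setting:", exc)
    else:
        print("Alt-SysRq-T task dump enabled:", sysrq_allows_task_dump(setting))
```

If the check reports that the dump is disabled, the setting has to be changed before the next lockup, since nothing can be run once the machine has hung.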
For some reason every BIOS after 1.01 on this board sets up the interrupts as
edge triggered instead of level triggered. Only the 1.01 BIOS will work under
Tyan says that if you use a kernel later than 2.6.14 the interrupts are set up
correctly. I haven't seen anything that important change between 2.6.13 and
2.6.14, so I couldn't swear that they didn't just change something in the
Anyway, I'm working with their BIOS team and hopefully they'll get this fixed soon.
Created attachment 134075
patch to match upstream kernel
I think I've got it... This is the same as the upstream kernel.
The PCI devices in /proc/interrupts are level triggered now.
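One way to confirm the trigger mode after booting a patched kernel is to scan /proc/interrupts for the IO-APIC suffix on each IRQ line. The sketch below is a rough illustration in Python, assuming the 2.6-era line format in which the controller column reads "IO-APIC-edge" or "IO-APIC-level"; the function name and the sample text are made up for illustration and are not output from the machine in this report.

```python
import re

# Matches the trigger suffix in 2.6-era /proc/interrupts lines, e.g. "IO-APIC-level".
_TRIGGER_RE = re.compile(r"IO-APIC-(edge|level)")


def irq_trigger_modes(text):
    """Map each numbered IRQ to 'edge' or 'level' based on its IO-APIC column."""
    modes = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        irq = head.strip()
        if not sep or not irq.isdigit():
            continue  # skip the CPU header and rows such as NMI/LOC/ERR
        match = _TRIGGER_RE.search(rest)
        if match:
            modes[int(irq)] = match.group(1)
    return modes


# Illustrative sample only; counts and device names are invented.
SAMPLE = """\
           CPU0       CPU1
  0:   12345678          0    IO-APIC-edge  timer
  1:        210          0    IO-APIC-edge  i8042
 16:      40211       1022   IO-APIC-level  eth0
 17:      99120          0   IO-APIC-level  libata
NMI:          0          0
"""

if __name__ == "__main__":
    for irq, mode in sorted(irq_trigger_modes(SAMPLE).items()):
        print(irq, mode)
```

On a live system the same function could be fed the contents of /proc/interrupts; any PCI device still listed as "edge" would suggest the interrupt setup problem is still present.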
I have updated the kernel to version 2.6.9-42.ELsmp, which
was released last week. The machine still exhibited the
lockup issue with the new kernel. It still has BIOS
version 1.03, and by default the HT-LDT Frequency is set
to auto; when the machine boots it reports an HT-LDT
Frequency of 1000MHz. I changed the setting in the BIOS
from auto to 800MHz and it seems to run now: my
calculations ran all night last night and today with no
problems. What is this HT-LDT frequency? Thanks
The patch listed in Comment #4 is similar to something I saw in RHEL3. I'll
test it out and post it.
Patch tested and posted to rhkernel-list
I thought I had the problem resolved, but
as soon as I said something, it locked up again. When I
returned from lunch, the caps lock and scroll lock lights
were flashing and the machine had locked up after running
for ~2 days with no issues!
Interestingly, the machine locked up over the weekend with
BIOS version 1.01. It did run for about 1.5 days before
it locked up, and as before the scroll lock and caps lock
lights on the keyboard were flashing.
Ron is my user whom I am trying to assist with this issue.
Where do we need to go next with this?
I've not done a kernel recompile in years, and on Linux I have only added a module
that was already available, and that was also years ago.
My user needs resolution as soon as possible.
Thanks for your attention and assistance to date and in the future.
The patch listed in Comment #4 is not part of the 2.6.9-42.ELsmp kernel. It has
been proposed for possible inclusion in a future kernel release.
Committed in stream U5 build 42.3. A test kernel with this patch is available
My current kernel is 2.6.9-42.14.ELsmp and BIOS is
1.04.2895. The motherboard is a Tyan Thunder K8WE Model
S2895 running BIOS version 1.04.
The above system still locks up...
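Whether a given kernel should already carry the patch can be estimated by comparing its build number against the 42.3 build mentioned earlier in this report. The Python sketch below does that comparison under a stated assumption: that RHEL4 kernel builds are cumulative, so any build at or above 42.3 contains the change. The function names are hypothetical, and the check says nothing about whether the patch actually resolves the lockup.

```python
import re

# RHEL4-style release strings look like "2.6.9-42.14.ELsmp".
_RELEASE_RE = re.compile(r"\d+\.\d+\.\d+-(\d+(?:\.\d+)*)\.EL\w*")


def el_build(kernel_release):
    """Return the build part of a release string as a tuple of ints, e.g. (42, 14)."""
    match = _RELEASE_RE.fullmatch(kernel_release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % kernel_release)
    return tuple(int(part) for part in match.group(1).split("."))


def build_at_least(kernel_release, minimum):
    """True if the release's build tuple is at or above `minimum`.

    Assumption: builds are cumulative, so a later build keeps earlier patches.
    """
    return el_build(kernel_release) >= tuple(minimum)


if __name__ == "__main__":
    for release in ("2.6.9-42.ELsmp", "2.6.9-42.14.ELsmp"):
        print(release, build_at_least(release, (42, 3)))
```

Under that assumption, 2.6.9-42.ELsmp predates build 42.3 while 2.6.9-42.14.ELsmp comes after it, which matches the statements in the surrounding comments.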
I have two machines with IWILL motherboards and dual
single-core Opteron 248s that run fine.
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
QE ack for RHEL4.5.
Patch is in, looks to have been reported to have resolved at least one customer
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.