Red Hat Bugzilla – Bug 85341
summit system restarts under heavy graphic load
Last modified: 2007-11-30 17:06:52 EST
Description of problem:
We have installed Red Hat AS 2.1 on a Fujitsu Siemens Computers PRIMERGY T850
with 8 CPUs and 16 GB RAM. The PRIMERGY T850 is the same hardware as the
IBM xSeries 440.
After installing RH AS2.1 out of the box we have installed the kernel 2.4.9-
rpm -Uvh kernel-doc-2.4.9-e.12.i386.rpm
rpm -Uvh kernel-headers-2.4.9-e.12.i386.rpm
rpm -Uvh kernel-source-2.4.9-e.12.i386.rpm
rpm -ivh kernel-summit-2.4.9-e.12.i686.rpm
Additionally, we added the boot option "notsc".
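To confirm that the option is in effect after rebooting into the summit kernel, a minimal sketch along these lines could be used. The helper name check_notsc and the sample command line string are illustrative assumptions, not part of any Red Hat tool. With no argument it reads the running kernel's command line from /proc/cmdline.

```shell
# check_notsc is a hypothetical helper: report whether "notsc" appears
# as a separate word on a kernel command line.
check_notsc() {
    # $1: command line string to inspect; defaults to the running kernel's.
    cmdline="${1:-$(cat /proc/cmdline)}"
    case " $cmdline " in
        *" notsc "*) echo "notsc present" ;;
        *)           echo "notsc missing" ;;
    esac
}

# Illustrative command line only; on the real system call check_notsc
# without arguments.
check_notsc "ro root=/dev/hda2 notsc"
```

Running "uname -r" alongside it shows which kernel release was actually booted.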
The system restarts approximately 58 minutes after starting the
rhr-ddX test. This test is part of the "Red Hat Ready Certification Testing
(rhr-core-1.8-1.noarch.rpm, rhr-auto-1.8-1.noarch.rpm, rhr-interactive-1.8-
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install all "rhr-xxx-1.8-1.noarch.rpm" RPMs
2. Start "rhr-ddX"
After the restart we found the following information in the
"event log" and "/var/log/messages":
X SERVPROC 02/27/03 17:10:45 Resetting system due to an unrecoverable error
X SERVPROC 02/27/03 17:10:45 Upper CEC Machine Check, TW_MCKC: 
X SERVPROC 02/27/03 17:10:45 Lower CEC Machine Check, TW_MCKC: 
X SERVPROC 02/27/03 17:10:42 Upper CEC Machine Check, CYC_MCK2: 
X SERVPROC 02/27/03 17:10:33 Lower CEC Machine Check, CYC_MCK2: 
Feb 27 15:22:23 pdb0362c rhr-Config: succeeded
Feb 27 15:22:23 pdb0362c rhr-Zresults: succeeded
Feb 27 15:22:23 pdb0362c rhr-Config: succeeded
Feb 27 15:22:30 pdb0362c rhr-INFO: succeeded
Feb 27 15:22:30 pdb0362c last message repeated 4 times
Feb 27 15:22:31 pdb0362c kernel: hda: ATAPI 24X DVD-ROM drive, 512kB Cache, UDMA
Feb 27 15:22:31 pdb0362c kernel: Uniform CD-ROM driver Revision: 3.12
Feb 27 15:31:29 pdb0362c kernel: loop: loaded (max 8 devices)
Feb 27 15:39:12 pdb0362c login(pam_unix): session opened for user root by
Feb 27 15:39:12 pdb0362c -- root: ROOT LOGIN ON tty2
Feb 27 16:14:31 pdb0362c rhr-cdromdata: succeeded
Feb 27 16:14:35 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 16:33:10 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 16:52:53 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 17:11:08 pdb0362c syslogd 1.4.1: restart.
The Red Hat HCL entry for the identical IBM xSeries 440 includes no hints about
"non-native" drivers (especially for the graphics driver "savage_drv.o").
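For reference, the interval between two same-day timestamps in the /var/log/messages excerpt above can be computed with a small sketch like the following. The helper names hms_to_secs and elapsed_secs are hypothetical. Treating the first char-major-81 message (16:14:35) as the approximate start of the test is an assumption that the log itself does not confirm.

```shell
# hms_to_secs is a hypothetical helper: convert a two-digit HH:MM:SS
# timestamp to seconds since midnight.
hms_to_secs() {
    old_ifs=$IFS
    IFS=:
    set -- $1            # split HH:MM:SS into $1 $2 $3
    IFS=$old_ifs
    # Strip one leading zero so values like "08" are not read as octal.
    echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
}

# elapsed_secs START END: seconds between two timestamps on the same day.
elapsed_secs() {
    echo $(( $(hms_to_secs "$2") - $(hms_to_secs "$1") ))
}

# First char-major-81 message to the syslogd restart.
elapsed_secs 16:14:35 17:11:08    # prints 3393 (56 min 33 s)
```

The sketch only handles timestamps from the same day and does not account for midnight rollover.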
Can you please attach the /var/log/messages file from when this occurs,
as well as the complete output of:
Also attach the X server log and config file please.
Also, what exact RPM release of XFree86 are you using?
rpm -qa | grep XFree86
Thanks in advance.
Created attachment 90411 [details]
Attachment contains following files :
GrepXFree86.txt, lsmod.txt, lspci.txt, messages,
Please make all file attachments separate individual text/plain file
attachments so that they can be easily viewed by clicking on them in the
It is a lot of extra effort to download a tarball, decompress it, then
try to keep track of the various files throughout the life of a bug. They
get lost, and require redownloading each time to view them. Multiplied
across several bug reports, it quickly becomes unmanageable.
Created attachment 90413 [details]
Created attachment 90414 [details]
Created attachment 90415 [details]
Created attachment 90416 [details]
Created attachment 90417 [details]
Created attachment 90418 [details]
Created attachment 90419 [details]
Created attachment 90420 [details]
Created attachment 90421 [details]
Created attachment 90422 [details]
Some attachments were missing the ".txt" suffix.
Corrected now.
(Sorry for the inconvenience.)
The same problem also occurred with XFree86 4.1.0-29.
It appears as though there was a machine check. I'm not familiar with this
system. Are you able to get a machine check dump from the BIOS?
It would be good to isolate the problem. Can you replace the graphics card with
something other than an S3, such as an ATI or Nvidia card, and rerun the test? If
it does not fail with another card/driver, then we know this is driver-specific,
which helps us focus.
The current AS2.1 release of XFree86 is 4.1.0-44. There is no reason for me to
believe the changes between the release 25 you are running and the current 44
release affect this issue, but on the other hand that's quite a few revisions
apart, so it's probably worthwhile using the latest released package.
We have done a few tests now with XFree86 4.1.0-44 and it looks pretty good.
But we have to do some more tests before we can close this bugzilla report.
The system works fine with XFree86 4.1.0-44 (download:
ftp://updates.redhat.com/...) and all screensavers disabled.
(see also : "Servers - LINUX hangs with XScreenSaver v3.33" under
Mike - why did you reopen this?
I've re-opened this as it was incorrectly closed as resolved in RAWHIDE.
XFree86 4.3.0 is what is in rawhide (internally anyway), and we don't really
know what the problem was specifically. It definitely hasn't been tested
with rawhide though, so that resolution is invalid.
The problem is considered resolved above with 4.1.0-44, so it should be
resolved as ERRATA. I'm wondering though if the problem still occurs
with screensavers enabled. Disabling the screensavers would just be a
workaround in this case.
The future erratum may contain a new Savage driver update which is known
to solve many driver-related issues.
On my machine (not Advanced Server though, different graphics card) I sometimes
(with older XFree86 versions) had 3D stability issues, so this might
explain why disabling screensavers worked around the problem. Just