Bug 85341 - summit system restarts under heavy graphic load
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: XFree86
Version: 2.1
Hardware: i686 Linux
Priority: high, Severity: high
Assigned To: John Dennis
David Lawrence
Reported: 2003-02-28 06:25 EST by Ernst-Heinrich Klaas
Modified: 2007-11-30 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2003-03-19 12:01:07 EST

Attachments:
Bug85341Info.tgz (201.14 KB, text/plain), 2003-02-28 09:09 EST, Ernst-Heinrich Klaas
GrepXFree86.txt (100.59 KB, text/plain), 2003-02-28 09:50 EST, Ernst-Heinrich Klaas
lsmod.txt (582 bytes, text/plain), 2003-02-28 09:51 EST, Ernst-Heinrich Klaas
lspci.txt (1.17 KB, text/plain), 2003-02-28 09:51 EST, Ernst-Heinrich Klaas
messages (573.59 KB, text/plain), 2003-02-28 09:52 EST, Ernst-Heinrich Klaas
XF86Config-4 (2.14 KB, text/plain), 2003-02-28 09:53 EST, Ernst-Heinrich Klaas
XFree86.0.log (24.99 KB, text/plain), 2003-02-28 09:53 EST, Ernst-Heinrich Klaas
messages.txt (573.59 KB, text/plain), 2003-02-28 10:05 EST, Ernst-Heinrich Klaas
XF86Config-4.txt (2.14 KB, text/plain), 2003-02-28 10:07 EST, Ernst-Heinrich Klaas
XFree86.0.log.txt (24.99 KB, text/plain), 2003-02-28 10:08 EST, Ernst-Heinrich Klaas
GrepXFree86.txt (656 bytes, text/plain), 2003-02-28 10:18 EST, Ernst-Heinrich Klaas

Description Ernst-Heinrich Klaas 2003-02-28 06:25:22 EST
Description of problem:
We installed Red Hat AS 2.1 on a Fujitsu Siemens Computers PRIMERGY T850
with 8 CPUs and 16 GB RAM. The PRIMERGY T850 is the same hardware as the
IBM xSeries 440.
After installing RH AS 2.1 out of the box, we installed kernel 2.4.9-e.12:
rpm -Uvh kernel-doc-2.4.9-e.12.i386.rpm
rpm -Uvh kernel-headers-2.4.9-e.12.i386.rpm
rpm -Uvh kernel-source-2.4.9-e.12.i386.rpm
rpm -ivh kernel-summit-2.4.9-e.12.i686.rpm

Additionally, we added the boot option "notsc".

The system restarts approximately 58 minutes after starting the
rhr-ddX test. This test is part of the "Red Hat Ready Certification Testing
Suite"
(rhr-core-1.8-1.noarch.rpm, rhr-auto-1.8-1.noarch.rpm,
rhr-interactive-1.8-1.noarch.rpm).

Version-Release number of selected component (if applicable):
XFree86 4.1.0
savage_drv.o  1.1.20t


How reproducible:
always

Steps to Reproduce:
1. Install all "rhr-xxx-1.8-1.noarch.rpm" RPMs
2. Start "rhr-ddX"
    
Actual results:
System restart

Expected results:
No restart

Additional info:
After the restart, we found the following information in the
"event log" and in "/var/log/messages":

Event Log:

X SERVPROC 02/27/03 17:10:45 Resetting system due to an unrecoverable error
X SERVPROC 02/27/03 17:10:45 Upper CEC Machine Check, TW_MCKC: [20][00][00][00][00][00][00][00]
X SERVPROC 02/27/03 17:10:45 Lower CEC Machine Check, TW_MCKC: [20][00][00][00][00][00][00][00]
X SERVPROC 02/27/03 17:10:42 Upper CEC Machine Check, CYC_MCK2: [00][01][00][00][00][08][00][00]
X SERVPROC 02/27/03 17:10:33 Lower CEC Machine Check, CYC_MCK2: [00][00][80][00][00][08][00][00]
(Severities: X=Error)
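
The event log above lists its entries newest first. As a rough aid for reading it, the following Python sketch parses the entries and prints them in chronological order. The date format (MM/DD/YY) and the field layout (severity, source, timestamp, message) are assumptions inferred from the five lines shown here, not from any documented service processor log format, and the helper name parse_event_log is hypothetical.

```python
import re
from datetime import datetime

# Event log lines copied from this report, with wrapped lines rejoined.
EVENT_LOG = """\
X SERVPROC 02/27/03 17:10:45 Resetting system due to an unrecoverable error
X SERVPROC 02/27/03 17:10:45 Upper CEC Machine Check, TW_MCKC: [20][00][00][00][00][00][00][00]
X SERVPROC 02/27/03 17:10:45 Lower CEC Machine Check, TW_MCKC: [20][00][00][00][00][00][00][00]
X SERVPROC 02/27/03 17:10:42 Upper CEC Machine Check, CYC_MCK2: [00][01][00][00][00][08][00][00]
X SERVPROC 02/27/03 17:10:33 Lower CEC Machine Check, CYC_MCK2: [00][00][80][00][00][08][00][00]
"""

# Assumed layout: <severity> <source> <MM/DD/YY HH:MM:SS> <message>
LINE_RE = re.compile(r"^(\S) (\S+) (\d\d/\d\d/\d\d \d\d:\d\d:\d\d) (.*)$")


def parse_event_log(text):
    """Return (timestamp, severity, source, message) tuples, oldest first."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        severity, source, stamp, message = match.groups()
        when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
        entries.append((when, severity, source, message))
    # The log is newest first, so reverse it before the stable sort; entries
    # that share a timestamp then come out in reverse of their listed order.
    return sorted(reversed(entries), key=lambda entry: entry[0])


entries = parse_event_log(EVENT_LOG)
for when, severity, source, message in entries:
    print(when.isoformat(sep=" "), severity, source, message)
```

Read this way, the two CYC_MCK2 machine checks (17:10:33 and 17:10:42) precede the TW_MCKC checks and the reset at 17:10:45.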


/var/log/messages:

Feb 27 15:22:23 pdb0362c rhr-Config:  succeeded
Feb 27 15:22:23 pdb0362c rhr-Zresults:  succeeded
Feb 27 15:22:23 pdb0362c rhr-Config:  succeeded
Feb 27 15:22:30 pdb0362c rhr-INFO:  succeeded
Feb 27 15:22:30 pdb0362c last message repeated 4 times
Feb 27 15:22:31 pdb0362c kernel: hda: ATAPI 24X DVD-ROM drive, 512kB Cache, UDMA(33)
Feb 27 15:22:31 pdb0362c kernel: Uniform CD-ROM driver Revision: 3.12
Feb 27 15:31:29 pdb0362c kernel: loop: loaded (max 8 devices)
Feb 27 15:39:12 pdb0362c login(pam_unix)[1211]: session opened for user root by LOGIN(uid=0)
Feb 27 15:39:12 pdb0362c  -- root[1211]: ROOT LOGIN ON tty2
Feb 27 15:39:12 pdb0362c  -- root[1211]: ROOT LOGIN ON tty2
Feb 27 16:14:31 pdb0362c rhr-cdromdata:  succeeded
Feb 27 16:14:35 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 16:33:10 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 16:52:53 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 17:11:08 pdb0362c syslogd 1.4.1: restart.
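
To see how long the system was silent before the restart, the gaps between consecutive log entries can be computed. The sketch below is only an illustration: it uses the last five lines of the excerpt above, and because syslog timestamps carry no year, the year 2003 is an assumption taken from the date of this report. The names parse_stamp and gaps are hypothetical.

```python
from datetime import datetime

# Last five lines of the /var/log/messages excerpt in this report.
MESSAGES = """\
Feb 27 16:14:31 pdb0362c rhr-cdromdata:  succeeded
Feb 27 16:14:35 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 16:33:10 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 16:52:53 pdb0362c modprobe: modprobe: Can't locate module char-major-81
Feb 27 17:11:08 pdb0362c syslogd 1.4.1: restart.
"""

YEAR = 2003  # assumption: syslog lines have no year, taken from the report date


def parse_stamp(line):
    """Parse the 15-character syslog timestamp at the start of a line."""
    return datetime.strptime(f"{YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")


stamps = [parse_stamp(line) for line in MESSAGES.splitlines()]

# Time between each entry and the one before it.
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]

for line, gap in zip(MESSAGES.splitlines()[1:], gaps):
    print(f"+{gap}  {line}")

print("total span:", stamps[-1] - stamps[0])
```

The excerpt does not record when rhr-ddX itself was started, so these gaps only bracket the restart; they do not by themselves confirm the 58-minute figure.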

The Red Hat HCL entry for the identical IBM xSeries 440 includes no hints about
"non-native" drivers (especially for the graphics driver "savage_drv.o").
Comment 1 Mike A. Harris 2003-02-28 06:34:46 EST
Can you please attach the /var/log/messages file from when this occurs,
as well as the complete output of:

lspci
lsmod

Also attach the X server log and config file please.
Comment 2 Mike A. Harris 2003-02-28 06:36:39 EST
Also, what exact RPM release of XFree86 are you using?

rpm -qa | grep XFree86

Thanks in advance.
Comment 3 Ernst-Heinrich Klaas 2003-02-28 09:09:53 EST
Created attachment 90411
Bug85341Info.tgz

The attachment contains the following files:
GrepXFree86.txt, lsmod.txt, lspci.txt, messages,
XF86Config-4, XFree86.0.log
Comment 4 Mike A. Harris 2003-02-28 09:41:01 EST
Please make all file attachments separate individual text/plain file
attachments so that they can be easily viewed by clicking on them in the
web browser.

It is a lot of extra effort to download a tarball, decompress it, then
try to keep track of the various files throughout the life of a bug.  They
get lost, and require redownloading each time to view them.  Multiplied
across several bug reports, this quickly becomes unmanageable.
Comment 5 Ernst-Heinrich Klaas 2003-02-28 09:50:43 EST
Created attachment 90413
GrepXFree86.txt
Comment 6 Ernst-Heinrich Klaas 2003-02-28 09:51:11 EST
Created attachment 90414
lsmod.txt
Comment 7 Ernst-Heinrich Klaas 2003-02-28 09:51:59 EST
Created attachment 90415
lspci.txt
Comment 8 Ernst-Heinrich Klaas 2003-02-28 09:52:40 EST
Created attachment 90416
messages
Comment 9 Ernst-Heinrich Klaas 2003-02-28 09:53:26 EST
Created attachment 90417
XF86Config-4
Comment 10 Ernst-Heinrich Klaas 2003-02-28 09:53:56 EST
Created attachment 90418
XFree86.0.log
Comment 11 Ernst-Heinrich Klaas 2003-02-28 10:05:47 EST
Created attachment 90419
messages.txt
Comment 12 Ernst-Heinrich Klaas 2003-02-28 10:07:48 EST
Created attachment 90420
XF86Config-4.txt
Comment 13 Ernst-Heinrich Klaas 2003-02-28 10:08:33 EST
Created attachment 90421
XFree86.0.log.txt
Comment 14 Ernst-Heinrich Klaas 2003-02-28 10:18:15 EST
Created attachment 90422
GrepXFree86.txt

Some attachments were missing the ".txt" suffix.
This is corrected now.
(Sorry for the inconvenience.)
Comment 15 Ernst-Heinrich Klaas 2003-03-05 09:36:24 EST
The same problem also occurred with XFree86 4.1.0-29.
Comment 16 John Dennis 2003-03-05 11:09:22 EST
It appears as though there was a machine check. I'm not familiar with this
system. Are you able to get a machine check dump from the BIOS?

It would be good to isolate the problem. Can you replace the graphics card with
something other than an S3, such as an ATI or Nvidia card and rerun the test? If
it does not fail with another card/driver then we know this is driver specific
and helps us to focus.

The current AS2.1 release of XFree86 is 4.1.0-44. There is no reason for me to
believe the changes between the release 25 you are running and the current 44
release affect this issue, but on the other hand that's quite a few revisions
apart, so it's probably worthwhile using the latest released package.
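
As a small illustration of checking whether an installed package is behind the current release, the sketch below compares two version-release strings numerically. It is a simplification that only handles plain dotted numbers such as "4.1.0-25"; it is not RPM's own version comparison algorithm, and the helper name release_key is hypothetical.

```python
import re


def release_key(version_release):
    """Split e.g. '4.1.0-44' into ((4, 1, 0), (44,)) for numeric comparison."""
    version, _, release = version_release.partition("-")

    def to_ints(text):
        # Keep only the runs of digits; anything else is ignored here.
        return tuple(int(part) for part in re.findall(r"\d+", text))

    return to_ints(version), to_ints(release)


installed = "4.1.0-25"  # release mentioned in this comment as the one in use
current = "4.1.0-44"    # current AS2.1 release mentioned in this comment

is_outdated = release_key(installed) < release_key(current)
print(f"{installed} older than {current}: {is_outdated}")
```

Comparing tuples of integers avoids the pitfall of comparing the strings directly, where "4.1.0-9" would sort after "4.1.0-44".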
Comment 17 Ernst-Heinrich Klaas 2003-03-06 09:35:30 EST
We have done a few tests now with XFree86 4.1.0-44 and it looks pretty good.
But we have to do some more tests before we can close this bugzilla report.
Comment 18 Ernst-Heinrich Klaas 2003-03-19 07:03:01 EST
The system works fine after using XFree86 4.1.0-44 (download:
ftp://updates.redhat.com/...) and with all screensavers disabled
(see also: "Servers - LINUX hangs with XScreenSaver v3.33" under
http://www-1.ibm.com/support/).
Comment 19 John Dennis 2003-03-19 11:42:15 EST
Mike - why did you reopen this?
Comment 20 Mike A. Harris 2003-03-19 12:01:07 EST
I've re-opened this as it was incorrectly closed as resolved in RAWHIDE.

XFree86 4.3.0 is what is in rawhide (internally anyway), and we don't really
know what the problem was specifically.  It definitely hasn't been tested
with rawhide though, so that resolution is invalid.

The problem is considered resolved above with 4.1.0-44, so it should be
resolved as ERRATA.  I'm wondering though if the problem still does occur
with screensavers enabled.  Disabling the screensavers would just be a
workaround in this case.

The future erratum may contain a new Savage driver update which is known
to solve many driver related issues.
Comment 21 Nils Philippsen 2003-03-24 11:22:05 EST
On my machine (not Advanced Server though, different graphics card) I sometimes
(with older XFree86 versions) had 3D stability issues, so this might be a
possible explanation for why disabling screensavers worked around the problem.
Just a thought.
