Bug 101547

Summary: xorg-x11 x-server crashes while running "x11perf"
Product: [Fedora] Fedora
Reporter: Peter van Egdom <p.van.egdom>
Component: xorg-x11
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED WONTFIX
QA Contact: David Lawrence <dkl>
Severity: high
Docs Contact:
Priority: medium
Version: 3
CC: davidf, gajownik, michel
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-04-11 01:59:14 UTC
Bug Blocks: 136451, 143641
Attachments:
  Description                                                     Flags
  XF86Config file                                                 none
  XFree86.0.log file                                              none
  /var/log/message file                                           none
  XFree86.0.log from xorg-x11-0.6.6-0.2004_03_30.1 on my system   none

Description Peter van Egdom 2003-08-03 15:15:47 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

Description of problem:

While I was (stress-)testing the X performance of my machine (standard Red Hat
Linux release 9.0.93 (Severn)) with the command:

   "x11perf -v1.3 -rop GXcopy GXxor -repeat 2 -time 1 -all"

the message 'Out of Memory: Killed process 2656 (kdeinit).' appeared in the
dmesg output after this command had completed. When this happened (I was in
KDE) my KDE icons disappeared.

I then logged out of KDE, and while X was attempting to start GDM I heard a
couple of 'clicks' coming out of my monitor, followed by an error that the
system could not start X.

This machine has an i815 chipset and uses the i810 module (vendor="The XFree86
Project").

I'll attach the following files to this bugzilla:

 /var/log/XFree86.0.log
 /etc/X11/XF86Config
 /var/log/messages

Being a bit curious, I also tried it on another machine that also runs a
standard Red Hat Linux release 9.0.93 (Severn). This machine has a GeForce4 Ti
4600 and uses the module "nv" (vendor="The XFree86 Project").

_Exactly_ the same problem occurred on that machine.
(Aug  3 14:02:54 powermate kernel: Out of Memory: Killed process 3245 (kdeinit))


Version-Release number of selected component (if applicable):
XFree86-4.3.0-17

How reproducible:
Always

Steps to Reproduce:
1. Start a machine with a fresh Red Hat Linux release 9.0.93 (Severn).
2. Start KDE.
3. Enter the command "x11perf -v1.3 -rop GXcopy GXxor -repeat 2 -time 1 -all"
   and wait an hour or so.
4. Log out of KDE.


Actual Results:
1.
2.
3. - Out of Memory: Killed process 2656 (kdeinit) in syslog.
   - KDE loses its icons.
4. - Severn tries to (re)start GDM, but fails.

(reproducible on 2 different machines)

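Expected Results:
x11perf runs to completion without any process being killed by the kernel,
KDE keeps its icons, and GDM starts again after logging out.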

Additional info:
It looks like a memory leak somewhere.

Comment 1 Peter van Egdom 2003-08-03 15:18:57 UTC
Created attachment 93363 [details]
XF86Config file

Comment 2 Peter van Egdom 2003-08-03 15:19:30 UTC
Created attachment 93364 [details]
XFree86.0.log file

Comment 3 Peter van Egdom 2003-08-03 15:20:01 UTC
Created attachment 93365 [details]
/var/log/message file

Comment 4 Mike A. Harris 2003-08-03 21:43:25 UTC
Start X without KDE using just a single xterm and perform the same test please.
You can do this by creating a ~/.xinitrc file containing one line "xterm" then
running "startx".  Execute the x11perf command line above, and let the test run.

Please report back if the same behaviour is observed during this test.

Comment 5 Peter van Egdom 2003-08-07 21:04:09 UTC
This time I tried it without KDE (by creating a .xinitrc file with only the
"xterm" command and, from runlevel 3, using "startx"). Under these conditions
I'm unable to reproduce the described symptoms.

I also tried reproducing this bugzilla on a PC at work. In that environment
(KDE) the described problem also popped up - the same symptoms as described for
my home machine.

Running "x11perf -v1.3 -rop GXcopy GXxor -repeat 2 -time 1 -all" under GNOME
produces a similar message:

  'powermate kernel: Out of Memory: Killed process 3191 (xscreensaver)'

Now I really don't know to which component this bug belongs.

Comment 6 Peter van Egdom 2003-09-28 15:23:20 UTC
Strange, I cannot reproduce this problem anymore with (a clean install of)
Fedora Core Test 2 - release 0.94.

Two of my machines on which I could 100% reproduce this bug (?) with the
previous public beta (Red Hat Linux release 9.0.93) seem to handle the command
"x11perf -v1.3 -rop GXcopy GXxor -repeat 2 -time 1 -all" much better.

I will continue to investigate this bug throughout the Fedora Core test
releases, but for now, I'll close this bug with resolution "RAWHIDE".

If I manage to reproduce this bug at a future date I'll reopen this bugzilla.

Comment 7 Peter van Egdom 2004-04-07 15:22:39 UTC
Reopening bug.

Using the command "x11perf -v1.3 -rop GXcopy GXxor -repeat 2 -time 1
-all" on a fresh Fedora Core release 1.91 (FC2) re-introduces this bug
on one of my machines (I'll attach the XFree86.0.log).

Using the command above with the standard
"xorg-x11-0.0.6.6-0.0.2004_03_11.9" packages crashes the X server
within 1 minute.

I can still connect to my Fedora Core release 1.91 (FC2) machine from
my iBook and reboot it. Remotely switching from the current runlevel
to runlevel 1, and then back to runlevel 5, does not help (the screen
contents from the moment of the crash even stay on screen although no
X server is running).

Removing all old packages of xorg-x11
(xorg-x11-*-0.0.6.6-0.0.2004_03_11.9) and updating to the newer
"xorg-x11-*-0.6.6-0.2004_03_30.1" packages does something strange
though; instead of crashing within 1 minute it now takes 3 minutes to
crash with the x11perf command.

To me, this looks like a memory leak that has reappeared in the
xorg-x11 package. I could not reproduce this problem at the time on
the same system running Fedora Core 1 (with the XFree86 packages).

I tried the "xrestop" program to see if I could see anything special,
but, alas, that doesn't give any useful information.

I can't find anything strange in the syslog or in the output of the
command "dmesg" (I watched the output of dmesg in a remote ssh
session to the machine up to the moment when the X server crashed).

Changing :
 - Product from "Red Hat Linux Beta" to "Fedora Core".
 - Component from "XFree86" to "xorg-x11".

Comment 8 Peter van Egdom 2004-04-07 15:28:45 UTC
Created attachment 99189 [details]
XFree86.0.log from xorg-x11-0.6.6-0.2004_03_30.1 on my system

Comment 9 Peter van Egdom 2004-06-03 19:38:53 UTC
Described problem still occurs with package "xorg-x11-6.7.0-2".

This has been tested with a fresh installation of Fedora Core release
2 (Tettnang) in both KDE and the 'failsafe' option from GDM (e.g. no
running window manager / desktop environment).

Reassigning from version "test2" to "2".

Comment 10 Need Real Name 2004-07-20 14:13:29 UTC
I made an interesting discovery that could be a clue to solving this
problem, which I also have. For testing purposes I just installed the
nvidia driver, and now it seems to work absolutely fine.
I ran "x11perf ..." for about four to five minutes and everything went well.
I have a GeForce 2 GTS and an Elitegroup K7S5A with the SiS 735
chipset. Maybe this helps you.
Best regards, Christophorus

Comment 11 Need Real Name 2004-07-21 08:49:35 UTC
Sorry, that was not the fully satisfying solution I was looking for.
It just took a bit longer to crash: about two hours of normal work.

Comment 13 Peter van Egdom 2004-11-14 08:33:06 UTC
Described problem still occurs with package "xorg-x11-6.8.1-12" on a
clean installed version of Fedora Core release 3.

Reassigning from version "fc2" to "fc3".

Comment 16 Mike A. Harris 2005-02-02 20:34:57 UTC
If you log in remotely to this machine via ssh, and run "top",
sorting processes by memory usage (press M), and increasing the
rate of display (press s, then 1, then enter), and then run the
x11perf command, you should be able to see which process is
increasing in size.  It might be a memory leak in the X server,
but it might be a memory leak in something else instead.

Once anything leaks enough memory, it will eventually start
consuming swap space until all memory and swap are full.  At this
point the kernel's OOM (out of memory) killer will start killing
processes.  The kernel may end up killing several processes before
it gets the right one, as the OOM logic has to make some
assumptions that don't always turn out to be true.

If you can visually determine with top which process(es) is/are
increasing in size dramatically, this will provide the best idea
as to where the problem lies.  If it's just a single application
increasing in size, then that application probably has a memory
leak.  If the X server increases in size, it could be a memory
leak in the X server, or it could be an X resource leak in an
X client.

The odd part about all of this is that you indicated above that
it only occurs in KDE, but not with a minimal X setup with just
xterm.  Very strange.  ;/

Please add the results of your findings.

Thanks in advance.


Setting status to "NEEDINFO", awaiting results of testing problem
while analyzing with 'top'.
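
The 'top' procedure above can also be approximated with a small script. The
following is a minimal sketch, assuming a Linux /proc filesystem; the function
names snapshot_rss, growth and report and the 10-second interval are
illustrative choices, not anything taken from this report or from top itself.
It compares resident memory (VmRSS) only, so growth that has already been
pushed out to swap will not show up.

```python
import os
import time


def snapshot_rss(proc_root="/proc"):
    """Return {pid: (name, rss_kib)} for every process that reports VmRSS."""
    snapshot = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        path = os.path.join(proc_root, entry, "status")
        try:
            with open(path, errors="replace") as status:
                fields = dict(line.split(":", 1) for line in status if ":" in line)
        except OSError:
            continue  # the process exited while we were scanning
        if "VmRSS" in fields and "Name" in fields:  # kernel threads have no VmRSS
            rss_kib = int(fields["VmRSS"].split()[0])
            snapshot[int(entry)] = (fields["Name"].strip(), rss_kib)
    return snapshot


def growth(before, after):
    """List (delta_kib, pid, name) for processes present in both snapshots,
    largest growth first."""
    deltas = [
        (after[pid][1] - before[pid][1], pid, after[pid][0])
        for pid in after
        if pid in before
    ]
    return sorted(deltas, reverse=True)


def report(interval_s=10, top_n=5):
    """Take two snapshots interval_s seconds apart and print the top growers."""
    first = snapshot_rss()
    time.sleep(interval_s)  # assumed interval; start x11perf before calling
    for delta_kib, pid, name in growth(first, snapshot_rss())[:top_n]:
        print(f"{delta_kib:+9d} KiB  pid {pid:<7d} {name}")
```

Calling report() repeatedly from an ssh session while x11perf runs should show
whether the X server or one of its clients is the process that keeps growing.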

Comment 17 Michel Dänzer 2005-02-14 20:38:46 UTC
Can you reproduce this by running just x11perf -umove? I've seen that cause
xscreensaver's memory consumption to go through the roof. My attempts to track
down the leak with valgrind have been inconclusive, but _XReply/_XEnq in libX11
keep popping up, so I suspect that xscreensaver (and a corresponding component
of KDE that's also interested in window manipulations?) simply can't keep up
with the enormous number of events caused by x11perf, so its event queue keeps
growing. x11perf was really designed to run on a 'naked' server, I think...
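
The suspicion above (a client that handles events more slowly than they arrive)
can be illustrated with a toy model. This is only a sketch of the hypothesis:
the function name simulate_backlog and all rates are made up for illustration
and are not measurements from x11perf, xscreensaver or libX11.

```python
def simulate_backlog(arrivals_per_tick, handled_per_tick, ticks):
    """Return the queue length after each tick for fixed arrival and
    handling rates (a toy model, not a model of libX11 internals)."""
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog += arrivals_per_tick               # events queued for the client
        backlog -= min(backlog, handled_per_tick)  # events the client processes
        history.append(backlog)
    return history


# Hypothetical rates: the producer generates 1000 events per tick,
# the client only keeps up with 100 of them.
print(simulate_backlog(1000, 100, 5))
# A client that is faster than the producer never accumulates a backlog.
print(simulate_backlog(50, 100, 5))
```

In this model the backlog grows linearly for as long as the producer outpaces
the consumer, which from the outside would look the same as a steady memory
leak in the consuming process.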

Comment 18 Mike A. Harris 2005-04-11 01:59:14 UTC
This issue does not seem to be something that happens under normal server
operation, and seems to occur only when using x11perf with a full-blown
X11 desktop running, which, as indicated in the last comment, is not how
x11perf was originally designed to run.

This issue seems to be along the lines of a UNIX shell fork bomb: simply
by saturating the machine, you can cause a problem.  Just as the solution
to end users executing fork bombs in their shell is "do not do that", I
think the same solution applies here too, unless someone can provide
evidence of a real-world problem that can be caused by a normal application.

I would recommend filing a bug report in the upstream X.Org X11 bugzilla
for this, to see what upstream developers think about the issue.

Closing this as WONTFIX for now, however, as there don't seem to be
any real-world problems caused by this.