From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225
Description of problem:
Installed RHL9 and ran it for a couple of weeks with no problems. Upgraded the
kernel to 2.4.20-9 and ran that for a couple of weeks, also with no problems.
Upgraded the kernel last Thursday to 2.4.20-13.9 and the system seemed to hang
every so often. Telnetting into the system, I could see X running at 100% CPU.
Booted the previous kernel (2.4.20-9) and the same thing now happens on that
too. I use KDE (upgraded via up2date). Example:
00:32:17 up 1 day, 7:21, 0 users, load average: 1.41, 1.21, 0.93
66 processes: 64 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: 34.4% user 0.4% system 0.0% nice 0.0% iowait 65.0% idle
Mem: 513596k av, 489628k used, 23968k free, 0k shrd, 107512k buff
279032k actv, 0k in_d, 3212k in_c
Swap: 1044208k av, 2532k used, 1041676k free 199732k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
22486 root 25 0 267M 10M 4052 R 99.5 2.1 22:01 0 X
Linux jhorne 2.4.20-9custom #4 Fri May 2 14:17:52 BST 2003 i686 i686 i386 GNU/Linux
'custom' is a rebuilt kernel with the NTFS module and the correct (Pentium 4)
CPU type set. The problem also exists with the unmodified versions of both kernels.
No errors seen in dmesg, messages or XFree86.0.log.
Graphics card is nvidia (ugh!) - problem exists with both Red Hat's 'nv' and
nvidia's own 'nvidia' drivers. Disabled dri and glx - no change. Mouse is PS/2;
standard 102-key keyboard.
This is my work PC and I have already lost one day (Friday) just trying to get
the thing stable. Runs okay in command-line (init 3) mode, but no good as a
PC is an RM accelerator 2GHz P4 Xeon.
Version-Release number of selected component (if applicable):
2.4.20-13.9 and now 2.4.20-9
Steps to Reproduce:
1. Just reboot :-(
Actual Results: X runs at 100% cpu utilisation.
Expected Results: X should run at low cpu utilisation.
Created attachment 91763
strace of X process
PC was running at 100% CPU. Note that the 'SIZE' column in 'top' is up to about
270MB of memory as well! This is usually a low value - even for X. I ran
strace on the X process to see why it was CPU-bound; the attachment shows a
tight loop of some sort involving the ALRM signal.
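For anyone wanting to check their own trace for the same pattern, here is a rough sketch that tallies signal deliveries and system calls in strace output. It is only an illustration: the function name, the regular expressions and the sample lines are my own assumptions about what typical strace output looks like, not the contents of the attachment.

```python
import re
from collections import Counter

# Optional leading PID and timestamp columns, then either a signal
# delivery line ("--- SIGALRM ... ---") or a syscall ("name(").
# These patterns are assumptions about common strace output formats.
PREFIX = r"^(?:\d+\s+)?(?:[\d:.]+\s+)?"
SIGNAL_RE = re.compile(PREFIX + r"---\s+(SIG[A-Z0-9]+)\b")
SYSCALL_RE = re.compile(PREFIX + r"([a-z_][a-z0-9_]*)\(")


def summarize_strace(lines):
    """Count signal deliveries and syscalls in an iterable of strace lines."""
    signals = Counter()
    syscalls = Counter()
    for line in lines:
        line = line.strip()
        match = SIGNAL_RE.match(line)
        if match:
            signals[match.group(1)] += 1
            continue
        match = SYSCALL_RE.match(line)
        if match:
            syscalls[match.group(1)] += 1
    return signals, syscalls


# Hypothetical sample lines, made up for illustration only.
sample = [
    "--- SIGALRM (Alarm clock) @ 0 (0) ---",
    "sigreturn()                             = ? (mask now [])",
    "--- SIGALRM (Alarm clock) @ 0 (0) ---",
    "sigreturn()                             = ? (mask now [])",
    "select(256, [1 3], NULL, NULL, {0, 0})  = 0 (Timeout)",
]

signals, syscalls = summarize_strace(sample)
print("signals:", signals.most_common())
print("syscalls:", syscalls.most_common(5))
```

To use it on a real trace, capture one with something like 'strace -p <pid of X> -o x.trace' and pass the open file to summarize_strace (x.trace is just a placeholder name). If SIGALRM and sigreturn dominate the counts, that matches the tight loop described above.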
From the debian mailing list (via google) I found a reference to this:
I don't understand all the techie stuff but I may see if I can rebuild xfree
without the 'SMART_SCHEDULE' to see if that gets around the problem for me.
Note - other messages from others who have had this problem seemed to indicate
that it is not a kernel problem nor an nvidia driver problem but an xfree server
problem which gets triggered by something in the nvidia drivers.
The 'XFree86' X server seems to have an undocumented option, '-dumbSched'
(found in the src rpm, in /usr/src/redhat/SOURCES/xfree*/xc/programs/Xserver/utils.c -
check the path though!). This seems to disable the SMART_SCHEDULE stuff.
Tried using this but with no luck. By default it seems gdm is used and I can't
see how/where it starts the X server. Tried switching to xdm (by putting
DISPLAYMANAGER in /etc/sysconfig/desktop) but X startup fails - error log says:
May 1 14:56:43 jhorne gdm: Failed to start X server several times in a
short time period; disabling display :0
Sigh. I'm now using the on-board i810 graphics port, and have been allowed to
put in an order for an ATI card (I've used these at home with no problems at all).
Moved to 'XFree86' component (from 'kernel') since this is an XFree issue and
not directly the kernel. I hope this is okay.
I note from other bug reports that since I am (was) using an nvidia card, the
problem is complicated by Red Hat being unable to support the nvidia drivers.
I can't see how a kernel upgrade alone would cause *only* the "nv" and "nvidia"
drivers to make X go to 100% CPU usage.
If your system has had the nvidia binary modules loaded at all since boot,
it is unsupported. If you can reboot your system and never load the nvidia
kernel module at all, and reproduce this using only the "nv" driver as shipped
with Red Hat Linux, then please report this to http://bugs.xfree86.org and the
Nvidia "nv" driver maintainer (who works at Nvidia) will likely investigate.
Red Hat has no knowledge of the operation of Nvidia hardware, and no access
to the documentation of that hardware.
*** This bug has been marked as a duplicate of 73733 ***
Nvidia's installer now lets you recompile their module, so the REDHAT practice
of calling a 9.0 bug the same thing as the 7.3 bug is MOOT.
Please try to address the bug and/or contact Nvidia. My rh9 system with the
same driver and card, same kernel version, just a single cpu and not smp is
FINE at home.....so I think that the issue is with RedHat software not Nvidia.
Everything is current.
For the person who tried to help and suggested I needed kdeartwork, I know I
need that. I appreciate the help.
rpm -qui kdeartwork gets a full listing of the kdeartwork build's stats, etc.
It is installed.
I just wish that RedHat would put some effort into this and not just blame
Mandrake doesn't have this problem...it makes me wonder if the hacks RedHat
has added to the kernel are "good."
Why would we contact Nvidia to fix a bug YOU encounter with THEIR binary-only
kernel module?
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.