Bug 600465 - Strange performance degradation modality (some relation to udev suspected)
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-intel
Version: 12
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Adam Jackson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-06-04 18:57 UTC by Bob Glickstein
Modified: 2018-04-11 12:05 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-03 14:02:53 UTC
Type: ---
Embargoed:


Attachments
X server log (184.09 KB, text/plain)
2010-06-04 18:57 UTC, Bob Glickstein
X server log from while the bug is happening (920.61 KB, text/plain)
2010-06-12 13:33 UTC, Bob Glickstein
dmesg output from when the bug is happening (with kernel param drm.debug=0x04) (124.01 KB, text/plain)
2010-06-12 13:34 UTC, Bob Glickstein
/var/log/messages from reboot to when the bug happens (71.42 KB, text/plain)
2010-06-12 13:35 UTC, Bob Glickstein
Very short excerpt of very repetitive "udevadm monitor" output (8.00 KB, text/plain)
2010-06-12 13:36 UTC, Bob Glickstein


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 532011 0 low CLOSED udevd eats 60% CPU cycles 2021-02-22 00:41:40 UTC

Description Bob Glickstein 2010-06-04 18:57:45 UTC
Created attachment 421329
X server log

Description of problem:

Short version: if CPU usage spikes, it never comes back down

Long version: I can get my machine into a state where CPU usage is high, sometimes by watching a flash video in my browser (perhaps in conjunction with some other activity), sometimes by other means.  Naturally, at such times the load average climbs and the X server becomes less responsive.  Normally the responsiveness should return to normal when the CPU hogs complete (or are killed), but this now reliably fails to happen.  At such times, "top" shows Xorg using > 70%, a couple of udevd's using between 30% and 60% each, and for some reason apcupsd is always near the top of the list too.

Here's the weird part: I can ctrl-alt-f2 into a (textual) virtual terminal and my machine is just as snappy as always.  The CPU usage and load average drop to near zero.  But if I alt-f1 back to the X session, I'm immediately in performance hell again.

I've tried killing off offending processes, but sooner or later the performance problem returns, usually without any obvious trigger (unlike when it first appears).  The only solution is a restart, and on one occasion only a cold restart worked!

Version-Release number of selected component (if applicable):

xorg-x11-server-Xorg-1.7.6-4.fc12.x86_64
xorg-x11-drv-intel-2.9.1-1.fc12.x86_64

How reproducible:

Approaching 100%

Comment 1 Bob Glickstein 2010-06-06 15:33:01 UTC
See this thread about udevd sucking up CPU cycles under Ubuntu: http://ubuntuforums.org/showthread.php?t=1361018

At the moment my desktop is totally unusable because of this bug, despite several warm and cold restarts.  Its recurrence has gotten worse since I described it above.

Not sure whether the X server really is the culprit, but it's still the case that switching to a textual virtual terminal makes the system immediately responsive again.

Comment 2 Bob Glickstein 2010-06-07 04:39:34 UTC
Very likely to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=528312

Comment 3 Matěj Cepl 2010-06-11 10:31:22 UTC
Well, bug 528312 is temporary ... could you wait a minute or two? Does the system settle down after some time?

Comment 4 Bob Glickstein 2010-06-11 13:30:28 UTC
The system definitely doesn't settle down after waiting even as many as six or ten hours or so, which is about the longest I've observed the symptoms happening continuously.  However, by killing off this or that process -- never the same ones, seemingly -- it's possible to get ordinary performance back for a short time, but after a few minutes the symptoms spontaneously return.

I've confirmed, via "udevadm monitor" that a "udev flood" relating to the Intel DRM system is happening at these times of performance degradation.
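One way to put a number on such a flood is to pipe "udevadm monitor" output through a small summarizer. The following is only a sketch: it assumes the usual monitor line layout (KERNEL or UDEV, a bracketed timestamp, the action, the device path, and the subsystem in parentheses), and the function name summarize is made up for illustration. It is not taken from the attachments on this bug.

```python
import re
import sys
from collections import Counter

# Assumed line layout of "udevadm monitor" output, for example:
#   KERNEL[1276349766.123456] change   /devices/.../drm/card0 (drm)
#   UDEV  [1276349766.125000] change   /devices/.../drm/card0 (drm)
EVENT_RE = re.compile(
    r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)\s+\((\S+)\)\s*$"
)


def summarize(lines):
    """Count events per (source, action, subsystem) and estimate the rate.

    Returns (counts, events_per_second). The rate is the number of parsed
    events divided by the time between the first and last timestamp, so it
    is only a rough figure.
    """
    counts = Counter()
    stamps = []
    for line in lines:
        match = EVENT_RE.match(line)
        if not match:
            continue  # skip the banner text and blank lines
        source, stamp, action, _devpath, subsystem = match.groups()
        counts[(source, action, subsystem)] += 1
        stamps.append(float(stamp))
    span = max(stamps) - min(stamps) if len(stamps) > 1 else 0.0
    rate = len(stamps) / span if span > 0 else 0.0
    return counts, rate


if __name__ == "__main__":
    # Usage (hypothetical file name): python udev_summary.py < monitor.txt
    counts, rate = summarize(sys.stdin)
    for key, n in counts.most_common():
        print(n, *key)
    print("approx. events/second:", round(rate, 1))
```

A large count for a single (source, action, subsystem) combination, such as change events on drm, together with a high rate would match the flood described here.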

It's worth noting that this problem was recurring a few times each day when I first reported it, but in the past few days it has happened less than once per day.  The only difference I can think of between then and now has been the ambient temperature -- it was very hot, but it cooled off.  This weekend is supposed to be very hot again, so we'll see if the problem worsens...

Comment 5 Bob Glickstein 2010-06-11 13:37:33 UTC
BTW, if you read through all the comments of bug 528312 you'll see that though the symptoms are brief for some, they're persistent for others (like me), so "temporary" isn't really a good description of the problem.

Comment 6 Matěj Cepl 2010-06-11 13:44:43 UTC
Thanks for the bug report.  We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue.

Please add drm.debug=0x04 to the kernel command line, restart the computer, wait until the freeze happens, and then collect the following information via ssh:

* your X server config file (/etc/X11/xorg.conf, if available),
* X server log file (/var/log/Xorg.*.log)
* output of the dmesg command, and
* system log (/var/log/messages)

and attach to the bug report as individual uncompressed file attachments using the bugzilla file attachment link above.
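The collection steps above could be scripted roughly as follows and run over ssh while the problem is occurring. This is a sketch under assumptions: the helper names (has_kernel_param, collect, save_dmesg) and the destination directory are invented for illustration, and reading /var/log/messages normally requires root.

```python
import glob
import os
import shutil
import subprocess

# Files named in the request; xorg.conf may legitimately be absent.
DEFAULT_PATTERNS = (
    "/etc/X11/xorg.conf",
    "/var/log/Xorg.*.log",
    "/var/log/messages",
)


def has_kernel_param(cmdline, param):
    """Return True if param appears as a whole word on the kernel command line."""
    return param in cmdline.split()


def collect(dest, patterns=DEFAULT_PATTERNS):
    """Copy every file matching the patterns into dest; return the names copied."""
    os.makedirs(dest, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            shutil.copy(path, dest)
            copied.append(os.path.basename(path))
    return copied


def save_dmesg(dest):
    """Write the current dmesg output to dest/dmesg.txt."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    with open(os.path.join(dest, "dmesg.txt"), "w") as handle:
        handle.write(result.stdout)


if __name__ == "__main__":
    with open("/proc/cmdline") as handle:
        if not has_kernel_param(handle.read(), "drm.debug=0x04"):
            print("warning: drm.debug=0x04 is not on the kernel command line")
    out_dir = "bug-600465-logs"  # hypothetical directory name
    print("copied:", collect(out_dir))
    save_dmesg(out_dir)
```

The resulting files would still need to be attached to the bug individually and uncompressed, as requested.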

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.

Comment 7 Matěj Cepl 2010-06-11 15:00:22 UTC
(In reply to comment #5)
> BTW, if you read through all the comments of bug 528312 you'll see that though
> the symptoms are brief for some, they're persistent for others (like me), so
> "temporary" isn't really a good description of the problem.    

Yes, I have suspected for some time that two different bugs are meshed together in bug 528312. Thank you for confirming my suspicion. In addition to the information requested in comment 6, could we also get a reasonable sample of the stderr/stdout output from udevadm when the issue happens, please?

Thank you

Comment 8 Bob Glickstein 2010-06-12 13:33:36 UTC
Created attachment 423508
X server log from while the bug is happening

Comment 9 Bob Glickstein 2010-06-12 13:34:47 UTC
Created attachment 423509
dmesg output from when the bug is happening (with kernel param drm.debug=0x04)

Comment 10 Bob Glickstein 2010-06-12 13:35:26 UTC
Created attachment 423510
/var/log/messages from reboot to when the bug happens

Comment 11 Bob Glickstein 2010-06-12 13:36:07 UTC
Created attachment 423511
Very short excerpt of very repetitive "udevadm monitor" output

Comment 12 Bob Glickstein 2010-06-18 14:11:31 UTC
This is interesting: last night the problem recurred, but without any apparent udev activity.  The X server was sluggish to the point of paralysis, as usual; switching to a text VT restored responsiveness; but there were no udevd's near the top of the "top" output, and no output from "udevadm monitor."

I've now seen the symptoms in this bug a couple of dozen times (sigh) but AFAIK this is the first time udev has been missing from the picture.

Comment 13 Bob Glickstein 2010-06-25 15:23:14 UTC
Update: I built a kernel with the patch from https://bugzilla.redhat.com/show_bug.cgi?id=528312#c118 (and ran it with the appropriate flags) and it DID NOT HELP.  The symptoms appeared after a warm and a cold reboot.

So I tried disabling the uevent patch in xorg-x11-drv-intel as suggested in https://bugzilla.redhat.com/show_bug.cgi?id=528312#c70 and it DID HELP.  That is, the udev storms continued to happen, but they did not slow X to a crawl.  In fact, a udev storm is happening right now as I type this, but it's monopolizing just one of my four cores, which I can live with for now.

(Cross-posting this update to bug #528312.)

Comment 14 Bug Zapper 2010-11-03 13:34:57 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Bug Zapper 2010-12-03 14:02:53 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

