Bug 654280 - Xorg very slow after upgrading to Fedora 14 (with 2+ nvidia cards)
Summary: Xorg very slow after upgrading to Fedora 14 (with 2+ nvidia cards)
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 14
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 645940
Blocks:
Reported: 2010-11-17 12:12 UTC by Thomas Spear
Modified: 2018-04-11 17:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-14 22:36:31 UTC
Type: ---
Embargoed:


Attachments
Xorg log with segfault (23.37 KB, text/plain)
2010-11-17 12:12 UTC, Thomas Spear
Xorg log after crash when X started up again, just in case (22.01 KB, text/plain)
2010-11-17 12:13 UTC, Thomas Spear
/var/log/messages (924.72 KB, text/plain)
2010-11-17 12:15 UTC, Thomas Spear
dmesg log (53.90 KB, text/plain)
2010-11-17 12:16 UTC, Thomas Spear
sar output (from sysstat) to indicate cpu usage over time. (642 bytes, text/plain)
2010-11-17 12:19 UTC, Thomas Spear
boot.log and messages with 2 cards (187.22 KB, application/zip)
2010-11-19 08:13 UTC, Thomas Spear
boot.log and messages with 3 cards (46.26 KB, application/zip)
2010-11-19 08:15 UTC, Thomas Spear
xorg.conf showing multi-card setup (838 bytes, text/plain)
2010-11-19 10:06 UTC, Ben Skeggs
boot.log-2cards-udev_debug-enabled from the archive (93.33 KB, text/plain)
2010-11-19 21:19 UTC, Matěj Cepl
messages-2cards-udev_debug-enabled from the archive (3.36 MB, text/plain)
2010-11-19 21:19 UTC, Matěj Cepl
boot.log-3cards-udev_debug-enabled from the archive (93.59 KB, text/plain)
2010-11-19 21:20 UTC, Matěj Cepl
messages-3cards-udev_debug-enabled from the archive (372.83 KB, text/plain)
2010-11-19 21:20 UTC, Matěj Cepl

Description Thomas Spear 2010-11-17 12:12:24 UTC
Created attachment 461048 [details]
Xorg log with segfault

Description of problem:
Xorg crashed after being very slow for over an hour after I upgraded from Fedora 13 x86_64 to Fedora 14 x86_64.

I will attach the Xorg.0.log and Xorg.0.log.old (.old shows the segfault)

I did a yum update after installing Fedora 14 and then rebooted to verify slowness was not already fixed. It is NOT fixed.

Version-Release number of selected component (if applicable):
1.9.1

How reproducible:
100%

Steps to Reproduce:
N/A - Just using computer, things are slow, then X crashed when I switched a tab in Firefox.
  
Actual results:
Crash, performance is horrible

Expected results:
No crash, good performance

Additional info:
Running a triple head setup with 3 GeForce 8400GS cards (2 heads per card available). Running akmods-nvidia from rpmfusion. 2 heads are in use on the PCI-E card, one head is in use on one of the PCI cards, and the other card is not using any heads.

Same setup worked fine, including akmods-nvidia, with Fedora 13.

Will test nouveau after I post logs.

I have rdblacklist=nouveau and intel_iommu=off on the kernel command line. intel_iommu=off is required to keep X from locking up when using multiple cards in this machine, regardless of whether I am using the nouveau or the nvidia driver. I also have blacklist nouveau in /etc/modprobe.d/blacklist.conf, even though it is not strictly needed with rdblacklist.
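
As a rough illustration of the two kernel parameters mentioned above, the following Python sketch checks whether a kernel command line carries them. The function names and the sample command line are hypothetical, and the parser is deliberately simplified: it splits on whitespace and does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: [values]}.

    Parameters without '=' (for example 'quiet') map to [None].
    A parameter may appear more than once, so values are kept in a list.
    Simplification: quoted values containing spaces are not supported.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params.setdefault(name, []).append(value if sep else None)
    return params


def check_multicard_workarounds(cmdline):
    """Report whether the two workarounds from this report are present."""
    params = parse_cmdline(cmdline)
    return {
        # rdblacklist=nouveau keeps nouveau from loading in the initramfs
        "nouveau_blacklisted_in_initrd": "nouveau" in params.get("rdblacklist", []),
        # intel_iommu=off disables the Intel IOMMU
        "intel_iommu_off": "off" in params.get("intel_iommu", []),
    }


# Hypothetical command line, for illustration only
SAMPLE = "ro root=/dev/sda1 rhgb quiet rdblacklist=nouveau intel_iommu=off"
```

On a running system the same check could be applied to the contents of /proc/cmdline instead of the sample string.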

When I say the machine is slow, I mean that when switching from one window to another, it takes several seconds for the new window to become the active window. Scrolling multi-line editboxes by click and drag takes several seconds for the drag to register. Scrolling editboxes and combo boxes takes several seconds to begin scrolling and then scrolls very slowly. Even the gnome panel slides onto the screen slowly when I first log in.

Text entry into editboxes seems to run at normal speed. Running command-line apps like top shows Xorg peaking at about 16% of one core, and idling around 2-3% of one core.

Logs are coming next.

Comment 1 Thomas Spear 2010-11-17 12:13:40 UTC
Created attachment 461049 [details]
Xorg log after crash when X started up again, just in case

Comment 2 Thomas Spear 2010-11-17 12:15:09 UTC
Created attachment 461050 [details]
/var/log/messages

Comment 3 Thomas Spear 2010-11-17 12:16:29 UTC
Created attachment 461051 [details]
dmesg log

Comment 4 Thomas Spear 2010-11-17 12:19:20 UTC
Created attachment 461052 [details]
sar output (from sysstat) to indicate cpu usage over time.

Comment 5 Thomas Spear 2010-11-17 12:40:05 UTC
Determined now that X crashes whenever konsole is launched. I am using gnome, but have all KDE dependencies for running konsole installed. This worked fine in F13. I will file a separate bug for konsole crashing X. Let's stick to the slowness in this bug.

Comment 6 Thomas Spear 2010-11-18 05:29:49 UTC
Based on recommendation to disable nvidia and use nouveau in bug 654280, I just tried this and it caused the same problem that I reported in bug 645490.

To quote my latest update in that bug:

This is a problem with nouveau and multiple video cards; or maybe nouveau and multiple video cards on different busses, since I run 2 PCI and 1 PCI Express.

If I rdblacklist nouveau and run the nvidia driver with all 3 cards installed,
then the machine boots fine, but runs slow (as reported here). If I
blacklist or remove nvidia, and run nouveau, then the machine hangs at
Starting udev until I remove all but 1 of the add-in video cards. Currently I
am running on the PCI Express card only.

Comment 7 Thomas Spear 2010-11-18 05:31:14 UTC
Correction to the above post: first sentence should read bug 654286.

Also, running nouveau with the single card allows me to use Konsole.

Comment 8 Thomas Spear 2010-11-18 05:41:33 UTC
And also above, bug 645490 should be bug 645940 ...

Comment 9 Matěj Cepl 2010-11-18 14:26:22 UTC
(In reply to comment #6)
> Based on recommendation to disable nvidia and use nouveau in bug 654280, I just
> tried this and it caused the same problem that I reported in bug 645490.

Could we get logs from nouveau? All attached logs are from nvidia binary-only driver.

Also shouldn't we just close this bug as a duplicate of bug 654286? If yes, attach logs there, please.

Thank you

Comment 10 Thomas Spear 2010-11-18 15:05:15 UTC
I cannot provide logs for nouveau if the machine is hard locking when using nouveau with 3 cards... or can I?

Should I chkconfig udev off and boot into init 3 since that is where it is hard locking, or is there some other way to get the logs you need?

Comment 11 Matěj Cepl 2010-11-18 21:57:10 UTC
(In reply to comment #10)
> I cannot provide logs for nouveau if the machine is hard locking when using
> nouveau with 3 cards... or can I?
> 
> Should I chkconfig udev off and boot into init 3 since that is where it is hard
> locking, or is there some other way to get the logs you need?

There are three possible ways of attack:

1) many times hard locking is not that hard; wouldn’t you be able to switch to the text console (via Ctrl-Alt-F2) or access the computer via ssh from another computer?
2) not sure whether switching off udev would help, but you can certainly try (with or without nomodeset) running startx (as a normal user) on a system started with telinit 3. Hopefully it should crash more gracefully.
3) http://wiki.x.org/wiki/Development/Documentation/ServerDebugging gives more ideas (some of them quite complicated), you may try at least some of them.

Of course, does it make any difference if you try just with two cards?

Thanks for your help in making Fedora great!

Comment 12 Thomas Spear 2010-11-18 22:06:11 UTC
(In reply to comment #11)
> 
> There are three possible ways of attack:
> 
> 1) many times hard locking is not that hard; wouldn’t you be able to switch to
> the text console (via Ctrl-Alt-F2) or access the computer via ssh from another
> computer?
No text console when it happens; I did try that. I'll get SSH set up tonight and see if that works.

> 2) not sure whether switching off udev would help, but you can certainly try
> (with or without nomodeset) running startx (as a normal user) on a system
> started with telinit 3. Hopefully it should crash more gracefully.
Will try also

> 3) http://wiki.x.org/wiki/Development/Documentation/ServerDebugging gives more
> ideas (some of them quite complicated), you may try at least some of them.
Will check if 1 and 2 fail

> Of course, does it make any difference if you try just with two cards?
Haven't tested 2 yet as I needed the machine for work last night. I will test that tonight before I try 1 and 2. Most likely it won't make a difference, as the 2 cards I have removed are PCI while the one I left in is PCI-E. I will still try with 1 of each, and with 2 PCI as well, just to be certain it's not a problem with the different busses.

> Thanks for your help in making Fedora great!

Comment 13 Thomas Spear 2010-11-19 04:25:39 UTC
I have SSH set up and I was just about to reboot to install the cards again, but I realized that testing over SSH won't work either since the machine freezes during udev initialization. udev is the first service to start, by default, which means it starts before sshd would.

I will check with a disabled udev to see what sort of info I can get.

It's unfortunate that I have never found a proper way to get the magic sysrq key working, or I would try to use that to just bypass udev initialization.

I digress though, I'll play with it for a little bit and see what I can come up with.

Comment 14 Thomas Spear 2010-11-19 08:10:10 UTC
Bad news, nouveau does not support 3 cards, at least not that I have been able to tell or find through google. Actually, it doesn't even support 2 cards, so I could not get even 2 monitors on 2 cards working to properly test.

I did manage to get nouveau to boot with 2 and 3 cards plugged in. I had to jump through hoops to do it though.

I enabled udev debug logging to syslog and I'm going to attach 2-card and 3-card /var/log/messages and /var/log/boot.log files here in a minute.

When I had udev debug logging enabled, I was able to boot fine although it did take a long time.

When I disabled udev debug logging, I had to hit a key on the keyboard right after the Fedora splash screen came up, and then again right after it tried to come up again in order for udev to start properly. With "rhgb quiet" removed from the kernel command line, I could not get udev to start.

Comment 15 Thomas Spear 2010-11-19 08:13:54 UTC
Created attachment 461485 [details]
boot.log and messages with 2 cards

Comment 16 Thomas Spear 2010-11-19 08:15:02 UTC
Created attachment 461487 [details]
boot.log and messages with 3 cards

Comment 17 Ben Skeggs 2010-11-19 09:32:24 UTC
Nouveau actually should support 2 cards just fine. I just booted the latest F14 kernel that's in updates-testing (2.6.35.6-48.fc14) with a 9400GT + G210, and it all works just fine (of course, xorg.conf is needed to make X even bother to try with a second screen).

If you're able to get a dmesg log somehow (netconsole perhaps, if necessary?) of a normal boot with nouveau+multicard, that'd be useful to see what's happening.
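
For the netconsole route, a second machine has to collect the UDP datagrams that the kernel sends. The Python sketch below shows one possible receiver; it is not a tested recipe for this hardware. The function names are hypothetical, and the default port 6666 is an assumption based on netconsole's documented default target port, so it should be checked against whatever the sending machine is actually configured with.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming netconsole datagrams.

    Port 6666 is assumed here as netconsole's default target port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_lines(sock, max_datagrams, timeout=5.0):
    """Collect up to max_datagrams messages, stopping early on timeout."""
    sock.settimeout(timeout)
    lines = []
    for _ in range(max_datagrams):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break  # nothing arrived within the timeout; return what we have
        # Kernel messages are normally ASCII; replace anything undecodable
        lines.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return lines
```

A collecting machine could call open_listener() once and then write the result of receive_lines() to a file while the affected machine boots.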

I'm not sure why this bug is assigned to nouveau either, by all reports you were using the NVIDIA binary driver with this slowness, I suspect something is wrong with it, or elsewhere.

Comment 18 Thomas Spear 2010-11-19 09:54:44 UTC
Can you attach your xorg.conf for me to try 2 cards? I can modify it to make it work with my setup, I've just never been able to figure out a proper combination of settings from scratch or cobbling together what I've found on the internet, so modifying one that works to go with my stuff would help immensely.

If that happens to work, then I'll try with 3.

I agree that this is not a nouveau bug but either an nvidia bug (more likely) or an Xorg-server bug (less likely).

Comment 19 Ben Skeggs 2010-11-19 10:06:45 UTC
Created attachment 461507 [details]
xorg.conf showing multi-card setup

I've attached the xorg.conf from one of my machines unmodified. It's currently configured for randr and not multi-card (X can't do both, unfortunately). To switch to the multicard config by default, move the "nouveau-multicard" ServerLayout section to the top of the file.

You'll also probably have to modify the BusID lines to match your system. You can find out what your cards are attached to by looking at lspci output. For example:

07:00.0 VGA compatible controller: nVidia Corporation Device 06c4 (rev a3)

is: BusID "PCI:7:0:0"
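
The conversion shown above can be sketched in Python. One detail worth keeping in mind is that lspci prints the bus, device and function numbers in hexadecimal, while the xorg.conf BusID fields are decimal, so the two only look identical for values below 10. The function name is hypothetical, and the sketch ignores the optional PCI domain prefix that lspci -D prints.

```python
import re

# Matches "07:00.0" or "0000:07:00.0" at the start of an lspci line
_LSPCI_ADDR = re.compile(
    r"^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def lspci_to_busid(line):
    """Turn the address at the start of an lspci line into an xorg.conf BusID."""
    match = _LSPCI_ADDR.match(line.strip())
    if match is None:
        raise ValueError("no PCI address found in: %r" % line)
    _domain, bus, device, function = match.groups()
    # lspci fields are hexadecimal; BusID fields are decimal
    return "PCI:%d:%d:%d" % (int(bus, 16), int(device, 16), int(function, 16))
```

With the line quoted above, lspci_to_busid() gives PCI:7:0:0. A card that lspci lists at 0a:00.0 would need PCI:10:0:0.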

Hope that helps!

Comment 20 Matěj Cepl 2010-11-19 21:19:15 UTC
Created attachment 461653 [details]
boot.log-2cards-udev_debug-enabled from the archive

Comment 21 Matěj Cepl 2010-11-19 21:19:27 UTC
Created attachment 461654 [details]
messages-2cards-udev_debug-enabled from the archive

Comment 22 Matěj Cepl 2010-11-19 21:20:03 UTC
Created attachment 461655 [details]
boot.log-3cards-udev_debug-enabled from the archive

Comment 23 Matěj Cepl 2010-11-19 21:20:11 UTC
Created attachment 461656 [details]
messages-3cards-udev_debug-enabled from the archive

Comment 24 Matěj Cepl 2010-11-19 21:22:29 UTC
(In reply to comment #17)
> I'm not sure why this bug is assigned to nouveau either, by all reports you
> were using the NVIDIA binary driver with this slowness, I suspect something is
> wrong with it, or elsewhere.

The last batch of logs is from nouveau, but we are still missing /var/log/Xorg.0.log file from both attempts. Reporter, could you please attach these as well?

Thank you

Comment 25 Thomas Spear 2010-12-01 11:35:03 UTC
Still waiting for an opportunity to test this further. Sorry for the lack of input.

Comment 26 Thomas Spear 2011-11-26 02:59:37 UTC
This can be closed as I no longer have access to the machine that I was using when I was working on this issue.

Comment 27 Thomas Spear 2012-06-14 22:15:07 UTC
Fedora 14 is EOL. Unless someone else has the issue with F15 (until June 2[4,6]) or F16 or later, I think this can be closed.

