Bug 530169 - nouveau + gdm crashes
Summary: nouveau + gdm crashes
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 12
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: card_GeForce200 NeedsRetesting
Depends On:
Blocks:
 
Reported: 2009-10-21 18:48 UTC by Matt Domsch
Modified: 2018-04-11 09:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-04 07:25:08 UTC
Type: ---
Embargoed:


Attachments
messages.txt (378.79 KB, text/plain), 2009-10-21 18:48 UTC, Matt Domsch
dmesg.txt (118.25 KB, text/plain), 2009-10-22 03:40 UTC, Matt Domsch
objdump (debug) of nouveau at OUT_RINGp() (2.92 KB, text/plain), 2009-10-22 03:41 UTC, Matt Domsch
relevant portion of messages just before crash (5.07 KB, text/plain), 2009-11-05 13:05 UTC, Carl van Tonder
abrt logs (10.05 KB, text/plain), 2009-11-05 20:49 UTC, Matt Domsch
messages (95.43 KB, text/plain), 2009-11-05 20:49 UTC, Matt Domsch
Xorg.0.log (52.62 KB, text/plain), 2009-11-05 20:50 UTC, Matt Domsch
Xorg.0.log - [Quadro NVS 285] (93.16 KB, text/plain), 2009-11-05 21:59 UTC, James Laska

Description Matt Domsch 2009-10-21 18:48:53 UTC
Created attachment 365576 [details]
messages.txt

Description of problem:
Installed Fedora 12 Beta, seen on both kernels
kernel-2.6.31.1-56.fc12.x86_64
kernel-2.6.31.4-88.fc12.x86_64

System does a nice KMS boot. GDM starts, spins for a few seconds, and crashes.  Repeat infinitely.

Video card is:
01:00.0 VGA compatible controller: nVidia Corporation NV44 [Quadro NVS 285] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: nVidia Corporation Device 029d
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Memory at fd000000 (64-bit, non-prefetchable) [size=16M]
	Expansion ROM at fea00000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 2
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel <?>
	Capabilities: [128] Power Budgeting <?>
	Kernel modules: nouveau, nvidiafb

The interesting part of /var/log/messages starts with:

kernel: [TTM] Failed moving buffer. Proposed placement 0x00070004
kernel: [TTM] Out of aperture space or DRM memory quota.
kernel: =============================================================================
kernel: BUG kmalloc-1024 (Not tainted): Poison overwritten
kernel: -----------------------------------------------------------------------------
kernel:
kernel: INFO: 0xffff8800374d8a78-0xffff8800374d8a7f. First byte 0x0 instead of 0x6b
kernel: INFO: Allocated in nouveau_bo_new+0x69/0x250 [nouveau] age=101 cpu=0 pid=1549
kernel: INFO: Freed in nouveau_bo_del_ttm+0x7f/0x9b [nouveau] age=3 cpu=0 pid=1549
kernel: INFO: Slab 0xffffea00016777c0 objects=29 used=16 fp=0xffff8800374d8890 flags=0x200000000040c3
kernel: INFO: Object 0xffff8800374d8890 @offset=2192 fp=0xffff8800374dd158
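
For anyone wanting to check whether their own log contains the same kind of slab report, here is a minimal sketch, not an official tool. The function name slab_poison_report is made up for illustration, and the log path is passed as an argument because it differs between setups. The slab allocator fills freed objects with the poison byte 0x6b when its debugging is enabled, so "First byte 0x0 instead of 0x6b" suggests something wrote to the object after it was freed.

```shell
#!/bin/sh
# Hypothetical helper: print the key lines of a slab "Poison overwritten" report.
# $1 is the log file to search (for example /var/log/messages).
slab_poison_report() {
    grep -E 'Poison overwritten|First byte 0x[0-9a-f]+ instead of|INFO: (Allocated|Freed) in' "$1"
}
```

Calling it as "slab_poison_report /var/log/messages" should print the BUG line, the overwritten byte range, and the functions that allocated and freed the object, which is usually the part worth quoting in a report.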


I note though, by booting into init 3, and then running startx, I don't see this failure.






Comment 1 Matt Domsch 2009-10-22 03:40:41 UTC
Created attachment 365647 [details]
dmesg.txt

I rebuilt the kernel with kmemcheck.  Here's what it found. :-)

[drm] nouveau 0000:01:00.0: Setting dpms mode 3 on vga encoder (output 2)
[drm] nouveau 0000:01:00.0: Setting dpms mode 3 on CRTC 0
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff880037b3c010)
00000000000000000000000000000000cccccccccccccccccccccccccccccccc
 i i i i i i i i i i i i i i i i u u u u u u u u u u u u u u u u
                                 ^

Modules linked in: nouveau(+) ttm drm_kms_helper drm i2c_algo_bit i2c_core
Pid: 116, comm: work_for_cpu Not tainted 2.6.31.4-88.bz530169.fc12.x86_64 #1 Precision WorkStation 380    
RIP: 0010:[<ffffffffa006d5a8>]  [<ffffffffa006d5a8>] OUT_RINGp+0x58/0xa0 [nouveau]
RSP: 0018:ffff880036885430  EFLAGS: 00010202
RAX: 0000000000000040 RBX: ffff880037b31c00 RCX: 000000000000000c
RDX: 0000000000000010 RSI: ffff880037b3c010 RDI: ffffc900074891a4
RBP: ffff880036885450 R08: 0000000000000000 R09: ffff880034d7dea0
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000010
R13: 0000000000000010 R14: ffff880037b3c000 R15: ffff880034d7de40
FS:  0000000000000000(0000) GS:ffff880004800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff88003ea202e0 CR3: 0000000037c43000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
 [<ffffffffa00bb5ca>] nv04_fbcon_imageblit+0x2aa/0x3e0 [nouveau]
 [<ffffffff8132d59c>] soft_cursor+0x1fc/0x290
 [<ffffffff8132d1fb>] bit_cursor+0x67b/0x6e0
 [<ffffffff813293eb>] fbcon_cursor+0x1fb/0x380
 [<ffffffff813a1d13>] hide_cursor+0x33/0xc0
 [<ffffffff813a49e8>] redraw_screen+0x158/0x250
 [<ffffffff813a7c06>] vc_do_resize+0x416/0x470
 [<ffffffff813a7d0d>] vc_resize+0x2d/0x50
 [<ffffffff8132839d>] fbcon_init+0x2cd/0x560
 [<ffffffff813a2665>] visual_init+0xc5/0x120
 [<ffffffff813a4ee3>] take_over_console+0x53/0x80
 [<ffffffff81327d73>] fbcon_takeover+0x73/0xe0
 [<ffffffff8132bee5>] fbcon_event_notify+0x645/0x6d0
 [<ffffffff815ca5e8>] notifier_call_chain+0x88/0xd0
 [<ffffffff81098f48>] __blocking_notifier_call_chain+0x68/0xa0
 [<ffffffff81098fa4>] blocking_notifier_call_chain+0x24/0x40
 [<ffffffff81319679>] fb_notifier_call_chain+0x29/0x50
 [<ffffffff8131b8a3>] register_framebuffer+0x253/0x360
 [<ffffffffa004812d>] drm_fb_helper_single_fb_probe+0x3cd/0x490 [drm_kms_helper]
 [<ffffffffa008a259>] nouveau_fbcon_probe+0x39/0x80 [nouveau]
 [<ffffffffa004b1fd>] drm_helper_initial_config+0x4d/0x90 [drm_kms_helper]
 [<ffffffffa00648bf>] nouveau_card_init+0x51f/0xd00 [nouveau]
 [<ffffffffa00653a8>] nouveau_load+0x2c8/0x4d0 [nouveau]
 [<ffffffffa001bdea>] drm_get_dev+0x35a/0x580 [drm]
 [<ffffffffa00bc32c>] nouveau_pci_probe+0x23/0x39 [nouveau]
 [<ffffffff812f77f5>] local_pci_probe+0x25/0x40
 [<ffffffff8108c452>] do_work_for_cpu+0x22/0x50
 [<ffffffff8109228e>] kthread+0xbe/0xd0
 [<ffffffff8101446a>] child_rip+0xa/0x20
 [<ffffffffffffffff>] 0xffffffffffffffff

Comment 2 Matt Domsch 2009-10-22 03:41:44 UTC
Created attachment 365648 [details]
objdump (debug) of nouveau at OUT_RINGp()

Attached is the offending code, dumped with objdump.

Comment 3 Matt Domsch 2009-10-22 14:51:38 UTC
Booting with 'nomodeset' on the kernel command line avoids this code path and allows the system to boot as expected.  As such, I think this bug can be dropped from the blocker list and put on the tracker instead.
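
When testing the workaround it helps to confirm that the option actually reached the running kernel. The following is a small sketch under the assumption of a Linux system exposing /proc/cmdline; the helper name has_nomodeset is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line contains "nomodeset".
# $1 defaults to /proc/cmdline, which holds the command line of the running kernel.
has_nomodeset() {
    grep -qw 'nomodeset' "${1:-/proc/cmdline}"
}

if has_nomodeset; then
    echo "nomodeset present: kernel modesetting was disabled at boot"
else
    echo "nomodeset absent"
fi
```

The option itself is typically added by editing the kernel line in the boot loader menu for a one-off test; the check above only reports what the current boot used.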

Comment 4 Adam Williamson 2009-10-23 21:49:56 UTC
Ping Ben, this is on the blocker list (decision confirmed at today's blocker review meeting), can you treat this as a priority? Thanks.

Matt: we decided to keep this on the blocker list as KMS is the default, and the information you provided is excellent, so Ben ought to be able to address this for the release.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 5 Ben Skeggs 2009-10-26 22:30:51 UTC
Yeah, it's already a priority; similar incarnations of this issue have been reported in multiple places.  Definitely a blocker: the non-KMS path will be dropped upstream completely at some point, and is barely maintained even now.

Comment 6 Ray Strode [halfline] 2009-10-27 03:17:34 UTC
*** Bug 531058 has been marked as a duplicate of this bug. ***

Comment 7 Ray Strode [halfline] 2009-10-31 00:40:06 UTC
*** Bug 531506 has been marked as a duplicate of this bug. ***

Comment 8 Ray Strode [halfline] 2009-11-01 23:31:30 UTC
*** Bug 532213 has been marked as a duplicate of this bug. ***

Comment 9 Ray Strode [halfline] 2009-11-02 12:41:35 UTC
*** Bug 532322 has been marked as a duplicate of this bug. ***

Comment 10 Adam Williamson 2009-11-04 05:31:24 UTC
This should be fixed with these packages:

http://koji.fedoraproject.org/koji/buildinfo?buildID=139686 - kernel 2.6.31.5-116.fc12
http://koji.fedoraproject.org/koji/buildinfo?buildID=139685 - libdrm-2.4.15-3.fc12
http://koji.fedoraproject.org/koji/buildinfo?buildID=139344 - xorg-x11-drv-nouveau-0.0.15-16.20091030git5587f40.fc12

Can everyone who posted on this bug, or reported a bug that's been marked as a dupe of it, please test those packages and report how they work?

If any of you can't set up an F12 install for testing, we can probably provide a live CD; please let us know. Thanks!


Comment 11 Michael Monreal 2009-11-04 14:14:25 UTC
I filed the dupe #532322. Just tested the new packages on a LiveUSB system and still see the same problem.

Comment 12 Matt Domsch 2009-11-04 15:04:17 UTC
No difference here either.  No better, but no worse.

Comment 13 Ben Skeggs 2009-11-04 21:55:50 UTC
Michael, I'm not convinced your bug *is* the same bug.  There's nothing to indicate that at all.  If you could attach all your gdm logs, hopefully one will show the issue (/var/log/gdm has them).

Matt, exactly the same issue?  I explicitly tried to reproduce that BUG in your original report and can't anymore.  Could I get an updated dmesg log showing the issue with the new kernel, along with your /var/log/Xorg.0.log?

Thanks!

Comment 14 Michael Monreal 2009-11-04 22:06:29 UTC
(In reply to comment #13)
> Michael, I'm not convinced your bug *is* the same bug.  There's nothing to
> indicate that at all.  If you could attach all your gdm logs, hopefully one
> will show the issue (/var/log/gdm has them).

I think I messed up, btw. I installed the new kernel on a liveusb system I created yesterday morning, but it looks like the persistent overlay is only applied *after* the kernel boots, so I was not testing with the latest kernel :(

Any pointers on how to create a live system with the latest kernel? I cannot install to disk atm...
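
A quick way to catch this situation is to compare the release of the kernel that is actually running with the one that was supposed to be tested. This is only a sketch; the function name running_expected_kernel is hypothetical and the release string in the usage note is an example.

```shell
#!/bin/sh
# Hypothetical helper: compare the running kernel release with the one expected.
# $1 is the expected release string as printed by `uname -r`.
# $2 is optional and defaults to the output of `uname -r`.
running_expected_kernel() {
    expected=$1
    running=${2:-$(uname -r)}
    if [ "$running" = "$expected" ]; then
        echo "running expected kernel: $running"
    else
        echo "running $running, expected $expected"
        return 1
    fi
}
```

For example, "running_expected_kernel 2.6.31.5-116.fc12.x86_64" returns non-zero when the system booted an older kernel, even if the newer package is installed.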

Comment 15 Matt Domsch 2009-11-04 22:57:46 UTC
Ben: you are correct, I no longer see BUGs in dmesg or /var/log/messages.  However, GDM continues to crash constantly.  I'll try to gather failure data to send.  Too bad abrt tries to capture it but can't work behind a network proxy. :-(

Comment 16 Adam Williamson 2009-11-04 23:09:39 UTC
Michael: is your system x86-64 capable? If so, I can provide you with a recent enough live image.


Comment 17 Michael Monreal 2009-11-04 23:27:40 UTC
(In reply to comment #16)
> michael: is your system x86-64 capable? 

Yes it is. Just point me to the iso, thx

Comment 18 Adam Williamson 2009-11-04 23:58:20 UTC
I'm uploading it at present, will have it up for you in around 2-3 hours.


Comment 19 Adam Williamson 2009-11-05 00:33:54 UTC
I put it on my own server to get it up faster. Please grab it here:

http://www.happyassassin.net/extras/bleeding-20091104-1-x86_64.iso

Please let me know when you're done, as I need to take it straight back down again. Thanks!

Please, no one else download this; my transfer quota couldn't cope.


Comment 20 Adam Williamson 2009-11-05 02:54:31 UTC
Michael, you'll note we've de-duped your bug, and Ben thinks he knows what's wrong and how to fix it.

After discussing with Ben, it turns out most of the bugs marked as 'duplicates' of this were not, in fact, duplicates of it. Matt is the only person we know to be actually hitting this particular problem. Given that, and the fact that nomodeset works as a workaround, we're dropping this from being a blocker.


Comment 21 Carl van Tonder 2009-11-05 12:37:48 UTC
I also experience this issue on an nvidia 6150 Go, and the new package causes a different (and more serious) crash without nomodeset (nomodeset causes black window problems); will attempt to get logs post-haste.

Comment 22 Carl van Tonder 2009-11-05 13:05:37 UTC
Created attachment 367608 [details]
relevant portion of messages just before crash

Comment 23 Matěj Cepl 2009-11-05 17:18:43 UTC
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages (at least F12Beta, but even better if the very latest versions).

Please, if you experience this problem on the up-to-date system, let us know in a comment on this bug, or tell us whether the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]

Comment 24 Adam Williamson 2009-11-05 18:27:25 UTC
Carl, thanks for reporting, please ignore the canned message from Matej. Can you provide the output of:

rpm -q xorg-x11-drv-nouveau
rpm -q xorg-x11-server-Xorg
rpm -q libdrm
uname -r

on the affected system, so we know exactly what versions of everything you're running? Thanks.
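
The four commands above can be gathered into one report that is easy to paste into a comment. This is a rough sketch, not part of any Fedora tooling; the function name collect_versions is made up, and the rpm check is only there so the script still runs on a system without rpm.

```shell
#!/bin/sh
# Sketch: print the driver, X server, and libdrm package versions plus the kernel release.
collect_versions() {
    for pkg in xorg-x11-drv-nouveau xorg-x11-server-Xorg libdrm; do
        if command -v rpm >/dev/null 2>&1; then
            # rpm -q prints the installed version, or a "not installed" message.
            rpm -q "$pkg"
        else
            echo "$pkg: rpm not available"
        fi
    done
    echo "kernel: $(uname -r)"
}

collect_versions
```

Redirecting the output to a file (for example "collect_versions > versions.txt") gives something that can be attached alongside Xorg.0.log.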


Comment 25 Adam Williamson 2009-11-05 18:29:55 UTC
Carl: oh, and your /var/log/Xorg.0.log too, if you can get it. If you're available to discuss this on IRC that would be GREAT; please poke me (adamw) in query or in #fedora-devel.


Comment 26 Matt Domsch 2009-11-05 20:49:13 UTC
Created attachment 367704 [details]
abrt logs

xorg-x11-drv-nouveau-0.0.15-17.20091105gite1c2efd.fc12.x86_64
xorg-x11-server-Xorg-1.7.1-6.fc12.x86_64
libdrm-2.4.15-4.fc12.x86_64
2.6.31.5-117.fc12.x86_64

Comment 27 Matt Domsch 2009-11-05 20:49:53 UTC
Created attachment 367705 [details]
messages

Comment 28 Matt Domsch 2009-11-05 20:50:39 UTC
Created attachment 367706 [details]
Xorg.0.log

This was the Xorg log while it was failing.

Comment 29 Adam Williamson 2009-11-05 21:27:04 UTC
It seems that Matt's issue is clearly different from Carl's. Carl, can you file a new bug and give us the bug number? Thanks. Include all the info I asked for.


Comment 30 James Laska 2009-11-05 21:59:19 UTC
Created attachment 367739 [details]
Xorg.0.log - [Quadro NVS 285] 

Also seeing this error on a 

0a:00.0 VGA compatible controller: nVidia Corporation NV44 [Quadro NVS 285] (rev a1)

= Packages =

kernel-2.6.31.5-117.fc12.i686
xorg-x11-server-Xorg-1.7.1-6.fc12.i686
xorg-x11-drv-nouveau-0.0.15-17.20091105gite1c2efd.fc12.i686

= dmesg =

[drm] nouveau 0000:0a:00.0: PGRAPH_ERROR - nSource: (unknown bits 0x00400000), nStatus: PROTECTION_FAULT
[drm] nouveau 0000:0a:00.0: PGRAPH_ERROR - Ch 1/4 Class 0x4497 Mthd 0x1808 Data 0x00000000:0x00000000

= Xorg.0.log =

See attached

Comment 31 Adam Williamson 2009-11-05 22:32:26 UTC
Carl and James, we will follow up on your issues in bug 529292 for now. Please move to that bug. We'll keep this one for Matt's issue, which is different.


Comment 32 k.t. 2009-11-12 23:37:41 UTC
I had the same problem while trying to install Fedora 12 RC today. I used nomodeset as a workaround, and after running an update the problem disappeared. Right now I'm using:

xorg-x11-drv-nouveau-1:0.0.15-17.20091105gite1c2efd.fc12 (i686)

lspci -vv
01:00.0 VGA compatible controller: nVidia Corporation NV43 [GeForce Go 6200/6400] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Mitac Device 8054
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at c0000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at 90000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at c1000000 (64-bit, non-prefetchable) [size=16M]
	Expansion ROM at cff00000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
	Kernel modules: nouveau, nvidiafb

Comment 33 Adam Williamson 2009-11-13 06:26:12 UTC
You have very different hardware, cooling.crystals, so this is almost certainly not the same bug. Please file a new report, and include the information described in:

https://fedoraproject.org/wiki/How_to_debug_Xorg_problems


Comment 34 Bug Zapper 2009-11-16 13:59:06 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 35 Fedora Update System 2009-12-23 01:40:32 UTC
libdrm-2.4.15-8.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/libdrm-2.4.15-8.fc12

Comment 36 Bug Zapper 2010-11-04 09:18:26 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 37 Bug Zapper 2010-12-04 07:25:08 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

