Created attachment 337422 [details]
contents of /var/log
Description of problem:
I have no idea what is causing this but F11 Beta was stable, after applying updates the first day after the freeze was lifted my machine started locking up frequently. I cannot figure out the trigger but I have included all the logs present in /var/log in the hopes that will narrow things down some more.
How reproducible:
100% on this machine
Steps to Reproduce:
1. install F11 Beta
2. update
Actual results:
frequent lockups will be experienced
Expected results:
business as usual
Additional info:
x86_64, da_DK.UTF-8
I am not so sure this is a lockup as such: the cursor can be moved, but nothing responds when clicked. I can't switch to a VT, and letting the machine sit for hours like this does not bring it out of this state. The only recovery seems to be a hard reset, followed by seeing how many delightful minutes of computing one can get done until it happens next.
This happens very quickly for me, on a Toshiba laptop. It doesn't seem to matter whether I am actively using it or not. Some debugging -- the Xorg process seems to be stuck and the strace looks like:
...
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGALRM (Alarm clock) @ 0 (0) ---
...
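For anyone capturing a longer trace, a rough way to confirm that a single signal dominates the output is to count the signal-delivery lines. This is only a sketch: the function names `count_signals` and `dominant_signal` are made up for illustration, and it assumes the strace output has been saved as text in the `--- SIGNAME ... ---` form shown above.

```python
import re
from collections import Counter

# strace reports a delivered signal on a line such as:
#   --- SIGALRM (Alarm clock) @ 0 (0) ---
SIGNAL_RE = re.compile(r"--- (SIG[A-Z0-9]+) ")


def count_signals(strace_text):
    """Return a Counter mapping signal names to how often they appear."""
    return Counter(SIGNAL_RE.findall(strace_text))


def dominant_signal(strace_text):
    """Return (name, count) for the most frequent signal, or None if none seen."""
    counts = count_signals(strace_text)
    return counts.most_common(1)[0] if counts else None
```

Feeding it the saved trace text returns the most frequent signal and its count, which makes it easy to compare a hung Xorg process against a healthy one.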
Would that laptop happen to have an nvidia card like mine or can we cross off nouveau?
That laptop would have an Nvidia card. The lspci output looks like:
01:00.0 VGA compatible controller: nVidia Corporation G72M [Quadro NVS 110M] (rev a1)
Sorry, can't cross off nouveau, I guess.
01:00.0 VGA compatible controller: nVidia Corporation G72M [Quadro NVS 110M/GeForce Go 7300] (rev a1)
Curious coincidence. Help us Obi-wan Skeggs, you're our only hope
It'd be useful if you could try the nv driver and see if this still happens there. I'll have a look at the updates between the beta and latest updates, but I can't think of anything off the top of my head that could have caused this.
I see quite a lot of these:
Apr 1 23:23:52 localhost kernel: [drm] PGRAPH_ERROR - nSource: DATA_ERROR, nStatus: INVALID_STATE BAD_ARGUMENT
Apr 1 23:23:52 localhost kernel: [drm] PGRAPH_ERROR - Ch 1/0 Class 0x0062 Mthd 0x0308 Data 0x052a0b00:0x7fff7fff
and this, even though nouveau.modeset=1 is not set:
Apr 2 00:00:23 localhost kernel: [drm:nouveau_load] *ERROR* Kernel modesetting requested but not supported on this chipset.
Apr 2 00:25:07 localhost kernel: [drm:nouveau_load] *ERROR* Kernel modesetting requested but not supported on this chipset.
Regardless, as requested, I have switched to nv; let's see if it will crash.
It's definitely nouveau in your case then; the GPU is reporting a lot of errors to the driver. Oops, I'll fix that mistake with the KMS warnings now. Thank you!
Yeah, the machine has now been up for nearly 2 hours, which is, sadly, unprecedented ever since the Beta freeze was lifted. Any additional information you need?
Nope, that will be enough info for the moment I think. I'll see what I can find out.
Hmm. My symptoms looked like David's, but with more looking, not really. My system does not exhibit the kernel errors like his. Instead, I see the Xorg process looping as described in Comment #2. I will try the nv driver though. (Just to be sure, how do I do this?)
I tried re-enabling nouveau after seeing a few upgrades, but as of:
kernel-2.6.29.1-46.fc11.x86_64
xorg-x11-drv-nouveau-0.0.12-22.20090404git836d985.fc11.x86_64
this still happens, and I still get complaints that modesetting was requested despite that not being the case. Regardless, nv is rock solid for me and has been for days.
Are you both able to test kernel-2.6.29-16.fc11 (http://koji.fedoraproject.org/koji/buildinfo?buildID=95660) and kernel-2.6.29-21.fc11 (http://koji.fedoraproject.org/koji/buildinfo?buildID=95835) and report which (if any) of these kernels works better? I'm only seeing one likely candidate so far. Thanks!
I am still seeing this behaviour with -16, -21 testing is up next.
-21 is showing good behavior, no PGRAPH_ERROR errors in the logs. It's only 2 hours into the testing but considering how quickly it was triggered before it should have popped up by now.
And the very definition of irony: 5 minutes after saying it was doing well, the hang occurred on -21 as well.
I seem to get these hangs as well with the nouveau driver. On the other hand, the nv driver works fine. Hardware info: http://www.smolts.org/client/show/pub_c40b091e-a50f-4fbd-949c-8fa330d8bde5 I'll post dmesg output and Xorg.0.log as attachments shortly.
Created attachment 339212 [details] Xorg.0.log, read via SSH after hang.
Created attachment 339213 [details] Dmesg output after X hanging.
Can you give the kernel from the f11 beta (2.6.29-0.258.2.3.rc8.git2.fc11) a try to confirm the issue is definitely on the kernel side? If the issues still occur there, downgrading to xorg-x11-drv-nouveau-0.0.12-10.20090310git8f9a580.fc11 would also be useful to try.
I'm sorry, but at the moment, I don't have access to the machine where I encountered these hangs. Thanks
I see hangs with 0.0.12-25 and 26: can provide chipset details if it helps.
I can also try the latest builds with F11Beta Live and see if that hangs too.
nv driver also seems fine to me.
(I just note for the record that -27 also hangs for me.)
(In reply to comment #20)
> Can you give the kernel from the f11 beta (2.6.29-0.258.2.3.rc8.git2.fc11) a
> try to confirm the issue is definitely on the kernel side.
Ok I just tried -27 with F11Beta Live and that hung quickly for me too.
(In reply to comment #20)
> If the issues still occur there, downgrading to
> xorg-x11-drv-nouveau-0.0.12-10.20090310git8f9a580.fc11
> would be useful to see also.
That looks to be fine so far with current rawhide.
Some sample output from messages:
# grep nouveau /var/log/messages
:
Apr 15 16:34:39 localhost yum: Updated: 1:xorg-x11-drv-nouveau-0.0.12-27.20090413git7100c06.fc11.i586
<reboot>
Apr 15 16:38:24 localhost kernel: nouveau 0000:01:00.0: Detected an NV44 generation card (0x044500a2)
Apr 15 16:38:24 localhost kernel: [drm] Initialized nouveau 0.0.12 20060213 for 0000:01:00.0 on minor 0
Apr 15 16:38:33 localhost kernel: nouveau 0000:01:00.0: Allocating FIFO number 0
Apr 15 16:38:33 localhost kernel: nouveau 0000:01:00.0: nouveau_fifo_alloc: initialised FIFO 0
Apr 15 16:38:33 localhost kernel: nouveau 0000:01:00.0: Allocating FIFO number 1
Apr 15 16:38:33 localhost kernel: nouveau 0000:01:00.0: nouveau_fifo_alloc: initialised FIFO 1
Apr 15 16:39:42 localhost kernel: nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 1/6 Mthd 0x0184 Data 0xffffffff
Apr 15 16:39:42 localhost kernel: nouveau 0000:01:00.0: PFIFO_CACHE_ERROR - Ch 1/6 Mthd 0x0188 Data 0x32222222
Apr 15 16:39:44 localhost kernel: nouveau 0000:01:00.0: PGRAPH_ERROR - nSource: DATA_ERROR, nStatus: BAD_ARGUMENT
Apr 15 16:39:44 localhost kernel: nouveau 0000:01:00.0: PGRAPH_ERROR - Ch 1/1 Class 0x004a Mthd 0x0300 Data 0x00000000:0x00000000
Apr 15 16:39:50 localhost kernel: nouveau 0000:01:00.0: PFIFO_DMA_PUSHER - Ch 1
<hang>
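To compare builds more easily, the GPU error lines in the log can be tallied instead of read by eye. The following is a minimal sketch, not a tool from this bug: the function name `tally_gpu_errors` is hypothetical, and it only looks for the three error names that actually appear in this report (PGRAPH_ERROR, PFIFO_CACHE_ERROR, PFIFO_DMA_PUSHER) on lines logged by the kernel.

```python
import re
import sys
from collections import Counter

# Error names taken from the log excerpts in this report; other chipsets or
# driver versions may print names that are not covered here.
ERROR_RE = re.compile(r"\b(PGRAPH_ERROR|PFIFO_CACHE_ERROR|PFIFO_DMA_PUSHER)\b")


def tally_gpu_errors(lines):
    """Count kernel log lines that mention each known nouveau error name.

    This counts lines, not events: one PGRAPH_ERROR event is logged as
    two lines (the nSource/nStatus line and the Ch/Class/Mthd line).
    """
    counts = Counter()
    for line in lines:
        if "kernel:" not in line:
            continue  # skip yum and other non-kernel entries
        match = ERROR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Usage (path is an example): python tally.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for name, count in tally_gpu_errors(log).most_common():
            print(f"{count:6d}  {name}")
```

Running it against the log before and after a driver or kernel change gives a quick per-error count to attach to a comment; reading /var/log/messages normally requires root.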
Ok, just to confirm. xorg-x11-drv-nouveau -10 is working OK with whichever kernel? That gives me somewhere else to look :) All the bug reports seem to have blamed the kernel, and I was running out of ideas as to what changed there that could possibly cause this.
Here's how I can reproduce it (which I mentioned on bug #473347):
1. Open Firefox, or Gnome Help
2. Select some text
3. drag the text, such that the copy icon is the cursor
At this point, the cursor stays like that, and clicks have no effect, I can't switch to a VT, and ctrl+alt+del doesn't do anything. I seem to have to do a hard power-off.
I'm running on a ThinkPad T61 with nVidia Quadro NVS 140m.
(In reply to comment #30)
> Here's how I can reproduce it (which I mentioned on bug #473347):
> 1. Open Firefox, or Gnome Help
> 2. Select some text
> 3. drag the text, such that the copy icon is the cursor
>
> at this point, the cursor stays like that, and clicks have no effect, I can't
> switch to a VT, and ctrl+alt+del doesn't do anything. I seem to have to do a
> hard power-off.
>
>
> I'm running on a ThinkPad T61 with nVidia Quadro NVS 140m.
This issue isn't related to this bug; it is more likely rh#489101...
Everyone else, this should be fixed now as of libdrm-2.4.6-6.fc11 and xorg-x11-drv-nouveau-0.0.12-29.20090417gitfa2f111.fc11. I'll close after a couple of confirmations :)
Looks good to me. Thanks
Seems to have sent the crasher to the wild green yonder.
Sounds like this is fixed. Closing.
--
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers