Bug 694936 - kernel-2.6.38.2-13.fc15.x86_64 hard locks on Geforce 9400 GT (10de:0641)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-drv-nouveau
Version: 15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Ben Skeggs
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-04-08 23:33 UTC by Adam Williamson
Modified: 2011-04-15 15:09 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-15 11:55:47 UTC
Type: ---


Attachments
Dmesg of Latitude 6400 + NVS 160M (124.19 KB, text/plain)
2011-04-09 18:50 UTC, Hans de Goede
xorg.log of Latitude 6400 + NVS 160M (35.97 KB, text/plain)
2011-04-09 18:53 UTC, Hans de Goede

Description Adam Williamson 2011-04-08 23:33:28 UTC
See summary. I installed kernel-2.6.38.2-13.fc15.x86_64 and booted it twice; it hard locked (cursor stuck, couldn't switch to a VT, couldn't ssh in) within ten minutes each time. I went back to -9, which hasn't locked up in an hour of use, so I'm pretty sure it's the kernel.

/var/log/messages doesn't have anything at the time of the lockup; it just stops. I can try again with drm.debug if needed, but I know Ben has this same adapter so he may be able to reproduce easily.

Comment 1 Hans de Goede 2011-04-09 18:33:36 UTC
I'm seeing the exact same thing (I did not try to ssh in, though) on my Dell Latitude 6400 laptop with a Quadro NVS 160M (G98M), which is also an NV50 card and is actually pretty close to the 9400 GT all around.

I'm quite experienced with most forms of debugging, so let me know if there is anything I can do to help. I'm hansg@freenode on IRC.

Comment 2 Hans de Goede 2011-04-09 18:50:20 UTC
Created attachment 490987
Dmesg of Latitude 6400 + NVS 160M

Comment 3 Hans de Goede 2011-04-09 18:53:46 UTC
Created attachment 490988
xorg.log of Latitude 6400 + NVS 160M

Note: I believe I've been seeing this since 2.6.38.2-11 (-10 is DOA, -9 is OK), but I did not file this bug earlier because I wasn't sure that it did not also happen with -9. I'm now pretty confident that this is a regression from -9.

Note 2: The attached logs contain some VC / VT switches. The lockup happens without them too; this was just me switching to a text VC to scp the logs off the system, since sometimes it does not even stay up long enough to browse to a bug and attach logs (when running -13).

Comment 4 Adam Williamson 2011-04-11 07:03:23 UTC
Backtrace (via netconsole):

[11299.704013] ------------[ cut here ]------------
[11299.704013] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[11299.704013] Hardware name: System Product Name
[11299.704013] Watchdog detected hard LOCKUP on cpu 0
[11299.704013] Modules linked in: tcp_lp tun fuse netconsole configfs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc coretemp snd_hda_codec_realtek snd_ice1724 snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_hda_intel nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[11299.704013] Pid: 1554, comm: Xorg Not tainted 2.6.38.2-13.fc15.x86_64 #1
[11299.704013] Call Trace:
[11299.704013]  <NMI>  [<ffffffff81055066>] ? warn_slowpath_common+0x83/0x9b
[11299.704013]  [<ffffffff81055121>] ? warn_slowpath_fmt+0x46/0x48
[11299.704013]  [<ffffffff810ac1a3>] ? watchdog_overflow_callback+0x9b/0xa6
[11299.704013]  [<ffffffff810d3b5d>] ? __perf_event_overflow+0x135/0x191
[11299.704013]  [<ffffffff81016296>] ? paravirt_write_msr+0xf/0x13
[11299.704013]  [<ffffffff810d41b6>] ? perf_event_overflow+0x14/0x16
[11299.704013]  [<ffffffff8101988c>] ? intel_pmu_handle_irq+0x37e/0x3e1
[11299.704013]  [<ffffffff8147655e>] ? perf_event_nmi_handler+0x67/0xb3
[11299.704013]  [<ffffffff81478207>] ? notifier_call_chain+0x37/0x63
[11299.704013]  [<ffffffff8147825f>] ? atomic_notifier_call_chain+0x18/0x1a
[11299.704013]  [<ffffffff8147828f>] ? notify_die+0x2e/0x30
[11299.704013]  [<ffffffff814759f4>] ? do_nmi+0x6d/0x217
[11299.704013]  [<ffffffff81475710>] ? nmi+0x20/0x30
[11299.704013]  [<ffffffff81474c2f>] ? _raw_spin_lock_irqsave+0x27/0x2f
[11299.704013]  <<EOE>>  <IRQ>  [<ffffffffa008c4d7>] ? nouveau_irq_handler+0x4c/0x116 [nouveau]
[11299.704013]  [<ffffffff810ac961>] ? handle_IRQ_event+0x58/0x11f
[11299.704013]  [<ffffffff8101012c>] ? sched_clock+0x9/0xd
[11298.652093] ------------[ cut here ]------------
[11298.652093] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[11298.652093] Hardware name: System Product Name
[11298.652093] Watchdog detected hard LOCKUP on cpu 1
[11298.652093] Modules linked in: tcp_lp tun fuse netconsole configfs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc coretemp snd_hda_codec_realtek snd_ice1724 snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_hda_intel
[11298.652093]  [<ffffffffa00e1626>] ? nv50_vm_flush_engine+0x27/0x9f [nouveau]
[11298.652093]  [<ffffffffa00b695a>] ? nv84_graph_tlb_flush+0x16a/0x19c [nouveau]
[11298.652093]  [<ffffffffa00e16f5>] ? nv50_vm_flush+0x57/0x6e [nouveau]
[11298.652093]  [<ffffffffa00a6534>] ? nouveau_vm_unmap_at+0xbd/0xcc [nouveau]
[11298.652093]  [<ffffffffa00a655e>] ? nouveau_vm_unmap+0x1b/0x1d [nouveau]
[11298.652093]  [<ffffffffa008deab>] ? nouveau_bo_del_ttm+0x66/0x7b [nouveau]
[11298.652093]  [<ffffffffa006f819>] ? ttm_bo_release_list+0x9d/0xc1 [ttm]
[11298.652093]  [<ffffffffa006f77c>] ? ttm_bo_release_list+0x0/0xc1 [ttm]
[11298.652093]  [<ffffffff8122babf>] ? kref_put+0x43/0x4d
[11298.652093]  [<ffffffffa00701eb>] ? ttm_bo_delayed_delete+0xb3/0x111 [ttm]
[11298.652093]  [<ffffffffa0070249>] ? ttm_bo_delayed_workqueue+0x0/0x31 [ttm]
[11298.652093]  [<ffffffffa0070265>] ? ttm_bo_delayed_workqueue+0x1c/0x31 [ttm]
[11298.652093]  [<ffffffff8106ae83>] ? process_one_work+0x186/0x298
[11298.652093]  [<ffffffff8106b210>] ? worker_thread+0xda/0x15d
[11298.652093]  [<ffffffff8106b136>] ? worker_thread+0x0/0x15d
[11298.652093]  [<ffffffff8106b136>] ? worker_thread+0x0/0x15d
[11298.652093]  [<ffffffff8106ea73>] ? kthread+0x84/0x8c
[11298.652093]  [<ffffffff8100a9e4>] ? kernel_thread_helper+0x4/0x10
[11298.652093]  [<ffffffff8106e9ef>] ? kthread+0x0/0x8c
[11298.652093]  [<ffffffff8100a9e0>] ? kernel_thread_helper+0x0/0x10
[11298.652093] ---[ end trace d29049a9b791b5a5 ]---
[11339.831000] ------------[ cut here ]------------
[11339.831000] WARNING: at kernel/watchdog.c:226 watchdog_overflow_callback+0x9b/0xa6()
[11339.831000] Hardware name: System Product Name
[11339.831000] Watchdog detected hard LOCKUP on cpu 2
[11339.831000] Modules linked in: tcp_lp tun fuse netconsole configfs ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc coretemp snd_hda_codec_realtek snd_ice1724 snd_ice17xx_ak4xxx snd_ac97_codec ac97_bus snd_ak4xxx_adda snd_hda_intel drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[11339.831000] Pid: 2723, comm: qemu-kvm Tainted: G        W   2.6.38.2-13.fc15.x86_64 #1
[11339.831000] Call Trace:
[11339.831000]  <NMI>  [<ffffffff81055066>] ? warn_slowpath_common+0x83/0x9b
[11339.831000]  [<ffffffff81055121>] ? warn_slowpath_fmt+0x46/0x48
[11339.831000]  [<ffffffff810ac1a3>] ? watchdog_overflow_callback+0x9b/0xa6
[11339.831000]  [<ffffffff810d3b5d>] ? __perf_event_overflow+0x135/0x191
[11339.831000]  [<ffffffff81016296>] ? paravirt_write_msr+0xf/0x13
[11339.831000]  [<ffffffff810d41b6>] ? perf_event_overflow+0x14/0x16
[11339.831000]  [<ffffffff8101988c>] ? intel_pmu_handle_irq+0x37e/0x3e1
[11339.831000]  [<ffffffff8147655e>] ? perf_event_nmi_handler+0x67/0xb3
[11339.831000]  [<ffffffff81478207>] ? notifier_call_chain+0x37/0x63
[11339.831000]  [<ffffffff8147825f>] ? atomic_notifier_call_chain+0x18/0x1a
[11339.831000]  [<ffffffff8147828f>] ? notify_die+0x2e/0x30
[11339.831000]  [<ffffffff814759f4>] ? do_nmi+0x6d/0x217
[11339.831000]  [<ffffffff81475710>] ? nmi+0x20/0x30
[11339.831000]  <<EOE>> 
[11339.831000] ---[ end trace d29049a9b791b5a6 ]---
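
For anyone triaging a similar netconsole capture, a small script can pull out which CPUs the watchdog flagged and which frames follow each report. The following is only a rough sketch: the function name `summarize` and both regular expressions are made up for this illustration, and it assumes the log uses the same "Watchdog detected hard LOCKUP on cpu N" and "[<address>] ? symbol+0xoff/0xlen [module]" shapes seen above. Because netconsole output from different CPUs can interleave (as it does in this trace), frames are simply attributed to the most recent lockup line, which may not always be the CPU that actually produced them.

```python
import re

# Matches the watchdog line and captures the CPU number.
LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")

# Matches one stack frame; the trailing "[module]" part is optional.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \? ([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def summarize(log_text):
    """Return {cpu: [(function, module_or_None), ...]} per lockup report.

    Frames are assigned to the most recently seen LOCKUP line, so
    interleaved output can end up under the wrong CPU.
    """
    reports = {}
    current = None
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            current = int(m.group(1))
            reports[current] = []
            continue
        if current is not None:
            # findall copes with lines where a frame is fused onto other text.
            for func, mod in FRAME_RE.findall(line):
                reports[current].append((func, mod or None))
    return reports
```

Filtering the result for entries whose module is "nouveau" or "ttm" gives a quick view of the driver frames involved in each report.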

Comment 5 Hans de Goede 2011-04-15 11:55:47 UTC
I've been running kernel 2.6.38-2.14:

* Tue Apr 12 2011 Ben Skeggs <bskeggs> 2.6.38-2.14 - nouveau: correct lock ordering problem

I ran it on the machine in question, with gnome-shell, for the entire day yesterday and had 0 lockups, so it seems this fixes the problem. Closing.
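
The changelog entry only says "correct lock ordering problem"; the actual patch is not shown in this bug. As a general illustration of that class of fix, and not the nouveau change itself, the sketch below shows the usual remedy: every code path takes its locks in one fixed global order, so two paths can never each hold one lock while waiting for the other. The lock names, `LOCK_ORDER`, `with_locks`, and the two path functions are all hypothetical and do not correspond to real nouveau locks or functions.

```python
import threading

# Hypothetical locks for illustration only.
vm_lock = threading.Lock()
irq_lock = threading.Lock()

# The single agreed order in which locks may be taken.
LOCK_ORDER = [vm_lock, irq_lock]

counter = {"n": 0}


def with_locks(needed, fn):
    """Acquire the needed locks in LOCK_ORDER, call fn, release in reverse."""
    ordered = [l for l in LOCK_ORDER if any(l is n for n in needed)]
    for l in ordered:
        l.acquire()
    try:
        return fn()
    finally:
        for l in reversed(ordered):
            l.release()


def bump():
    counter["n"] += 1


def flush_path():
    # Names its locks as (vm, irq).
    for _ in range(1000):
        with_locks([vm_lock, irq_lock], bump)


def irq_path():
    # Names the same locks in the opposite order; with_locks reorders
    # them, so this cannot deadlock against flush_path.
    for _ in range(1000):
        with_locks([irq_lock, vm_lock], bump)


def run():
    """Run both paths concurrently and return the final counter value."""
    counter["n"] = 0
    threads = [
        threading.Thread(target=flush_path),
        threading.Thread(target=irq_path),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["n"]
```

If each path instead acquired the locks directly in the order it names them, the two threads could each take one lock and then block on the other indefinitely.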

Comment 6 Adam Williamson 2011-04-15 15:09:55 UTC
yeah, ditto.

