710783 – periodic system freeze on ASUS 1201N

Summary: periodic system freeze on ASUS 1201N

Keywords:
Status:	CLOSED DUPLICATE of bug 755154
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	xorg-x11-drv-nouveau
Sub Component:
Version:	15
Hardware:	i686
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	Ben Skeggs
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2011-06-04 18:18 UTC by Brad
Modified:	2011-12-06 06:56 UTC
CC List:	3 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2011-12-06 06:56:05 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Smolt Output for affected notebook (2.03 KB, text/plain) 2011-06-04 18:24 UTC, Brad	no flags
dmesg output after normal boot (before freeze) (68.14 KB, text/plain) 2011-06-04 18:30 UTC, Brad	no flags
post crash with drm.debug=14 log_buf_len=16M dmesg (418.54 KB, text/plain) 2011-06-04 20:10 UTC, Brad	no flags
post crash with drm.debug=14 log_buf_len=16M /var/log/messages (247.59 KB, text/plain) 2011-06-04 20:11 UTC, Brad	no flags
post crash with drm.debug=14 log_buf_len=16M /var/log/Xorg.0.log (30.69 KB, text/x-log) 2011-06-04 20:12 UTC, Brad	no flags

Description Brad 2011-06-04 18:18:59 UTC

Stock Fedora 15 on an ASUS Eee PC 1201N notebook hangs completely about every 2 hours.

Screen (always), keyboard (always), mouse & cursor (sometimes, not always), sound (sometimes, not always), and remote ssh connections all hang.

I am attributing this to Nouveau compositing in GNOME 3, because booting with nouveau.accel=0 seems to make the hangs go away (GNOME 3 then uses its fallback window manager).

In a remote session, I ran "tail -f" on /var/log/messages, /var/log/Xorg.0.log, /var/log/gdm/:0.log, and $HOME/.xsession-errors; no messages appeared in any of them at the time of a freeze.

Packages:
kernel-2.6.38.6-27.fc15.i686
gnome-session-3.0.1-2.fc15.i686
xorg-x11-drv-nouveau-0.0.16-24.20110324git8378443.fc15.i686

lspci:
05:00.0 VGA compatible controller: nVidia Corporation ION VGA [GeForce 9400M] (rev b1)

Comment 1 Brad 2011-06-04 18:24:26 UTC

Created attachment 503002
Smolt Output for affected notebook

Comment 2 Brad 2011-06-04 18:30:24 UTC

Created attachment 503003
dmesg output after normal boot (before freeze)

Comment 3 Brad 2011-06-04 18:49:01 UTC

Following Matej Cepl's suggestion in bug 674986 (comment #7), I am now running with drm.debug=0x04 and will attach the resulting logs.
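
For readers comparing the drm.debug=0x04 mentioned here with the drm.debug=14 that appears in the attachment names, the following is a minimal sketch of how such a bitmask value breaks down. The names for bits 0x01 to 0x04 follow the DRM core's debug categories and are an assumption for this particular kernel version; bit 0x08 is deliberately left unnamed because its meaning has changed between kernel versions. The helper name decode_drm_debug is hypothetical.

```python
# Low bits of the drm.debug mask. Names for 0x01-0x04 are assumed from the
# DRM core's debug categories; 0x08 is left unnamed (kernel-version dependent).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "bit 3 (kernel-version dependent)",
}


def decode_drm_debug(value):
    """Return labels for the low bits set in a drm.debug value given as a string."""
    mask = int(value, 0)  # base 0 accepts both decimal "14" and hex "0x04"
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    for setting in ("0x04", "14"):
        print(setting, "->", decode_drm_debug(setting))
```

Under these assumptions, 0x04 selects a single category, while 14 (0x0e) sets three bits, so the later attachments were presumably captured with a wider debug mask than the one named in this comment.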

Comment 4 Brad 2011-06-04 20:10:12 UTC

Created attachment 503010
post crash with drm.debug=14 log_buf_len=16M dmesg

Comment 5 Brad 2011-06-04 20:11:22 UTC

Created attachment 503011
post crash with drm.debug=14 log_buf_len=16M /var/log/messages

Comment 6 Brad 2011-06-04 20:12:16 UTC

Created attachment 503012
post crash with drm.debug=14 log_buf_len=16M /var/log/Xorg.0.log

Note that this host has no /etc/X11/xorg.conf file.

Comment 7 Brad 2011-06-04 20:20:35 UTC

The above hang, at Jun 4 12:53:55 (log file time), occurred while watching a YouTube video.  The mouse cursor, keyboard, and screen froze; sound continued playing.  The only recourse (AFAIK) was a hard power cycle.  I see nothing in the log files from immediately before the crash.  If there is anything else I can provide, I'm happy to do so (the crash is easily reproducible on this machine).

Comment 8 Brad 2011-12-06 06:56:05 UTC

I was able to reproduce the hang on the debug kernel.  The debug kernel output shows that this is likely a spinlock deadlock in rtlwifi, so I think this is a duplicate of bug 755154.

[  128.809412] =================================
[  128.809426] [ INFO: inconsistent lock state ]
[  128.809436] 3.1.4-1.fc16.i686.PAEdebug #1
[  128.809444] ---------------------------------
[  128.809452] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  128.809464] kworker/3:1/34 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  128.809473]  (&(&rtlpriv->locks.lps_lock)->rlock){+.?...}, at: [<f7f5c331>] rtl_lps_leave+0x20/0xeb [rtlwifi]
[  128.809518] {IN-SOFTIRQ-W} state was registered at:
[  128.809526]   [<c047935a>] __lock_acquire+0x275/0xb63
[  128.809546]   [<c047a0c9>] lock_acquire+0xde/0x11d
[  128.809560]   [<c086febb>] _raw_spin_lock+0x45/0x72
[  128.809578]   [<f7f5c331>] rtl_lps_leave+0x20/0xeb [rtlwifi]
[  128.809607]   [<f7f5dcbb>] _rtl_pci_ips_leave_tasklet+0xd/0xf [rtlwifi]
[  128.809636]   [<c0451717>] tasklet_action+0x74/0xc2
[  128.809652]   [<c0451872>] __do_softirq+0xdd/0x203
[  128.809667] irq event stamp: 21747
[  128.809675] hardirqs last  enabled at (21747): [<c08706f1>] _raw_spin_unlock_irq+0x27/0x39
[  128.809693] hardirqs last disabled at (21746): [<c086ff87>] _raw_spin_lock_irq+0x19/0x7c
[  128.809710] softirqs last  enabled at (21716): [<c045194f>] __do_softirq+0x1ba/0x203
[  128.809726] softirqs last disabled at (21711): [<c040fd8e>] do_softirq+0x63/0xb8
[  128.809742] 
[  128.809745] other info that might help us debug this:
[  128.809753]  Possible unsafe locking scenario:
[  128.809757] 
[  128.809764]        CPU0
[  128.809771]        ----
[  128.809777]   lock(&(&rtlpriv->locks.lps_lock)->rlock);
[  128.809791]   <Interrupt>
[  128.809797]     lock(&(&rtlpriv->locks.lps_lock)->rlock);
[  128.809810] 
[  128.809812]  *** DEADLOCK ***
[  128.809815] 
[  128.809824] 2 locks held by kworker/3:1/34:
[  128.809831]  #0:  (rtlpriv->cfg->name){.+.+..}, at: [<c0462473>] process_one_work+0x12c/0x37c
[  128.809860]  #1:  ((&(&rtlpriv->works.watchdog_wq)->work)){+.+...}, at: [<c0462473>] process_one_work+0x12c/0x37c
[  128.809885] 
[  128.809888] stack backtrace:
[  128.809899] Pid: 34, comm: kworker/3:1 Not tainted 3.1.4-1.fc16.i686.PAEdebug #1
[  128.809908] Call Trace:
[  128.809925]  [<c08670fc>] ? printk+0x2d/0x2f
[  128.809940]  [<c0867896>] print_usage_bug+0x1c0/0x1ca
[  128.809957]  [<c0478fdf>] mark_lock+0xec/0x1f2
[  128.809972]  [<c0478a06>] ? check_usage_forwards+0x94/0x94
[  128.809987]  [<c04793c6>] __lock_acquire+0x2e1/0xb63
[  128.810004]  [<c046bdbf>] ? sched_clock_cpu+0x134/0x144
[  128.810022]  [<c04774c1>] ? register_lock_class+0x16/0x23a
[  128.810055]  [<f7f5c331>] ? rtl_lps_leave+0x20/0xeb [rtlwifi]
[  128.810074]  [<c047a0c9>] lock_acquire+0xde/0x11d
[  128.810077]  [<f7f5c331>] ? rtl_lps_leave+0x20/0xeb [rtlwifi]
[  128.810077]  [<c086febb>] _raw_spin_lock+0x45/0x72
[  128.810077]  [<f7f5c331>] ? rtl_lps_leave+0x20/0xeb [rtlwifi]
[  128.810077]  [<f7f5c331>] rtl_lps_leave+0x20/0xeb [rtlwifi]
[  128.810077]  [<c04774a5>] ? lock_release_holdtime.part.9+0x4b/0x51
[  128.810077]  [<f7f55718>] rtl_watchdog_wq_callback+0x1ae/0x258 [rtlwifi]
[  128.810077]  [<c0462529>] process_one_work+0x1e2/0x37c
[  128.810077]  [<c0462473>] ? process_one_work+0x12c/0x37c
[  128.810077]  [<f7f5556a>] ? rtl_watch_dog_timer_callback+0x42/0x42 [rtlwifi]
[  128.810077]  [<c046306b>] worker_thread+0xb9/0x133
[  128.810077]  [<c0462fb2>] ? manage_workers+0x154/0x154
[  128.810077]  [<c04661a1>] kthread+0x72/0x77
[  128.810077]  [<c046612f>] ? __init_kthread_worker+0x4a/0x4a
[  128.810077]  [<c0877042>] kernel_thread_helper+0x6/0x10
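
In the report above, lockdep says that lps_lock was previously taken in softirq context ({IN-SOFTIRQ-W}, via the tasklet path into rtl_lps_leave) and is now being taken with softirqs still enabled ({SOFTIRQ-ON-W}, from the watchdog workqueue). If the tasklet interrupts the worker on the same CPU while the worker holds the lock, the tasklet spins forever. As a rough aid for triaging similar logs, here is a minimal sketch that pulls the key facts out of such a report. The regular expressions, the SAMPLE constant, and the function name summarize_lockdep are illustrative assumptions, not part of any kernel tooling.

```python
import re

# State-transition line of an "inconsistent lock state" report.
USAGE_RE = re.compile(
    r"inconsistent \{(?P<old>[A-Z-]+)\} -> \{(?P<new>[A-Z-]+)\} usage"
)
# Line naming the lock class and the function (and module) that took it.
LOCK_RE = re.compile(
    r"\((?P<lock>.+)\)\{[^}]*\}, at: \[<[0-9a-f]+>\] "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)

# Three lines copied from the report above.
SAMPLE = """\
[  128.809452] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  128.809464] kworker/3:1/34 [HC0[0]:SC0[0]:HE1:SE1] takes:
[  128.809473]  (&(&rtlpriv->locks.lps_lock)->rlock){+.?...}, at: [<f7f5c331>] rtl_lps_leave+0x20/0xeb [rtlwifi]
"""


def summarize_lockdep(log_text):
    """Return the usage transition and first lock site, or None if not found."""
    usage = USAGE_RE.search(log_text)
    lock = LOCK_RE.search(log_text)
    if usage is None or lock is None:
        return None
    return {
        "old_state": usage.group("old"),
        "new_state": usage.group("new"),
        "lock": lock.group("lock"),
        "function": lock.group("func"),
        "module": lock.group("module"),  # None for built-in code
    }


if __name__ == "__main__":
    print(summarize_lockdep(SAMPLE))
```

The summary points at the lock class, the function that takes it, and the owning module, which is the information needed to match this report against other rtlwifi bugs.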

*** This bug has been marked as a duplicate of bug 755154 ***
