Bug 478866

Summary: random freezes/crashes: BUG: spinlockup on CPU#0, pulseaudio/3263, ffff88005911e010 ......
Product: [Fedora] Fedora Reporter: Tom London <selinux>
Component: kernelAssignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: rawhideCC: kernel-maint, kmcmartin, quintela, zaitcev
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-04-15 23:04:28 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
picture of screen with BUG message none

Description Tom London 2009-01-05 18:51:55 UTC
Created attachment 328222 [details]
picture of screen with BUG message

Description of problem:
I've been getting "random" freezes running kernel-2.6.29-0.9.rc0.git4.fc11.x86_64.

Most of the time, I get nothing in the logs, but I just got one that spewed the following on my console windows just before completely freezing:

kernel:BUG: spinlock lockup on CPU#0, pulseaudio/3263, ffff88005911e010 (Tainted: G       W)

Hard rebooting (power cycle/etc.) again shows nothing in /var/log/messages.

System is Thinkpad X61.  Typically, when the freeze occurs, I get the "repeat .25 seconds of sound clip" repeating constantly.

I attach a picture of the above message.

Version-Release number of selected component (if applicable):
kernel-2.6.29-0.9.rc0.git4.fc11.x86_64

How reproducible:
Repeatable, but random

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Tom London 2009-01-05 19:18:35 UTC
Believe this crash was kerneloops'ed here: http://www.kerneloops.org/raw.php?rawid=167853&msgid=

The complete "dump" reported there:

BUG: spinlock lockup on CPU#0, pulseaudio/3263, ffff88005911e010 (Tainted: G        W )
Pid: 3263, comm: pulseaudio Tainted: G        W  2.6.29-0.9.rc0.git4.fc11.x86_64 #1
Call Trace:
 [<ffffffff811976a0>] _raw_spin_lock+0xeb/0x116
 [<ffffffff8137919e>] _spin_lock_irqsave+0x6c/0x83
 [<ffffffff8105dce1>] ? posix_timer_fn+0x25/0xb9
 [<ffffffff8105dcbc>] ? posix_timer_fn+0x0/0xb9
 [<ffffffff8105dce1>] posix_timer_fn+0x25/0xb9
 [<ffffffff8105dcbc>] ? posix_timer_fn+0x0/0xb9
 [<ffffffff810608e9>] __run_hrtimer+0x8a/0xc0
 [<ffffffff81060a32>] enqueue_hrtimer+0x113/0x15b
 [<ffffffff81061246>] hrtimer_start_range_ns+0xe9/0x10b
 [<ffffffff81198713>] ? debug_object_init+0x14/0x19
 [<ffffffff8105d74e>] sys_timer_settime+0x258/0x2c0
 [<ffffffff8101130a>] ? sysret_check+0x46/0x81
 [<ffffffff8106c4e0>] ? trace_hardirqs_on_caller+0x1f/0x153
 [<ffffffff810112ba>] system_call_fastpath+0x16/0x1b

Comment 2 Tom London 2009-01-05 19:20:36 UTC
Believe these are related too:

http://www.kerneloops.org/raw.php?rawid=165675&msgid=

and

http://www.kerneloops.org/raw.php?rawid=166073&msgid=

Comment 3 antonio montagnani 2009-01-06 13:03:21 UTC
it happens also with following kernels:

kernel-2.6.29-0.12.rc0.git7.fc11.i686 and intermediates from the reporter's kernel and latest

Comment 4 Tom London 2009-01-06 18:22:31 UTC
Just got another of these, this time the system didn't freeze immediately.

This is against kernel-2.6.29-0.12.rc0.git7.fc11.x86_64

BUG: spinlock lockup on CPU#0, qemu-kvm/3706, ffff8800519d7150 (Tainted: G        W )
Pid: 3706, comm: qemu-kvm Tainted: G        W  2.6.29-0.12.rc0.git7.fc11.x86_64 #1
Call Trace:
 [<ffffffff811989f8>] _raw_spin_lock+0xeb/0x116
 [<ffffffff8137b7ae>] _spin_lock_irqsave+0x6c/0x83
 [<ffffffff8105e939>] ? posix_timer_fn+0x25/0xb9
 [<ffffffff8105e914>] ? posix_timer_fn+0x0/0xb9
 [<ffffffff8105e939>] posix_timer_fn+0x25/0xb9
 [<ffffffff8105e914>] ? posix_timer_fn+0x0/0xb9
 [<ffffffff8106155d>] __run_hrtimer+0x8a/0xc0
 [<ffffffff810616a6>] enqueue_hrtimer+0x113/0x15b
 [<ffffffff81061eba>] hrtimer_start_range_ns+0xe9/0x10b
 [<ffffffff81199a6b>] ? debug_object_init+0x14/0x19
 [<ffffffff8105e3a6>] sys_timer_settime+0x258/0x2c0
 [<ffffffff8101138a>] ? sysret_check+0x46/0x81
 [<ffffffff8106d0db>] ? trace_hardirqs_on_caller+0x1f/0x153
 [<ffffffff8101133a>] system_call_fastpath+0x16/0x1b
[root@tlondon ~]#

Comment 5 Tom London 2009-01-06 18:36:47 UTC
The system froze completely within 30 seconds of filing the above comment: first the affected process (qemu-kvm) became unresponsive, followed by the system as a whole (cursor froze, keyboard unresponsive, etc.).

Because the system stayed up for a bit, when I rebooted, I discovered the following in /var/log/messages:

Jan  6 10:17:22 tlondon kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jan  6 10:17:22 tlondon kernel: ata4.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Jan  6 10:17:22 tlondon kernel:         cdb 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
Jan  6 10:17:22 tlondon kernel:         res 40/00:02:00:08:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jan  6 10:17:22 tlondon kernel: ata4.00: status: { DRDY }
Jan  6 10:17:22 tlondon kernel: ata4: soft resetting link
Jan  6 10:17:22 tlondon kernel: ata4.00: configured for UDMA/33
Jan  6 10:17:27 tlondon kernel: ata4.00: qc timeout (cmd 0xa0)
Jan  6 10:17:27 tlondon kernel: ata4.00: TEST_UNIT_READY failed (err_mask=0x5)
Jan  6 10:17:27 tlondon kernel: ata4: soft resetting link
Jan  6 10:17:27 tlondon kernel: ata4.00: configured for UDMA/33
Jan  6 10:17:32 tlondon kernel: ata4.00: qc timeout (cmd 0xa0)
Jan  6 10:17:32 tlondon kernel: ata4.00: TEST_UNIT_READY failed (err_mask=0x5)
Jan  6 10:17:32 tlondon kernel: ata4.00: limiting speed to UDMA/33:PIO3
Jan  6 10:17:32 tlondon kernel: ata4: soft resetting link
Jan  6 10:17:33 tlondon kernel: ata4.00: configured for UDMA/33
Jan  6 10:17:38 tlondon kernel: ata4.00: qc timeout (cmd 0xa0)
Jan  6 10:17:38 tlondon kernel: ata4.00: TEST_UNIT_READY failed (err_mask=0x5)
Jan  6 10:17:38 tlondon kernel: ata4.00: disabled
Jan  6 10:17:38 tlondon kernel: ata4: soft resetting link
Jan  6 10:17:38 tlondon kernel: ata4: EH complete
Jan  6 10:19:50 tlondon kerneloops: Submitted 1 kernel oopses to www.kerneloops.org
Jan  6 10:20:08 tlondon kernel: BUG: spinlock lockup on CPU#0, qemu-kvm/3706, ffff8800519d7150 (Tainted: G        W )
Jan  6 10:20:08 tlondon kernel: Pid: 3706, comm: qemu-kvm Tainted: G        W  2.6.29-0.12.rc0.git7.fc11.x86_64 #1
Jan  6 10:20:08 tlondon kernel: Call Trace:
Jan  6 10:20:08 tlondon kernel: [<ffffffff811989f8>] _raw_spin_lock+0xeb/0x116
Jan  6 10:20:08 tlondon kernel: [<ffffffff8137b7ae>] _spin_lock_irqsave+0x6c/0x83
Jan  6 10:20:08 tlondon kernel: [<ffffffff8105e939>] ? posix_timer_fn+0x25/0xb9
Jan  6 10:20:08 tlondon kernel: [<ffffffff8105e914>] ? posix_timer_fn+0x0/0xb9
Jan  6 10:20:08 tlondon kernel: [<ffffffff8105e939>] posix_timer_fn+0x25/0xb9
Jan  6 10:20:08 tlondon kernel: [<ffffffff8105e914>] ? posix_timer_fn+0x0/0xb9
Jan  6 10:20:08 tlondon kernel: [<ffffffff8106155d>] __run_hrtimer+0x8a/0xc0
Jan  6 10:20:08 tlondon kernel: [<ffffffff810616a6>] enqueue_hrtimer+0x113/0x15b
Jan  6 10:20:08 tlondon kernel: [<ffffffff81061eba>] hrtimer_start_range_ns+0xe9/0x10b
Jan  6 10:20:08 tlondon kernel: [<ffffffff81199a6b>] ? debug_object_init+0x14/0x19
Jan  6 10:20:08 tlondon kernel: [<ffffffff8105e3a6>] sys_timer_settime+0x258/0x2c0
Jan  6 10:20:08 tlondon kernel: [<ffffffff8101138a>] ? sysret_check+0x46/0x81
Jan  6 10:20:08 tlondon kernel: [<ffffffff8106d0db>] ? trace_hardirqs_on_caller+0x1f/0x153
Jan  6 10:20:08 tlondon kernel: [<ffffffff8101133a>] system_call_fastpath+0x16/0x1b
Jan  6 10:26:39 tlondon kernel: imklog 3.21.9, log source = /proc/kmsg started.

My naive reading of the above is that the system thinks the hard drive "froze", it then tried to "soft reset" which appeared to work, at least for a bit.......

Can someone elucidate further (please!).

Comment 6 Kyle McMartin 2009-02-26 16:47:13 UTC
Still seeing this with a more recent kernel, Tom?

This looks like the same as #479476.

cheers, Kyle

Comment 7 Kyle McMartin 2009-02-26 16:47:52 UTC
*** Bug 479476 has been marked as a duplicate of this bug. ***

Comment 8 Tom London 2009-02-26 17:01:56 UTC
Nope.

I haven't seen this since the above report.

My logs go back to 25 January.

Close?

Comment 9 Kyle McMartin 2009-02-26 17:06:50 UTC
I'll wait to see what Zaitcev says. I suspect it was just pre-rc1 merge teething pains though.

cheers, Kyle