Bug 498143

Summary: softlockup in kernel-2.6.29.1-111.fc11
Product: [Fedora] Fedora
Component: kernel
Version: 11
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: low
Reporter: sangu <sangu.fedora>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: itamar, kernel-maint, me, misieck, steven
Doc Type: Bug Fix
Last Closed: 2010-06-28 12:15:35 UTC

Description sangu 2009-04-29 04:39:55 UTC
Description of problem:
Kernel failure message 1:
BUG: soft lockup - CPU#0 stuck for 61s! [events/0:16]
Modules linked in: vfat fat fuse sco bridge stp llc bnep l2cap bluetooth sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 freq_table tuner_simple tuner_types lgdt330x dm_multipath uinput nvidia(P) snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_event snd_seq_midi_emul snd_seq snd_hda_codec_nvhdmi snd_hda_codec_realtek snd_emu10k1 dvb_usb_cxusb dib7000p snd_rawmidi dibx000_common snd_ac97_codec dvb_usb ata_generic snd_hda_intel dvb_core serio_raw pcspkr ac97_bus pata_acpi snd_seq_device firewire_ohci snd_hda_codec forcedeth shpchp firewire_core snd_util_mem snd_pcm emu10k1_gp snd_hwdep dib0070 snd_timer crc_itu_t gameport snd soundcore joydev snd_page_alloc asus_atk0110 wmi pata_amd hwmon nouveau drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
CPU 0:
Modules linked in: vfat fat fuse sco bridge stp llc bnep l2cap bluetooth sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 freq_table tuner_simple tuner_types lgdt330x dm_multipath uinput nvidia(P) snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_event snd_seq_midi_emul snd_seq snd_hda_codec_nvhdmi snd_hda_codec_realtek snd_emu10k1 dvb_usb_cxusb dib7000p snd_rawmidi dibx000_common snd_ac97_codec dvb_usb ata_generic snd_hda_intel dvb_core serio_raw pcspkr ac97_bus pata_acpi snd_seq_device firewire_ohci snd_hda_codec forcedeth shpchp firewire_core snd_util_mem snd_pcm emu10k1_gp snd_hwdep dib0070 snd_timer crc_itu_t gameport snd soundcore joydev snd_page_alloc asus_atk0110 wmi pata_amd hwmon nouveau drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Pid: 16, comm: events/0 Tainted: P           2.6.29.1-111.fc11.x86_64 #1 System Product Name
RIP: 0010:[<ffffffff8106c672>]  [<ffffffff8106c672>] smp_call_function_many+0x1dc/0x1f6
RSP: 0018:ffff88006e449dd0  EFLAGS: 00000202
RAX: 00000000000008fc RBX: ffff88006e449e20 RCX: ffff88006e449d60
RDX: 0000000000000800 RSI: 00000000000000fc RDI: 0000000000000286
RBP: ffffffff8101219e R08: ffff880001019ab0 R09: ffffffff81604418
R10: 0000000000000686 R11: ffffffff81615ef4 R12: 0000000000000010
R13: ffff88007f897000 R14: ffff88006e448000 R15: ffffffff81783910
FS:  00007f6c54ad27b0(0000) GS:ffffffff817bb000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007feabfce3600 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 [<ffffffff8101dea6>] ? mcheck_check_cpu+0x0/0x36
 [<ffffffff8101d83e>] ? mcheck_timer+0x0/0x84
 [<ffffffff8106c6ae>] ? smp_call_function+0x22/0x26
 [<ffffffff8104ebfa>] ? on_each_cpu+0x1d/0x4b
 [<ffffffff8101d85f>] ? mcheck_timer+0x21/0x84
 [<ffffffff81059f6a>] ? run_workqueue+0xa7/0x14a
 [<ffffffff8105a0f9>] ? worker_thread+0xec/0xfd
 [<ffffffff8105dbab>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff8105a00d>] ? worker_thread+0x0/0xfd
 [<ffffffff8105a00d>] ? worker_thread+0x0/0xfd
 [<ffffffff8105d815>] ? kthread+0x4d/0x78
 [<ffffffff810126ca>] ? child_rip+0xa/0x20
 [<ffffffff81011fe7>] ? restore_args+0x0/0x30
 [<ffffffff8105d7c8>] ? kthread+0x0/0x78
 [<ffffffff810126c0>] ? child_rip+0x0/0x20


Version-Release number of selected component (if applicable):
kernel-2.6.29.1-111.fc11.x86_64

Comment 1 Steven Drinnan 2009-05-06 10:27:46 UTC
I have the same problem on FC10. It happens when I shut down.

Celeron 2.53 AsRock Board

Comment 2 Michal Pomorski 2009-05-19 00:48:37 UTC
Maybe you should remove the tainting module so that kernel developers will even look at this. Or maybe doing so will make the error disappear, in which case file the bug with Nvidia.

Comment 3 Steven Drinnan 2009-05-19 01:01:02 UTC
I would like to know which one, as the error does not tell me which module is at fault. More than one is listed. Any ideas?


Steven

Comment 4 Michal Pomorski 2009-05-19 09:42:24 UTC
Of course the error won't tell you which module is causing it. That would be too easy, wouldn't it? The tainting module in the example is the proprietary nvidia driver, marked as nvidia(P):

Modules linked in: vfat [...] nvidia(P) snd_emu10k1_synth [...] [last unloaded: scsi_wait_scan]

Pid: 16, comm: events/0 Tainted: P         

Funny thing: the OP has both nouveau and nvidia linked in. Does anybody know if that's okay?
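
To make the point above concrete, here is a minimal Python sketch that pulls the flagged modules out of a "Modules linked in:" line and the taint letters out of a "Pid:" line. It only assumes the log layout seen in this report; the function names flagged_modules and taint_status are made up for illustration and are not part of any kernel tool.

```python
import re


def flagged_modules(line):
    """Return {module: flags} for modules carrying a marker such as (P)."""
    _, _, rest = line.partition("Modules linked in:")
    # The "[last unloaded: ...]" note names a module that is no longer loaded.
    rest = re.sub(r"\[last unloaded:[^\]]*\]", "", rest)
    flagged = {}
    for token in rest.split():
        match = re.fullmatch(r"([\w-]+)\(([A-Z+-]+)\)", token)
        if match:
            flagged[match.group(1)] = match.group(2)
    return flagged


def taint_status(line):
    """Return the taint letters from a 'Pid: ...' line, or '' if not tainted."""
    match = re.search(r"Tainted:\s+([A-Z ]+)", line)
    if not match:
        return ""
    # The flags field is space padded, so squeeze the blanks out.
    return "".join(match.group(1).split())


if __name__ == "__main__":
    modules = "Modules linked in: vfat nvidia(P) nouveau drm [last unloaded: scsi_wait_scan]"
    pid = "Pid: 16, comm: events/0 Tainted: P           2.6.29.1-111.fc11.x86_64 #1"
    print(flagged_modules(modules), taint_status(pid))
```

A line that says "Not tainted" simply yields an empty string, and a module list with no markers yields an empty dict.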

Comment 5 Steven Drinnan 2009-05-20 02:12:13 UTC
Ok true true :)

But what should I do to check?

What is the recommended procedure to find out which is the problem module?

I am willing to put in the time if someone can tell me what to do.

Comment 6 Michal Pomorski 2009-05-21 15:23:49 UTC
"What is the recommended procedure to find out which is the problem module."

Become a kernel developer and learn the Linux code so you can debug it using the backtrace in the kernel failure reports.

Until then, check YOUR failure report (I thought you already did that) for any tainting modules and unload them with 'modprobe -r modulename' as root.

Note that tainting modules are not necessarily the cause of problems, just more often than not. And if they are, there is nothing a kernel developer can do about it, since they are proprietary or are not being used the way they were designed to be. Trying to debug errors that may originate in a proprietary module is a waste of time.
That's why you shouldn't expect help from the kernel developers when a problem occurs while your kernel is tainted, no matter which module is causing it.


Here is something on the different modes of tainting:
http://www.novell.com/support/viewContent.do?externalId=3582750&sliceId=1
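
On a running system the same taint state can also be read as a number from /proc/sys/kernel/tainted. The following is a rough Python sketch of decoding it, not something taken from the linked article. The bit table is a deliberately small subset of what current kernel documentation lists (the 2.6.29 kernel in this report defined fewer flags), and the names TAINT_BITS, decode_taint and read_taint are illustrative only.

```python
# Subset of taint bits; newer kernels define more than are listed here.
TAINT_BITS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
}


def decode_taint(value):
    """Return (letter, meaning) pairs for each known bit set in value."""
    return [
        (letter, meaning)
        for bit, (letter, meaning) in sorted(TAINT_BITS.items())
        if value & (1 << bit)
    ]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value; requires a Linux system."""
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    for letter, meaning in decode_taint(read_taint()):
        print(letter, meaning)
```

A value of 0 means the kernel is not tainted. Bits outside the table are silently ignored, so an empty result for a non-zero value means only that the flag is not in this subset.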

Comment 7 Bug Zapper 2009-06-09 14:44:55 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Matthew Gillen 2009-10-12 20:13:57 UTC
I got a similar starting message yesterday, but a totally different stack trace. I'm running Fedora 11 (2.6.30.8-64.fc11.x86_64) on an Athlon(tm) 64 X2 Dual Core (untainted kernel):

Oct 11 17:21:07 kernel: BUG: soft lockup - CPU#1 stuck for 61s! [crond:27277]
Oct 11 17:21:07 kernel: Modules linked in: tun nfsd lockd nfs_acl auth_rpcgss fuse sco bridge stp llc bnep l2cap bluetooth sunrpc it87 hwmon_vid ipt_MASQUERADE iptable_nat nf_nat ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 freq_table xfs exportfs dm_multipath uinput ata_generic ppdev snd_hda_codec_realtek pata_acpi 8139too 8139cp k8temp hwmon serio_raw pcspkr firewire_ohci mii firewire_core snd_hda_intel usb_storage snd_hda_codec crc_itu_t snd_hwdep snd_pcm snd_timer snd i2c_nforce2 parport_pc soundcore parport snd_page_alloc floppy pata_amd forcedeth nouveau drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Oct 11 17:21:07 kernel: CPU 1:
Oct 11 17:21:07 kernel: Modules linked in: tun nfsd lockd nfs_acl auth_rpcgss fuse sco bridge stp llc bnep l2cap bluetooth sunrpc it87 hwmon_vid ipt_MASQUERADE iptable_nat nf_nat ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand powernow_k8 freq_table xfs exportfs dm_multipath uinput ata_generic ppdev snd_hda_codec_realtek pata_acpi 8139too 8139cp k8temp hwmon serio_raw pcspkr firewire_ohci mii firewire_core snd_hda_intel usb_storage snd_hda_codec crc_itu_t snd_hwdep snd_pcm snd_timer snd i2c_nforce2 parport_pc soundcore parport snd_page_alloc floppy pata_amd forcedeth nouveau drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
Oct 11 17:21:07 kernel: Pid: 27277, comm: crond Not tainted 2.6.30.8-64.fc11.x86_64 #1 SN68S
Oct 11 17:21:07 kernel: RIP: 0010:[<ffffffff8141e31a>]  [<ffffffff8141e31a>] __inet_lookup_established+0x10d/0x1fc
Oct 11 17:21:07 kernel: RSP: 0018:ffff88000102ebd8  EFLAGS: 00000206
Oct 11 17:21:07 kernel: RAX: 000000000001eb05 RBX: ffff88000102ec38 RCX: ffffc200045ef050
Oct 11 17:21:07 kernel: RDX: 000000002e25eb05 RSI: 000000002d591b68 RDI: 000020000001eb05
Oct 11 17:21:07 kernel: RBP: ffffffff81012c43 R08: 00000000006f4aab R09: 0000000028c151f8
Oct 11 17:21:07 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88000102eb50
Oct 11 17:21:07 kernel: R13: 00000000006f4aab R14: ffffffff819cc9e0 R15: 0000000004d389b2
Oct 11 17:21:07 kernel: FS:  00007f4520a93790(0000) GS:ffff88000102b000(0000) knlGS:0000000000000000
Oct 11 17:21:07 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 11 17:21:07 kernel: CR2: 00007f4520ab0000 CR3: 000000004dbda000 CR4: 00000000000006e0
Oct 11 17:21:07 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 11 17:21:07 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 11 17:21:07 kernel: Call Trace:
Oct 11 17:21:07 kernel: <IRQ>  [<ffffffff8141e267>] ? __inet_lookup_established+0x5a/0x1fc
Oct 11 17:21:07 kernel: [<ffffffff8105da3a>] ? local_bh_enable+0x25/0x3b
Oct 11 17:21:07 kernel: [<ffffffff81434df0>] ? __inet_lookup.clone.1+0x50/0x8e
Oct 11 17:21:07 kernel: [<ffffffff81435408>] ? tcp_v4_rcv+0x206/0x6e2
Oct 11 17:21:07 kernel: [<ffffffff81416732>] ? ip_local_deliver_finish+0x0/0x212
Oct 11 17:21:07 kernel: [<ffffffff81416887>] ? ip_local_deliver_finish+0x155/0x212
Oct 11 17:21:07 kernel: [<ffffffff814169ca>] ? ip_local_deliver+0x86/0xa4
Oct 11 17:21:07 kernel: [<ffffffff814040eb>] ? nf_hook_slow+0x7f/0xf4
Oct 11 17:21:07 kernel: [<ffffffff81415eba>] ? ip_rcv_finish+0x0/0x3df
Oct 11 17:21:07 kernel: [<ffffffff81416257>] ? ip_rcv_finish+0x39d/0x3df
Oct 11 17:21:07 kernel: [<ffffffff81416531>] ? ip_rcv+0x298/0x2ec
Oct 11 17:21:07 kernel: [<ffffffff813e2304>] ? netif_receive_skb+0x3c8/0x401
Oct 11 17:21:07 kernel: [<ffffffff813e2c4b>] ? net_rx_action+0x179/0x1df
Oct 11 17:21:07 kernel: [<ffffffff813e23e8>] ? process_backlog+0xab/0xfb
Oct 11 17:21:07 kernel: [<ffffffff8105dcca>] ? __do_softirq+0x1a4/0x1d2
Oct 11 17:21:07 kernel: [<ffffffff813e2b95>] ? net_rx_action+0xc3/0x1df
Oct 11 17:21:07 kernel: [<ffffffff8105dbf8>] ? __do_softirq+0xd2/0x1d2
Oct 11 17:21:07 kernel: [<ffffffff8101323c>] ? call_softirq+0x1c/0x30
Oct 11 17:21:07 kernel: <EOI>  [<ffffffff81014ba3>] ? do_softirq+0x5f/0xd7
Oct 11 17:21:07 kernel: [<ffffffff813e36b5>] ? dev_queue_xmit+0x30b/0x355
Oct 11 17:21:07 kernel: [<ffffffff8105d9a8>] ? _local_bh_enable_ip+0xd3/0x109
Oct 11 17:21:07 kernel: [<ffffffff8105da3a>] ? local_bh_enable+0x25/0x3b
Oct 11 17:21:07 kernel: [<ffffffff813e36b5>] ? dev_queue_xmit+0x30b/0x355
Oct 11 17:21:07 kernel: [<ffffffff8141a2b6>] ? ip_finish_output+0x0/0x98
Oct 11 17:21:07 kernel: [<ffffffff8141a260>] ? ip_finish_output2+0x1e0/0x236
Oct 11 17:21:07 kernel: [<ffffffff8141a338>] ? ip_finish_output+0x82/0x98
Oct 11 17:21:07 kernel: [<ffffffff8141a6c9>] ? ip_output+0xb0/0xcb
Oct 11 17:21:07 kernel: [<ffffffff81418df7>] ? dst_output+0x23/0x39
Oct 11 17:21:07 kernel: [<ffffffff8141a7bd>] ? ip_local_out+0x32/0x4d
Oct 11 17:21:07 kernel: [<ffffffff8141ae26>] ? ip_queue_xmit+0x2f7/0x368
Oct 11 17:21:07 kernel: [<ffffffff8142dd84>] ? tcp_transmit_skb+0x64f/0x6a3
Oct 11 17:21:07 kernel: [<ffffffff8142f8ab>] ? tcp_connect+0x37d/0x3fe
Oct 11 17:21:07 kernel: [<ffffffff81434c9d>] ? tcp_v4_connect+0x3e2/0x453
Oct 11 17:21:07 kernel: [<ffffffff8144321d>] ? inet_stream_connect+0xae/0x269
Oct 11 17:21:07 kernel: [<ffffffff813d147d>] ? sys_connect+0x95/0xd5
Oct 11 17:21:07 kernel: [<ffffffff810a6db4>] ? audit_syscall_entry+0x12d/0x16d
Oct 11 17:21:07 kernel: [<ffffffff81498ca4>] ? trace_hardirqs_on_thunk+0x3a/0x3c
Oct 11 17:21:07 kernel: [<ffffffff81012082>] ? system_call_fastpath+0x16/0x1b
Oct 11 17:22:12 kernel: BUG: soft lockup - CPU#1 stuck for 61s! [crond:27277]
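
For anyone who wants to check whether two soft lockup reports really share anything beyond the first line, here is a small Python sketch that extracts the RIP function and the call-trace function names from a pasted log. It assumes the "[<address>] ? name+0xoff/0xlen" frame layout shown above; the names FRAME and summarize are hypothetical helpers, not an existing tool.

```python
import re

# One stack frame: "[<hex address>] ? function+0xoffset/0xlength".
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+\??\s*([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize(log_text):
    """Return (rip_function, trace_functions) for one soft lockup report.

    trace_functions keeps first-seen order and drops repeated names.
    """
    rip = None
    trace = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if not match:
            continue
        name = match.group(1)
        if "RIP:" in line:
            rip = name
        elif name not in trace:
            trace.append(name)
    return rip, trace


def shared_functions(log_a, log_b):
    """Return the function names that appear in both call traces."""
    return set(summarize(log_a)[1]) & set(summarize(log_b)[1])
```

Passing the original report and the log above to shared_functions gives a quick, if crude, view of whether the two traces have any functions in common.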

Comment 9 Bug Zapper 2010-04-27 13:59:41 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Bug Zapper 2010-06-28 12:15:35 UTC
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.