Description of problem:
Desktop dies. Unable to switch to text mode.

Version-Release number of selected component (if applicable):
kernel-2.6.25-0.218.rc8.git7.fc9.x86_64

How reproducible:
Happens overnight or if the system is left idle for a few hours.

Steps to Reproduce:
1. Leave idle
2. Verify operation

Actual results:
Desktop hangs eventually. Error message is recorded.

Expected results:
No hang.

Additional info:
Looks like it started very recently. I'm pretty sure kernel-2.6.25-0.195.rc8.git1.fc9.x86_64 is ok (from reading saved /var/log/messages).

I am unable to save dmesg, but there's /var/log/messages. Here's an excerpt (will attach a complete one):

Apr 15 19:37:43 niphredil kernel: BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
Apr 15 19:37:43 niphredil kernel: CPU 0:
Apr 15 19:37:43 niphredil kernel: Modules linked in: tun ipt_MASQUERADE iptable_nat nf_nat bridge ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack ipt_REJECT iptable_filter ip_tables xt_tcpudp ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand powernow_k8 freq_table kvm_amd kvm arc4 ecb crypto_blkcipher b43 rfkill snd_usb_audio mac80211 snd_usb_lib snd_rawmidi cfg80211 input_polldev snd_hda_intel snd_seq_dummy dcdbas snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm sdhci snd_timer joydev mmc_core b44 ricoh_mmc snd_page_alloc snd_hwdep mii snd k8temp hwmon soundcore i2c_piix4 i2c_core ssb shpchp video sg output wmi battery ac button sr_mod cdrom pata_atiixp dm_snapshot dm_zero dm_mirror dm_mod ahci libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: pcspkr]
Apr 15 19:37:43 niphredil kernel: Pid: 0, comm: swapper Not tainted 2.6.25-0.218.rc8.git7.fc9.x86_64 #1
Apr 15 19:37:43 niphredil kernel: RIP: 0010:[_spin_unlock_irqrestore+8/10] [_spin_unlock_irqrestore+8/10] _spin_unlock_irqrestore+0x8/0xa
Apr 15 19:37:43 niphredil kernel: RSP: 0018:ffffffff81455d98 EFLAGS: 00000293
Apr 15 19:37:43 niphredil kernel: RAX: 0000000000000000 RBX: ffffffff81455d98 RCX: ffffffff81455d98
Apr 15 19:37:43 niphredil kernel: RDX: 00001ec2439ee80e RSI: 0000000000000293 RDI: ffffffff81504220
Apr 15 19:37:43 niphredil kernel: RBP: ffffffff81455d28 R08: ffff8100010045b0 R09: 0000000000a68a32
Apr 15 19:37:43 niphredil kernel: R10: ffff81000100bf80 R11: ffffffff81455e98 R12: ffffffff810490f3
Apr 15 19:37:43 niphredil kernel: R13: ffffffff81455d18 R14: ffff8100010045b0 R15: 00000f9a74dc7969
Apr 15 19:37:43 niphredil kernel: FS: 00007f9bdbef1700(0000) GS:ffffffff813f2000(0000) knlGS:00000000f7f24940
Apr 15 19:37:43 niphredil kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Apr 15 19:37:43 niphredil kernel: CR2: 00007fe126652000 CR3: 0000000000201000 CR4: 00000000000006a0
Apr 15 19:37:43 niphredil kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 15 19:37:43 niphredil kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 15 19:37:43 niphredil kernel:
Apr 15 19:37:43 niphredil kernel: Call Trace:
Apr 15 19:37:43 niphredil kernel: [tick_broadcast_oneshot_control+230/239] ? tick_broadcast_oneshot_control+0xe6/0xef
Apr 15 19:37:43 niphredil kernel: [tick_notify+482/821] ? tick_notify+0x1e2/0x335
Apr 15 19:37:43 niphredil kernel: [notifier_call_chain+51/91] ? notifier_call_chain+0x33/0x5b
Apr 15 19:37:43 niphredil kernel: [raw_notifier_call_chain+15/17] ? raw_notifier_call_chain+0xf/0x11
Apr 15 19:37:43 niphredil kernel: [clockevents_notify+43/92] ? clockevents_notify+0x2b/0x5c
Apr 15 19:37:43 niphredil kernel: [acpi_state_timer_broadcast+65/67] ? acpi_state_timer_broadcast+0x41/0x43
Apr 15 19:37:43 niphredil kernel: [acpi_idle_enter_bm+776/885] ? acpi_idle_enter_bm+0x308/0x375
Apr 15 19:37:43 niphredil kernel: [menu_select+111/143] ? menu_select+0x6f/0x8f
Apr 15 19:37:43 niphredil kernel: [cpuidle_idle_call+134/186] ? cpuidle_idle_call+0x86/0xba
Apr 15 19:37:43 niphredil kernel: [cpuidle_idle_call+0/186] ? cpuidle_idle_call+0x0/0xba
Apr 15 19:37:43 niphredil kernel: [default_idle+0/95] ? default_idle+0x0/0x5f
Apr 15 19:37:43 niphredil kernel: [cpu_idle+160/232] ? cpu_idle+0xa0/0xe8
Apr 15 19:37:43 niphredil kernel: [rest_init+90/92] ? rest_init+0x5a/0x5c
Apr 15 19:37:43 niphredil kernel:
Created attachment 302761 [details] /var/log/messages (unedited)
This looks similar to what I am seeing in my multicpu kvm guests now... so perhaps they are fine, and this is a more general problem? See https://bugzilla.redhat.com/show_bug.cgi?id=438617
In my case there's no KVM and/or Xen. Only the final trace in bug 438617 originated in an idle state, and there was no ACPI involved. I filed this one because it looked different to me. It's easier to dup bugs than to clone them anyway.
davej reported what looks like the same thing in bug 444059 and there are some similar reports for F8. Can you try booting with 'processor.max_cstate=1'?
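For anyone trying the 'processor.max_cstate=1' suggestion: after rebooting, it is worth confirming that the option really reached the kernel command line. A minimal sketch in Python, assuming /proc/cmdline is readable; the helper name `max_cstate_from_cmdline` is made up for illustration and is not part of any tool mentioned here.

```python
def max_cstate_from_cmdline(cmdline):
    """Return the processor.max_cstate value found on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("processor.max_cstate="):
            raw = token.split("=", 1)[1]
            if raw.isdigit():
                # Assumption: if the option is repeated, report the last one seen.
                value = int(raw)
    return value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is not mounted
    print(max_cstate_from_cmdline(cmdline))
```

If this prints None, the option was not on the command line the running kernel was booted with, so an overnight test would not say anything about the C-state theory.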
Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Pete, does that machine have an ATI chipset by any chance? If it's the same bug I saw, it's fixed by 'something' in .26-rc, but I've no idea which changeset, as there are so many of them, and the bug takes a while to reproduce, which makes bisecting difficult.
Created attachment 308630 [details]
version, messages log and version information

I am seeing the same problem with the latest i386 Fedora 8 kernel. The attachment contains my kernel information and messages log. I have had this happen twice right after running Snort.
I've been seeing this bug for a while too, with both F8 and F9. It's certainly been there since 2.6.24, and is still present (though apparently not as bad) with the 2.6.25.4-30.fc9.x86_64 kernel. This machine (Ferrari 4000 laptop) has an ATI chipset (RS480 aka 200M). I've never been able to recreate it with any reliability (beyond "it eventually happens") but it seems to be more easily triggerable when the wireless card (p54pci) has the RFKill switch on and the 802.11+ stack trying to scan/find something in the background. Still, your note that "something fixed it in 2.6.26-rc" is encouraging.
I confirm the same bug on F8 with the latest kernel. It is now happening too frequently; whenever I leave my computer idle overnight, I find it hung in the morning with this message.
Created attachment 317045 [details]
System log of a similar problem

More or less the same as already reported. Something to do with BIND (i.e. named). Note that this was just after an unsuccessful attempt to create an IPsec tunnel.
Please note, the attachment from comment #10 is from an i686 machine, so this is not just x86_64 specific. Kernel is: 2.6.26.3-29.fc9.i686.
Created attachment 317642 [details] Output of lspci -vv and lspci -nn (Shuttle K45)
(In reply to comment #10)
> Created an attachment (id=317045) [details]
> System log of the similar problem
>
> More or less the same as already reported. Something to do with BIND (i.e.
> named). Note that this was just after unsuccessful attempt to create and IPSec
> tunnel.

That is not even close to being the same problem as the original report. The original was a lockup in the timer code, while this one is a lockup in the IPsec code.
(In reply to comment #7)
> Created an attachment (id=308630) [details]
> version, messages log and version information
>
> I am seeing the same problem with the last kernel i386 Fedora 8. The
> attachment contains my kernel information and messages log. I have had this
> happen twice right after running Snort.

Also not the "same problem". This is a lockup in the wireless code.
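The distinction drawn in the two replies above comes from the top of each call trace, not from the "soft lockup" message itself. A rough sketch of how one might pull that out of a saved /var/log/messages before deciding whether a lockup matches this report; it is written in Python, the function name `summarize_lockups` and the two regular expressions are assumptions based only on the log format shown in the original report, and other kernels may print traces differently.

```python
import re

# Patterns inferred from the excerpt in this report (an assumption, not a spec).
LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s! \[(.*):(\d+)\]")
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

SAMPLE = """\
Apr 15 19:37:43 niphredil kernel: BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
Apr 15 19:37:43 niphredil kernel: Call Trace:
Apr 15 19:37:43 niphredil kernel: [tick_broadcast_oneshot_control+230/239] ? tick_broadcast_oneshot_control+0xe6/0xef
Apr 15 19:37:43 niphredil kernel: [tick_notify+482/821] ? tick_notify+0x1e2/0x335
Apr 15 19:37:43 niphredil kernel: [notifier_call_chain+51/91] ? notifier_call_chain+0x33/0x5b
"""


def summarize_lockups(log_text, depth=3):
    """Return one dict per soft-lockup report, with the first `depth` call-trace frames."""
    reports = []
    current = None
    in_trace = False
    for line in log_text.splitlines():
        header = LOCKUP_RE.search(line)
        if header:
            # A new report starts; frames seen from here on belong to it.
            current = {
                "cpu": int(header.group(1)),
                "seconds": int(header.group(2)),
                "comm": header.group(3),
                "pid": int(header.group(4)),
                "frames": [],
            }
            reports.append(current)
            in_trace = False
            continue
        if current is None:
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace and len(current["frames"]) < depth:
            frame = FRAME_RE.search(line)
            if frame:
                current["frames"].append(frame.group(1))
    return reports


if __name__ == "__main__":
    for report in summarize_lockups(SAMPLE):
        print(report["comm"], report["frames"])
```

A report whose first frames are in the tick/clockevents code resembles the original one here; frames in IPsec or wireless functions point to a different bug that should get its own report.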
Closing this bug. Anyone still having problems should open a separate bug report and attach information about their lockup to that.