Created attachment 478985 [details]
Photo of the kernel dump on crash

Description of problem:
On hibernate, the kernel crashed. See attached image.

Version-Release number of selected component (if applicable):
kernel-2.6.35.11-83.fc14.x86_64

How reproducible:
Seen only once thus far.

Steps to Reproduce:
1. Suspend/resume and hibernate/thaw over a number of days.
2. Hibernate.
3. Crash.

Actual results:
Kernel crashes.

Expected results:
Should not crash.

Additional info:
http://www.smolts.org/client/show/pub_ebd16c9b-ba21-4d39-964a-cfd361713146
Oh, and if it matters, root fs (which is ext4) had quite a bit of junk left over after the crash. So much so that automatic fsck wouldn't run, but a manual one was required instead. Just FYI.
Probably related, happened after the third hibernate/thaw cycle (just wanted to see whether this is still a problem with the 2.6.40 kernel):
-----------------------------
[12128.384647] kernel BUG at fs/inode.c:432!
[12128.384649] invalid opcode: 0000 [#1] SMP
[12128.384651] CPU 2
[12128.384652] Modules linked in: tcp_lp ppdev parport_pc lp parport sunrpc cpufreq_ondemand acpi_cpufreq mperf rfcomm bnep snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm btusb bluetooth e1000e arc4 qcserial usb_wwan uvcvideo iTCO_wdt intel_ips videodev media snd_timer thinkpad_acpi v4l2_compat_ioctl32 iwlagn i2c_i801 iTCO_vendor_support mac80211 cfg80211 snd_page_alloc rfkill microcode snd soundcore joydev ipv6 sdhci_pci firewire_ohci sdhci mmc_core firewire_core crc_itu_t mxm_wmi wmi i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan]
[12128.384685]
[12128.384688] Pid: 997, comm: irqbalance Tainted: G W 2.6.40.4-5.fc15.x86_64 #1 LENOVO 4313CTO/4313CTO
[12128.384692] RIP: 0010:[<ffffffff8113a9f2>] [<ffffffff8113a9f2>] end_writeback+0x3c/0xa0
[12128.384700] RSP: 0018:ffff88021acd3df8 EFLAGS: 00010006
[12128.384701] RAX: 0000000000000004 RBX: ffff88022d312c58 RCX: 0000000000000000
[12128.384703] RDX: 0000000000000004 RSI: 00000000000001a9 RDI: ffff88022d312dc0
[12128.384705] RBP: ffff88021acd3e08 R08: ffff88022d312ce8 R09: ffff88022d312ce8
[12128.384707] R10: ffff88021acd3d88 R11: ffff8801af316c00 R12: ffff88022d312ce8
[12128.384709] R13: ffffffff81618c10 R14: ffff8801af316cb0 R15: ffff8801ef1d8900
[12128.384711] FS: 00007fbe23774740(0000) GS:ffff88023bd00000(0000) knlGS:0000000000000000
[12128.384714] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12128.384715] CR2: 00007fbe23791000 CR3: 000000022b0e8000 CR4: 00000000000006e0
[12128.384718] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[12128.384720] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[12128.384723] Process irqbalance (pid: 997, threadinfo ffff88021acd2000, task ffff88021b7e4590)
[12128.384724] Stack:
[12128.384726] 0000000000000000 ffff88022d312c58 ffff88021acd3e28 ffffffff81171d04
[12128.384729] ffff88023b80d0c0 ffff88022d312c58 ffff88021acd3e58 ffffffff8113aacd
[12128.384733] 000000000000f886 ffff88022d312c58 ffff88023b80d000 ffffffff81618c10
[12128.384737] Call Trace:
[12128.384745] [<ffffffff81171d04>] proc_evict_inode+0x24/0x6d
[12128.384748] [<ffffffff8113aacd>] evict+0x77/0x117
[12128.384750] [<ffffffff8113acdd>] iput+0x130/0x138
[12128.384755] [<ffffffff81137bfe>] dentry_kill+0x104/0x121
[12128.384757] [<ffffffff81138104>] dput+0xdd/0xea
[12128.384761] [<ffffffff811286e8>] fput+0x1cb/0x1e3
[12128.384767] [<ffffffff81125ae7>] filp_close+0x6e/0x7a
[12128.384769] [<ffffffff81125b90>] sys_close+0x9d/0xda
[12128.384776] [<ffffffff8148e842>] system_call_fastpath+0x16/0x1b
[12128.384778] Code: 00 48 89 fb 48 c7 c7 b9 34 7c 81 e8 2b ca f0 ff e8 3a c1 34 00 48 8d bb 68 01 00 00 e8 97 d6 34 00 48 83 bb b0 01 00 00 00 74 02 <0f> 0b 66 ff 83 68 01 00 00 fb 66 66 90 66 66 90 48 8d 83 e0 01
[12128.385004] RIP [<ffffffff8113a9f2>] end_writeback+0x3c/0xa0
[12128.385008] RSP <ffff88021acd3df8>
[12128.385011] ---[ end trace b2d9ab44b8eb6777 ]---
[12128.385015] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[12128.385017] in_atomic(): 0, irqs_disabled(): 1, pid: 997, name: irqbalance
[12128.385020] Pid: 997, comm: irqbalance Tainted: G D W 2.6.40.4-5.fc15.x86_64 #1
[12128.385021] Call Trace:
[12128.385027] [<ffffffff810474ed>] __might_sleep+0xeb/0xf0
[12128.385032] [<ffffffff81487727>] down_read+0x21/0x38
[12128.385037] [<ffffffff8108e805>] acct_collect+0x4a/0x182
[12128.385042] [<ffffffff8105823c>] do_exit+0x253/0x77b
[12128.385046] [<ffffffff814880b4>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[12128.385050] [<ffffffff8148924e>] oops_end+0xbc/0xc5
[12128.385054] [<ffffffff8100bea3>] die+0x5a/0x63
[12128.385056] [<ffffffff81488b48>] do_trap+0x121/0x130
[12128.385059] [<ffffffff81009b4a>] do_invalid_op+0x94/0x9d
[12128.385061] [<ffffffff8113a9f2>] ? end_writeback+0x3c/0xa0
[12128.385065] [<ffffffff810e4cf8>] ? pagevec_lookup+0x20/0x2a
[12128.385068] [<ffffffff8148f7db>] invalid_op+0x1b/0x20
[12128.385071] [<ffffffff8113a9f2>] ? end_writeback+0x3c/0xa0
[12128.385073] [<ffffffff8113a9e8>] ? end_writeback+0x32/0xa0
[12128.385076] [<ffffffff81171d04>] proc_evict_inode+0x24/0x6d
[12128.385078] [<ffffffff8113aacd>] evict+0x77/0x117
[12128.385080] [<ffffffff8113acdd>] iput+0x130/0x138
[12128.385083] [<ffffffff81137bfe>] dentry_kill+0x104/0x121
[12128.385085] [<ffffffff81138104>] dput+0xdd/0xea
[12128.385088] [<ffffffff811286e8>] fput+0x1cb/0x1e3
[12128.385091] [<ffffffff81125ae7>] filp_close+0x6e/0x7a
[12128.385093] [<ffffffff81125b90>] sys_close+0x9d/0xda
[12128.385096] [<ffffffff8148e842>] system_call_fastpath+0x16/0x1b
-----------------------------
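For anyone who wants to compare this trace against other hibernate oopses, the frames can be pulled out of the log mechanically. This is only a minimal sketch in Python, assuming the x86_64 frame format seen in the log above (`[<address>] symbol+0xoffset/0xsize`, with a leading `?` on frames the kernel marks as unreliable). The names `FRAME_RE` and `parse_frames` are hypothetical and not part of any kernel tool.

```python
import re

# One stack frame as printed in the oops above, for example
#   [<ffffffff8113aacd>] evict+0x77/0x117
#   [<ffffffff8113a9f2>] ? end_writeback+0x3c/0xa0
# The "?" marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{16})>\]\s+(\?\s+)?([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

def parse_frames(log_text):
    """Return a list of dicts, one per stack frame found in log_text."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        address, uncertain, symbol, offset, size = match.groups()
        frames.append({
            "address": int(address, 16),
            "symbol": symbol,
            "offset": int(offset, 16),   # bytes into the function
            "size": int(size, 16),       # total size of the function
            "reliable": uncertain is None,
        })
    return frames

# Three lines copied from the log in this comment.
sample = """\
[12128.384745] [<ffffffff81171d04>] proc_evict_inode+0x24/0x6d
[12128.384748] [<ffffffff8113aacd>] evict+0x77/0x117
[12128.385061] [<ffffffff8113a9f2>] ? end_writeback+0x3c/0xa0
"""

frames = parse_frames(sample)
for frame in frames:
    marker = "" if frame["reliable"] else " (unreliable)"
    print(f'{frame["symbol"]}+{frame["offset"]:#x}{marker}')
```

Because the pattern tolerates any whitespace after the `?`, it should also cope with a frame that got wrapped across two lines when the log was pasted.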
Which was then followed by a panic on shutdown. Unfortunately, my camera didn't take a very good photo of it :-(
[Mass hibernate bug update]

Dave Airlie has found an issue causing some corruption in the i915 fbdev after a resume from hibernate. I have included his patch in this scratch build:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3940545

This will probably not solve all of the issues being tracked at the moment, but it is worth testing when the build completes. If this seems to clear up the issues you see with hibernate, please report your results in the bug.
Just did 108 hibernate/thaw cycles with this kernel on my ThinkPad T510, and I am writing this from that session. The only trouble was this:
------------------
Mar 29 18:58:56 shrek kernel: [ 1015.158060] modem-manager[26358]: segfault at 44 ip 000000000042edba sp 00007fffe4939ba0 error 4 in modem-manager[400000+55000]
------------------
NetworkManager got pretty confused in general, so my LAN was knocked out. The fix was to disable eth0 and then stop/start NM. Not a big deal and unrelated to the i915 problem.

Anyhow, I'm pretty sure this is the fix for the i915 issue. So, many, many thanks to Dave and everyone who participated in fixing this. Much appreciated!
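In case someone wants to chase the modem-manager segfault separately, the kernel line already carries enough to locate the faulting instruction. A rough sketch in Python, assuming the usual x86 page-fault error code bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The names `SEGFAULT_RE` and `parse_segfault` are made up for illustration.

```python
import re

# Matches the kernel's user-space segfault report, for example
#   segfault at 44 ip 000000000042edba sp 00007fffe4939ba0 error 4
#   in modem-manager[400000+55000]
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def parse_segfault(line):
    """Decode one segfault line; return None if the line does not match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    err = int(match["err"], 16)
    return {
        "object": match["obj"],
        "fault_address": int(match["addr"], 16),
        "offset": ip - base,               # instruction offset inside the mapping
        "page_present": bool(err & 0x1),   # assumed x86 bit layout
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }

# The line from the comment above.
line = ("Mar 29 18:58:56 shrek kernel: [ 1015.158060] modem-manager[26358]: "
        "segfault at 44 ip 000000000042edba sp 00007fffe4939ba0 error 4 "
        "in modem-manager[400000+55000]")

info = parse_segfault(line)
print(info["object"], hex(info["offset"]), hex(info["fault_address"]))
```

Under those assumptions this was a user-mode read of an unmapped page at address 0x44, at offset 0x2edba into the modem-manager mapping. That pattern would be consistent with a field access through a NULL pointer, but confirming it would need a matching debuginfo build.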
Oh, one more comment. Occasionally, ext4 would report that it deleted 3 orphaned inodes. No FS corruption though, at least none that I can see.