Description of problem:
random crash

Version-Release number of selected component (if applicable):
kernel-2.6.35.14-95.fc14.x86_64

How reproducible:
probably not, happened after a month of uptime

Additional info:
I have the kdump vmcore file for further analysis.

------------[ cut here ]------------
WARNING: at lib/list_debug.c:26 __list_add+0x3f/0x81()
Hardware name: EX58-UD4
list_add corruption. next->prev should be prev (ffff8801a7f80cf8), but was ffff88015e4b97a0. (next=ffff8801a7f80cf8).
Modules linked in: netconsole iptable_mangle nls_utf8 vfat fat usb_storage ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat usblp hidp fuse configfs rfcomm sco bnep l2cap nfsd lockd nfs_acl auth_rpcgss exportfs it87 hwmon_vid coretemp sunrpc tun cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 ipv6 kvm_intel kvm uinput snd_hda_intel snd_usb_audio snd_usbmidi_lib snd_hda_codec snd_rawmidi snd_hwdep uvcvideo snd_seq snd_seq_device snd_pcm btusb snd_timer snd_page_alloc videodev v4l2_compat_ioctl32 i2c_i801 snd i7core_edac bluetooth edac_core soundcore r8169 mii iTCO_wdt iTCO_vendor_support microcode rfkill sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt raid1 firewire_ohci firewire_core crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: netconsole]
Pid: 646, comm: kcryptd Not tainted 2.6.35.14-95.fc14.x86_64 #1
Call Trace:
 [<ffffffff8104dd31>] warn_slowpath_common+0x85/0x9d
 [<ffffffff8104ddec>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff812272ca>] __list_add+0x3f/0x81
 [<ffffffff8103da85>] list_add+0x11/0x13
 [<ffffffff81040fed>] enqueue_entity+0x89/0x2e8
 [<ffffffff810414f5>] enqueue_task_fair+0x2a/0x48
 [<ffffffff81042f7d>] enqueue_task+0x5d/0x6d
 [<ffffffff81042fba>] activate_task+0x2d/0x36
 [<ffffffff81046cf4>] try_to_wake_up+0x1f8/0x2c5
 [<ffffffff81046dd3>] default_wake_function+0x12/0x14
 [<ffffffff81066aa1>] autoremove_wake_function+0x16/0x39
 [<ffffffff81066af3>] wake_bit_function+0x2f/0x31
 [<ffffffff81039ddf>] __wake_up_common+0x4e/0x84
 [<ffffffff8103d147>] __wake_up+0x39/0x4d
 [<ffffffff81066a5f>] __wake_up_bit+0x31/0x33
 [<ffffffff810d418f>] unlock_page+0x27/0x2c
 [<ffffffff811404b6>] mpage_end_io_read+0x65/0x7d
 [<ffffffff8113b8ca>] bio_endio+0x2b/0x2d
 [<ffffffff81382cac>] dec_pending+0x153/0x15c
 [<ffffffff81382e6d>] clone_endio+0xaa/0xb7
 [<ffffffff8113b8ca>] bio_endio+0x2b/0x2d
 [<ffffffffa0121775>] crypt_dec_pending+0x5e/0x8c [dm_crypt]
 [<ffffffffa0122a89>] kcryptd_crypt+0x3fa/0x45d [dm_crypt]
 [<ffffffff8146b70f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
 [<ffffffff810bf8d3>] ? probe_workqueue_execution+0xb1/0xcd
 [<ffffffff81062c2d>] worker_thread+0x1c5/0x251
 [<ffffffffa012268f>] ? kcryptd_crypt+0x0/0x45d [dm_crypt]
 [<ffffffff81066a8b>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff81062a68>] ? worker_thread+0x0/0x251
 [<ffffffff810665f1>] kthread+0x7f/0x87
 [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81066572>] ? kthread+0x0/0x87
 [<ffffffff8100aaa0>] ? kernel_thread_helper+0x0/0x10
---[ end trace 39e7b2d4e2073c82 ]---
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: [<ffffffff81040c70>] pick_next_task_fair+0x89/0x146
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 2
Modules linked in: netconsole iptable_mangle nls_utf8 vfat fat usb_storage ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat usblp hidp fuse configfs rfcomm sco bnep l2cap nfsd lockd nfs_acl auth_rpcgss exportfs it87 hwmon_vid coretemp sunrpc tun cpufreq_ondemand acpi_cpufreq freq_table mperf ip6t_REJECT nf_conntrack_ipv6 ipv6 kvm_intel kvm uinput snd_hda_intel snd_usb_audio snd_usbmidi_lib snd_hda_codec snd_rawmidi snd_hwdep uvcvideo snd_seq snd_seq_device snd_pcm btusb snd_timer snd_page_alloc videodev v4l2_compat_ioctl32 i2c_i801 snd i7core_edac bluetooth edac_core soundcore r8169 mii iTCO_wdt iTCO_vendor_support microcode rfkill sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt raid1 firewire_ohci firewire_core crc_itu_t radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: netconsole]

Pid: 0, comm: swapper Tainted: G W 2.6.35.14-95.fc14.x86_64 #1 EX58-UD4/EX58-UD4
RIP: 0010:[<ffffffff81040c70>] [<ffffffff81040c70>] pick_next_task_fair+0x89/0x146
RSP: 0018:ffff8801a8d3bdd8 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000020170f41
RDX: ffff88000a295540 RSI: ffff88000a2955d8 RDI: 0000000000000000
RBP: ffff8801a8d3be18 R08: 0000000000000000 R09: ffff8801a8d3bec8
R10: 0007edd2641c2e09 R11: ffffffff81b81f60 R12: ffff8801a7f80cc0
R13: 0000000000000000 R14: ffff88000a295540 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88000a280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000038 CR3: 0000000128781000 CR4: 00000000000006e0
DR0: 0000000008049bd4 DR1: 00000000080498e0 DR2: 000000000804a2b8
DR3: 000000000804a2bc DR6: 00000000ffff0ff0 DR7: 0000000000000600
Process swapper (pid: 0, threadinfo ffff8801a8d3a000, task ffff8801a8d32e80)
Stack:
 0007edd264912b08 ffffffff817ad7de ffff8801a8d3be18 ffff88000a295540
<0> ffffffff81b81f60 ffff8801a8d33238 0000000000000000 0000000000000002
<0> ffff8801a8d3be38 ffffffff81040d57 ffff88000a295540 ffffffff81b81f60
Call Trace:
 [<ffffffff81040d57>] pick_next_task+0x2a/0x49
 [<ffffffff81469d96>] schedule+0x2c9/0x5c0
 [<ffffffff810732d4>] ? hrtimer_start_expires.clone.5+0x1e/0x20
 [<ffffffff8100832b>] cpu_idle+0xca/0xcc
 [<ffffffff81464500>] start_secondary+0x24d/0x28e
Code: 8b 5c 24 58 49 8b 44 24 60 48 85 c0 74 15 48 8b 78 50 4c 89 ee e8 4c fa ff ff 85 c0 7f 05 49 8b 5c 24 60 48 89 df e8 d4 f5 ff ff <83> 7b 38 00 74 32 49 8d 7c 24 70 48 89 de 4c 8d 6b 10 e8 5a fc
RIP [<ffffffff81040c70>] pick_next_task_fair+0x89/0x146
 RSP <ffff8801a8d3bdd8>
CR2: 0000000000000038

This part of the scheduler has seen extensive rewriting since 2.6.35. There's no obvious fix for this particular bug, which makes backporting complicated. Given the late stage of f14 (we're not rebasing it again), and that this isn't easily reproducible, I think we're best off just closing this out. For f15 onwards, we're going back to aggressively rebasing, so we should be able to track such bugs a lot more easily.
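
For anyone who picks up the vmcore later, a note on the first warning in the log. The message reports the same address (ffff8801a7f80cf8) for both prev and next. Since list_add() inserts directly after the list head, that means the head's next pointer pointed back at the head itself (the list looked empty) while its prev pointer pointed at ffff88015e4b97a0. Below is a minimal userspace sketch of that kind of neighbour-consistency check. It is not the kernel's code: `list_add_links_ok`, `checked_list_add`, and the two `demo_*` functions are hypothetical names made up for illustration, and refusing the insert on failure is a choice of this sketch, not a statement about what 2.6.35 does.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Before linking a new node between prev and next, verify that the two
 * neighbours still point at each other. Returns true if they do.
 */
static bool list_add_links_ok(struct list_head *prev, struct list_head *next)
{
    bool ok = true;

    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be prev (%p), "
                "but was %p. (next=%p).\n",
                (void *)prev, (void *)next->prev, (void *)next);
        ok = false;
    }
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be next (%p), "
                "but was %p. (prev=%p).\n",
                (void *)next, (void *)prev->next, (void *)prev);
        ok = false;
    }
    return ok;
}

/*
 * Insert new_node right after head, but only if the check passes.
 * (Assumption of this sketch: a failed check leaves the list untouched.)
 */
static bool checked_list_add(struct list_head *new_node, struct list_head *head)
{
    struct list_head *prev = head;
    struct list_head *next = head->next;

    if (!list_add_links_ok(prev, next))
        return false;

    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
    return true;
}

/*
 * Same shape as the reported state: head->next == head, but head->prev
 * points at some other node. Returns 1 if the check flags it, else 0.
 */
int demo_corrupted_head(void)
{
    struct list_head head, stale, entry;

    stale.next = stale.prev = &stale;
    head.next = &head;
    head.prev = &stale;

    return checked_list_add(&entry, &head) ? 0 : 1;
}

/* A consistent empty list: the insert succeeds. Returns 0 on success. */
int demo_clean_head(void)
{
    struct list_head head, entry;

    head.next = head.prev = &head;

    if (!checked_list_add(&entry, &head))
        return 1;
    return (head.next == &entry && head.prev == &entry) ? 0 : 1;
}
```

In `demo_corrupted_head()` only the first of the two checks fires, which matches the single "next->prev should be prev" line in the report. The sketch says nothing about how the head got into that state; that would need the vmcore.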