Created attachment 440124 [details] Output of 'dmesg' showing 'suspicious rcu_dereference_check() usage.' Description of problem: Noticed this in dmesg. Complete dmesg attached. initcall init_trace_printk+0x0/0x12 returned 0 after 0 usecs lockdep: fixing up alternatives. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/sched.c:618 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 3 locks held by kworker/0:0/4: #0: (events){+.+.+.}, at: [<ffffffff81067e11>] process_one_work+0x160/0x2ec #1: ((&c_idle.work)){+.+.+.}, at: [<ffffffff81067e11>] process_one_work+0x160/0x2ec #2: (&rq->lock){-.....}, at: [<ffffffff81490279>] init_idle+0x30/0x131 stack backtrace: Pid: 4, comm: kworker/0:0 Not tainted 2.6.36-0.6.rc1.git3.fc15.x86_64 #1 Call Trace: [<ffffffff8107d944>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffff810401ce>] task_group+0x80/0x8f [<ffffffff810401f4>] set_task_rq+0x17/0x73 [<ffffffff81490333>] init_idle+0xea/0x131 [<ffffffff81490703>] fork_idle+0x92/0xa3 [<ffffffff8101050d>] ? sched_clock+0x9/0xd [<ffffffff8148e17c>] do_fork_idle+0x1c/0x2d [<ffffffff81067e6d>] process_one_work+0x1bc/0x2ec [<ffffffff81067e11>] ? process_one_work+0x160/0x2ec [<ffffffff8148e160>] ? do_fork_idle+0x0/0x2d [<ffffffff81068d6c>] ? manage_workers.clone.9+0xe0/0x173 [<ffffffff81068f03>] worker_thread+0x104/0x19b [<ffffffff81068dff>] ? worker_thread+0x0/0x19b [<ffffffff8106c84c>] kthread+0x9d/0xa5 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 [<ffffffff81498150>] ? restore_args+0x0/0x30 [<ffffffff8106c7af>] ? kthread+0x0/0xa5 [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10 Booting Node 0, Processors #1 NMI watchdog enabled, takes one hw-pmu counter. Brought up 2 CPUs Total of 2 processors activated (9575.89 BogoMIPS). sizeof(vma)=176 bytes sizeof(page)=56 bytes sizeof(inode)=1000 bytes sizeof(dentry)=256 bytes sizeof(ext3inode)=1432 bytes sizeof(buffer_head)=104 bytes sizeof(skbuff)=240 bytes sizeof(task_struct)=9200 bytes devtmpfs: initialized calling init_mmap_min_addr+0x0/0xd @ 1 initcall init_mmap_min_addr+0x0/0xd returned 0 after 0 usecs Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Oops. Quick trigger finger... Version-Release: kernel-2.6.36-0.6.rc1.git3.fc15.x86_64
I just installed fedora f14 alpha x86_64 and applied updates, and I also see the same error in my dmesg log with kernel 2.6.35.2-9.fc14.x86_64 Here's the log fragment: CPU0: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz stepping 0a lockdep: fixing up alternatives. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/sched.c:616 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 3 locks held by swapper/1: #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81052b6b>] cpu_maps_update_begin+0x17/0x19 #1: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81052a7e>] cpu_hotplug_begin+0x2c/0x53 #2: (&rq->lock){-.....}, at: [<ffffffff81491c05>] init_idle+0x30/0x131 stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.35.2-9.fc14.x86_64 #1 Call Trace: [<ffffffff8107bca2>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffff8103fdc5>] task_group+0x80/0x8f [<ffffffff8103fdeb>] set_task_rq+0x17/0x73 [<ffffffff81491cbf>] init_idle+0xea/0x131 [<ffffffff8149208f>] fork_idle+0x92/0xa3 [<ffffffff8107e788>] ? mark_held_locks+0x50/0x72 [<ffffffff8148fb30>] do_fork_idle+0x1c/0x2d [<ffffffff8148fc78>] do_boot_cpu+0x137/0x9b1 [<ffffffff8148fb14>] ? do_fork_idle+0x0/0x2d [<ffffffff81490d16>] native_cpu_up+0x100/0x1c2 [<ffffffff81492168>] _cpu_up+0x9d/0xf9 [<ffffffff81492297>] cpu_up+0xd3/0xe5 [<ffffffff81d78d86>] kernel_init+0x105/0x2c9 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 [<ffffffff81499450>] ? restore_args+0x0/0x30 [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9 [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10 Booting Node 0, Processors #1lockdep: fixing up alternatives. #2lockdep: fixing up alternatives. #3 Ok. Brought up 4 CPUs Total of 4 processors activated (22588.67 BogoMIPS). sizeof(vma)=176 bytes sizeof(page)=56 bytes sizeof(inode)=1168 bytes sizeof(dentry)=256 bytes sizeof(ext3inode)=1600 bytes sizeof(buffer_head)=104 bytes sizeof(skbuff)=240 bytes sizeof(task_struct)=9216 bytes devtmpfs: initialized atomic64 test passed for x86-64 platform with CX8 and with SSE
In case anyone is interested, I just got the new 2.6.35.4-12.fc14.x86_64 kernel update, and I still see the same problem at boot.
This mysterious error still scrolls past when booting the x86_64 Beta TC1 iso image.
The question for this bug is whether the issue is specific to your hardware, or generic enough for most users. I *suspect* this would be covered under the following Final release criteria, but this would be contingent upon how common this failure is for other users: "In most cases, there must be no SELinux 'AVC: denied' messages or abrt crash notifications on initial boot and subsequent login" Out of curiousity, does ABRT detect and notify you of the kernel crash?
I do not receive notification from ABRT.
Just got a kernel update to kernel-2.6.35.4-28.fc14.x86_64, I still see a similar INFO message in /var/log/dmesg, but the details of the walkback are quite different: ... atomic64 test passed for x86-64 platform with CX8 and with SSE Time: 13:56:59 Date: 09/18/10 NET: Registered protocol family 16 =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- include/linux/cgroup.h:542 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by swapper/1: #0: (net_mutex){+.+.+.}, at: [<ffffffff813e9010>] register_pernet_subsys+0x1f/0x47 stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.35.4-28.fc14.x86_64 #1 Call Trace: [<ffffffff8107bd3a>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffff813e04b9>] sock_update_classid+0x7c/0xa2 [<ffffffff813e054a>] sk_alloc+0x6b/0x77 [<ffffffff8140b281>] __netlink_create+0x37/0xab [<ffffffff813f941c>] ? rtnetlink_rcv+0x0/0x2d [<ffffffff8140cee1>] netlink_kernel_create+0x74/0x19d [<ffffffff8149c3ca>] ? __mutex_lock_common+0x339/0x35b [<ffffffff813f7e9c>] rtnetlink_net_init+0x2e/0x48 [<ffffffff813e8d7a>] ops_init+0xe9/0xff [<ffffffff813e8f0d>] register_pernet_operations+0xab/0x130 [<ffffffff813e901f>] register_pernet_subsys+0x2e/0x47 [<ffffffff81db7bca>] rtnetlink_init+0x53/0x102 [<ffffffff81db835c>] netlink_proto_init+0x126/0x143 [<ffffffff81db8236>] ? netlink_proto_init+0x0/0x143 [<ffffffff810021b8>] do_one_initcall+0x72/0x186 [<ffffffff81d78ebc>] kernel_init+0x23b/0x2c9 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 [<ffffffff8149e2d0>] ? restore_args+0x0/0x30 [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9 [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10 ACPI: bus type pci registered PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xe0000000-0xefffffff] (base 0xe0000000) PCI: not using MMCONFIG PCI: Using configuration type 1 for base access bio: create slab <bio-0> at 0 ... In case hardware info for this machine is useful, the smolt profile is here: http://www.smolts.org/client/show/pub_36882a52-8a36-46bd-920d-17d1a49dbfda
I am seeing this traceback on all recent rawhide and upstream kernels. Lenovo T60p laptop system. Poking into it a little tonight.
(In reply to comment #7) > Just got a kernel update to kernel-2.6.35.4-28.fc14.x86_64, I still see > a similar INFO message in /var/log/dmesg, but the details of the > walkback are quite different: > You are seeing bug 572520.
Duping to 622149 *** This bug has been marked as a duplicate of bug 622149 ***
Reopening. I didn't realize that this bug is a F14blocker. Sorry for the spam.
*** Bug 622149 has been marked as a duplicate of this bug. ***
Created attachment 448807 [details] dmesg
Why is this a blocker?
I don't think it should be a blocker. It's not causing crashes directly that I know of.
------- Comment From Subrata Modak subrata.ibm.com 2010-10-04 02:11 EDT------- This is still seen with Fedora 14 Beta: node 0 link 0: io port [1000, 4fff] Regards-- Subrata
Created attachment 452016 [details] Proposed patch for include/linux/cgroup.h:542 splat ------- Comment on attachment From paulmck.com 2010-10-06 20:24 EDT------- Subrata, does this address it?
I don't know if the messages went away because the newest f14 kernel turned off debug flags or because the real problem is fixed, but I no longer see the INFO walkbacks in the latest kernel: kernel-2.6.35.6-39.fc14.x86_64
Seeing this with Rawhide kernel-2.6.36-0.36.rc7.git3.fc15.x86_64: [ 6263.186242] kvm: 7158: cpu0 unhandled wrmsr: 0x198 data 0 [38892.020603] [38892.020608] =================================================== [38892.020717] [ INFO: suspicious rcu_dereference_check() usage. ] [38892.020790] --------------------------------------------------- [38892.020864] kernel/sched.c:618 invoked rcu_dereference_check() without protection! [38892.020960] [38892.020961] other info that might help us debug this: [38892.020964] [38892.021068] [38892.021070] rcu_scheduler_active = 1, debug_locks = 0 [38892.021156] 1 lock held by pidgin/2160: [38892.021204] #0: (&rq->lock){-.-.-.}, at: [<ffffffff8104583a>] task_fork_fair+0x57/0x109 [38892.021339] [38892.021339] stack backtrace: [38892.021339] Pid: 2160, comm: pidgin Not tainted 2.6.36-0.36.rc7.git3.fc15.x86_64 #1 [38892.021339] Call Trace: [38892.021339] [<ffffffff8107d70c>] lockdep_rcu_dereference+0xaa/0xb3 [38892.021339] [<ffffffff81040334>] task_group+0x80/0x8f [38892.021339] [<ffffffff8104035a>] set_task_rq+0x17/0x73 [38892.021339] [<ffffffff8104585c>] task_fork_fair+0x79/0x109 [38892.021339] [<ffffffff8104a390>] sched_fork+0x106/0x18a [38892.021339] [<ffffffff8104f8d5>] copy_process+0x6bb/0x1208 [38892.021339] [<ffffffff8107cb67>] ? lock_release_holdtime+0x54/0x5b [38892.021339] [<ffffffff810505e8>] do_fork+0x18b/0x382 [38892.021339] [<ffffffff810fb094>] ? might_fault+0x5c/0xac [38892.021339] [<ffffffff8107fe21>] ? lock_release+0x19a/0x1a6 [38892.021339] [<ffffffff81009d2a>] ? sysret_check+0x2e/0x69 [38892.021339] [<ffffffff81011873>] sys_clone+0x28/0x2a [38892.021339] [<ffffffff8100a093>] stub_clone+0x13/0x20 [38892.021339] [<ffffffff81009cf2>] ? system_call_fastpath+0x16/0x1b
Same here uname -r 2.6.36-1.fc14.x86_64 I checked dmesg, because system become quite "not responsible" - samba and ssh [85329.717297] =================================================== [85329.717338] [ INFO: suspicious rcu_dereference_check() usage. ] [85329.717362] --------------------------------------------------- [85329.717386] kernel/sched.c:618 invoked rcu_dereference_check() without protection! [85329.717413] [85329.717415] other info that might help us debug this: [85329.717418] [85329.717448] [85329.717450] rcu_scheduler_active = 1, debug_locks = 0 [85329.717477] 1 lock held by sh/20519: [85329.717494] #0: (&rq->lock){-.-.-.}, at: [<ffffffff8104583a>] task_fork_fair+0x57/0x109 [85329.717624] [85329.717626] stack backtrace: [85329.717789] Pid: 20519, comm: sh Not tainted 2.6.36-1.fc14.x86_64 #1 [85329.717884] Call Trace: [85329.717980] [<ffffffff8107d7ec>] lockdep_rcu_dereference+0xaa/0xb3 [85329.718099] [<ffffffff81040334>] task_group+0x80/0x8f [85329.718196] [<ffffffff8104035a>] set_task_rq+0x17/0x73 [85329.718226] [<ffffffff8104585c>] task_fork_fair+0x79/0x109 [85329.718226] [<ffffffff8104a390>] sched_fork+0x106/0x18a [85329.718226] [<ffffffff8104f8d5>] copy_process+0x6bb/0x1208 [85329.718226] [<ffffffff8107cc47>] ? lock_release_holdtime+0x54/0x5b [85329.718226] [<ffffffff810505e8>] do_fork+0x18b/0x382 [85329.718226] [<ffffffff810fb174>] ? might_fault+0x5c/0xac [85329.718226] [<ffffffff8107ff01>] ? lock_release+0x19a/0x1a6 [85329.718226] [<ffffffff81009d2a>] ? sysret_check+0x2e/0x69 [85329.718226] [<ffffffff81011873>] sys_clone+0x28/0x2a [85329.718226] [<ffffffff8100a093>] stub_clone+0x13/0x20 [85329.718226] [<ffffffff81009cf2>] ? system_call_fastpath+0x16/0x1b
this is a pretty old bug now.. Is it still happening with the latest kernels ?
I haven't seen this is quite a long time.