626026 – INFO: suspicious rcu_dereference_check() usage - kernel/sched.c:618 invoked rcu_dereference_check() without protection!

Bug 626026 - INFO: suspicious rcu_dereference_check() usage - kernel/sched.c:618 invoked rcu_dereference_check() without protection!

Summary: INFO: suspicious rcu_dereference_check() usage - kernel/sched.c:618 invoked r...

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	622149 (view as bug list)
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2010-08-21 14:35 UTC by Tom London
Modified:	2012-02-06 18:44 UTC (History)
CC List:	14 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2012-02-06 18:38:05 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Output of 'dmesg' showing 'suspicious rcu_dereference_check() usage.' (110.90 KB, text/plain) 2010-08-21 14:35 UTC, Tom London	no flags	Details
dmesg (56.18 KB, text/plain) 2010-09-21 20:54 UTC, IBM Bug Proxy	no flags	Details
Proposed patch for include/linux/cgroup.h:542 splat (2.83 KB, text/plain) 2010-10-07 00:31 UTC, IBM Bug Proxy	no flags	Details
View All

Description Tom London 2010-08-21 14:35:34 UTC

Created attachment 440124 [details]
Output of 'dmesg' showing 'suspicious rcu_dereference_check() usage.'

Description of problem:
Noticed this in dmesg.

Complete dmesg attached.

initcall init_trace_printk+0x0/0x12 returned 0 after 0 usecs
lockdep: fixing up alternatives.

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/sched.c:618 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
3 locks held by kworker/0:0/4:
 #0:  (events){+.+.+.}, at: [<ffffffff81067e11>] process_one_work+0x160/0x2ec
 #1:  ((&c_idle.work)){+.+.+.}, at: [<ffffffff81067e11>] process_one_work+0x160/0x2ec
 #2:  (&rq->lock){-.....}, at: [<ffffffff81490279>] init_idle+0x30/0x131

stack backtrace:
Pid: 4, comm: kworker/0:0 Not tainted 2.6.36-0.6.rc1.git3.fc15.x86_64 #1
Call Trace:
 [<ffffffff8107d944>] lockdep_rcu_dereference+0xaa/0xb3
 [<ffffffff810401ce>] task_group+0x80/0x8f
 [<ffffffff810401f4>] set_task_rq+0x17/0x73
 [<ffffffff81490333>] init_idle+0xea/0x131
 [<ffffffff81490703>] fork_idle+0x92/0xa3
 [<ffffffff8101050d>] ? sched_clock+0x9/0xd
 [<ffffffff8148e17c>] do_fork_idle+0x1c/0x2d
 [<ffffffff81067e6d>] process_one_work+0x1bc/0x2ec
 [<ffffffff81067e11>] ? process_one_work+0x160/0x2ec
 [<ffffffff8148e160>] ? do_fork_idle+0x0/0x2d
 [<ffffffff81068d6c>] ? manage_workers.clone.9+0xe0/0x173
 [<ffffffff81068f03>] worker_thread+0x104/0x19b
 [<ffffffff81068dff>] ? worker_thread+0x0/0x19b
 [<ffffffff8106c84c>] kthread+0x9d/0xa5
 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81498150>] ? restore_args+0x0/0x30
 [<ffffffff8106c7af>] ? kthread+0x0/0xa5
 [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
Booting Node   0, Processors  #1
NMI watchdog enabled, takes one hw-pmu counter.
Brought up 2 CPUs
Total of 2 processors activated (9575.89 BogoMIPS).
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=1000 bytes
sizeof(dentry)=256 bytes
sizeof(ext3inode)=1432 bytes
sizeof(buffer_head)=104 bytes
sizeof(skbuff)=240 bytes
sizeof(task_struct)=9200 bytes
devtmpfs: initialized
calling  init_mmap_min_addr+0x0/0xd @ 1
initcall init_mmap_min_addr+0x0/0xd returned 0 after 0 usecs


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Tom London 2010-08-21 14:36:20 UTC

Oops.  Quick trigger finger...

Version-Release: kernel-2.6.36-0.6.rc1.git3.fc15.x86_64

Comment 2 Tom Horsley 2010-08-26 00:54:24 UTC

I just installed fedora f14 alpha x86_64 and applied updates, and I
also see the same error in my dmesg log with kernel 2.6.35.2-9.fc14.x86_64

Here's the log fragment:

CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
lockdep: fixing up alternatives.

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
kernel/sched.c:616 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
3 locks held by swapper/1:
 #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81052b6b>] cpu_maps_update_begin+0x17/0x19
 #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81052a7e>] cpu_hotplug_begin+0x2c/0x53
 #2:  (&rq->lock){-.....}, at: [<ffffffff81491c05>] init_idle+0x30/0x131

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.35.2-9.fc14.x86_64 #1
Call Trace:
 [<ffffffff8107bca2>] lockdep_rcu_dereference+0xaa/0xb3
 [<ffffffff8103fdc5>] task_group+0x80/0x8f
 [<ffffffff8103fdeb>] set_task_rq+0x17/0x73
 [<ffffffff81491cbf>] init_idle+0xea/0x131
 [<ffffffff8149208f>] fork_idle+0x92/0xa3
 [<ffffffff8107e788>] ? mark_held_locks+0x50/0x72
 [<ffffffff8148fb30>] do_fork_idle+0x1c/0x2d
 [<ffffffff8148fc78>] do_boot_cpu+0x137/0x9b1
 [<ffffffff8148fb14>] ? do_fork_idle+0x0/0x2d
 [<ffffffff81490d16>] native_cpu_up+0x100/0x1c2
 [<ffffffff81492168>] _cpu_up+0x9d/0xf9
 [<ffffffff81492297>] cpu_up+0xd3/0xe5
 [<ffffffff81d78d86>] kernel_init+0x105/0x2c9
 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81499450>] ? restore_args+0x0/0x30
 [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9
 [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
Booting Node   0, Processors  #1lockdep: fixing up alternatives.
 #2lockdep: fixing up alternatives.
 #3 Ok.
Brought up 4 CPUs
Total of 4 processors activated (22588.67 BogoMIPS).
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=1168 bytes
sizeof(dentry)=256 bytes
sizeof(ext3inode)=1600 bytes
sizeof(buffer_head)=104 bytes
sizeof(skbuff)=240 bytes
sizeof(task_struct)=9216 bytes
devtmpfs: initialized
atomic64 test passed for x86-64 platform with CX8 and with SSE

Comment 3 Tom Horsley 2010-08-28 03:07:39 UTC

In case anyone is interested, I just got the new 2.6.35.4-12.fc14.x86_64
kernel update, and I still see the same problem at boot.

Comment 4 Tom Horsley 2010-09-10 16:42:40 UTC

This mysterious error still scrolls past when booting the x86_64 Beta TC1 iso
image.

Comment 5 James Laska 2010-09-13 12:59:51 UTC

The question for this bug is whether the issue is specific to your hardware, or generic enough for most users.  I *suspect* this would be covered under the following Final release criteria, but this would be contingent upon how common this failure is for other users:

  "In most cases, there must be no SELinux 'AVC: denied' messages or abrt crash notifications on initial boot and subsequent login"
        
Out of curiousity, does ABRT detect and notify you of the kernel crash?

Comment 6 Tom London 2010-09-13 13:08:44 UTC

I do not receive notification from ABRT.

Comment 7 Tom Horsley 2010-09-18 14:08:46 UTC

Just got a kernel update to kernel-2.6.35.4-28.fc14.x86_64, I still see
a similar INFO message in /var/log/dmesg, but the details of the
walkback are quite different:

...
atomic64 test passed for x86-64 platform with CX8 and with SSE
Time: 13:56:59  Date: 09/18/10
NET: Registered protocol family 16

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
include/linux/cgroup.h:542 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
1 lock held by swapper/1:
 #0:  (net_mutex){+.+.+.}, at: [<ffffffff813e9010>] register_pernet_subsys+0x1f/0x47

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.35.4-28.fc14.x86_64 #1
Call Trace:
 [<ffffffff8107bd3a>] lockdep_rcu_dereference+0xaa/0xb3
 [<ffffffff813e04b9>] sock_update_classid+0x7c/0xa2
 [<ffffffff813e054a>] sk_alloc+0x6b/0x77
 [<ffffffff8140b281>] __netlink_create+0x37/0xab
 [<ffffffff813f941c>] ? rtnetlink_rcv+0x0/0x2d
 [<ffffffff8140cee1>] netlink_kernel_create+0x74/0x19d
 [<ffffffff8149c3ca>] ? __mutex_lock_common+0x339/0x35b
 [<ffffffff813f7e9c>] rtnetlink_net_init+0x2e/0x48
 [<ffffffff813e8d7a>] ops_init+0xe9/0xff
 [<ffffffff813e8f0d>] register_pernet_operations+0xab/0x130
 [<ffffffff813e901f>] register_pernet_subsys+0x2e/0x47
 [<ffffffff81db7bca>] rtnetlink_init+0x53/0x102
 [<ffffffff81db835c>] netlink_proto_init+0x126/0x143
 [<ffffffff81db8236>] ? netlink_proto_init+0x0/0x143
 [<ffffffff810021b8>] do_one_initcall+0x72/0x186
 [<ffffffff81d78ebc>] kernel_init+0x23b/0x2c9
 [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
 [<ffffffff8149e2d0>] ? restore_args+0x0/0x30
 [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9
 [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
ACPI: bus type pci registered
PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xe0000000-0xefffffff] (base 0xe0000000)
PCI: not using MMCONFIG
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
...

In case hardware info for this machine is useful, the smolt profile
is here:

http://www.smolts.org/client/show/pub_36882a52-8a36-46bd-920d-17d1a49dbfda

Comment 8 Jon Masters 2010-09-19 05:53:18 UTC

I am seeing this traceback on all recent rawhide and upstream kernels. Lenovo T60p laptop system. Poking into it a little tonight.

Comment 9 Jeff Raber 2010-09-21 20:32:11 UTC

(In reply to comment #7)
> Just got a kernel update to kernel-2.6.35.4-28.fc14.x86_64, I still see
> a similar INFO message in /var/log/dmesg, but the details of the
> walkback are quite different:
> 

You are seeing bug 572520.

Comment 10 Jeff Raber 2010-09-21 20:34:31 UTC

Duping to 622149

*** This bug has been marked as a duplicate of bug 622149 ***

Comment 11 Jeff Raber 2010-09-21 20:49:16 UTC

Reopening. I didn't realize that this bug is a F14blocker.  Sorry for the spam.

Comment 12 Jeff Raber 2010-09-21 20:51:35 UTC

*** Bug 622149 has been marked as a duplicate of this bug. ***

Comment 13 IBM Bug Proxy 2010-09-21 20:54:01 UTC

Created attachment 448807 [details]
dmesg

Comment 14 Chuck Ebbert 2010-09-24 02:10:13 UTC

Why is this a blocker?

Comment 15 Jon Masters 2010-09-30 04:42:11 UTC

I don't think it should be a blocker. It's not causing crashes directly that I know of.

Comment 16 IBM Bug Proxy 2010-10-04 06:21:37 UTC

------- Comment From Subrata Modak subrata.ibm.com 2010-10-04 02:11 EDT-------
This is still seen with Fedora 14 Beta:




node 0 link 0: io port [1000, 4fff]

Regards--
Subrata

Comment 17 IBM Bug Proxy 2010-10-07 00:31:28 UTC

Created attachment 452016 [details]
Proposed patch for include/linux/cgroup.h:542 splat


------- Comment on attachment From paulmck.com 2010-10-06 20:24 EDT-------


Subrata, does this address it?

Comment 18 Tom Horsley 2010-10-10 00:42:39 UTC

I don't know if the messages went away because the newest f14 kernel
turned off debug flags or because the real problem is fixed, but I no
longer see the INFO walkbacks in the latest kernel:

kernel-2.6.35.6-39.fc14.x86_64

Comment 19 Tom London 2010-10-14 04:41:58 UTC

Seeing this with Rawhide kernel-2.6.36-0.36.rc7.git3.fc15.x86_64:


[ 6263.186242] kvm: 7158: cpu0 unhandled wrmsr: 0x198 data 0
[38892.020603] 
[38892.020608] ===================================================
[38892.020717] [ INFO: suspicious rcu_dereference_check() usage. ]
[38892.020790] ---------------------------------------------------
[38892.020864] kernel/sched.c:618 invoked rcu_dereference_check() without protection!
[38892.020960] 
[38892.020961] other info that might help us debug this:
[38892.020964] 
[38892.021068] 
[38892.021070] rcu_scheduler_active = 1, debug_locks = 0
[38892.021156] 1 lock held by pidgin/2160:
[38892.021204]  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff8104583a>] task_fork_fair+0x57/0x109
[38892.021339] 
[38892.021339] stack backtrace:
[38892.021339] Pid: 2160, comm: pidgin Not tainted 2.6.36-0.36.rc7.git3.fc15.x86_64 #1
[38892.021339] Call Trace:
[38892.021339]  [<ffffffff8107d70c>] lockdep_rcu_dereference+0xaa/0xb3
[38892.021339]  [<ffffffff81040334>] task_group+0x80/0x8f
[38892.021339]  [<ffffffff8104035a>] set_task_rq+0x17/0x73
[38892.021339]  [<ffffffff8104585c>] task_fork_fair+0x79/0x109
[38892.021339]  [<ffffffff8104a390>] sched_fork+0x106/0x18a
[38892.021339]  [<ffffffff8104f8d5>] copy_process+0x6bb/0x1208
[38892.021339]  [<ffffffff8107cb67>] ? lock_release_holdtime+0x54/0x5b
[38892.021339]  [<ffffffff810505e8>] do_fork+0x18b/0x382
[38892.021339]  [<ffffffff810fb094>] ? might_fault+0x5c/0xac
[38892.021339]  [<ffffffff8107fe21>] ? lock_release+0x19a/0x1a6
[38892.021339]  [<ffffffff81009d2a>] ? sysret_check+0x2e/0x69
[38892.021339]  [<ffffffff81011873>] sys_clone+0x28/0x2a
[38892.021339]  [<ffffffff8100a093>] stub_clone+0x13/0x20
[38892.021339]  [<ffffffff81009cf2>] ? system_call_fastpath+0x16/0x1b

Comment 20 Michał Piotrowski 2010-10-23 16:35:18 UTC

Same here

uname -r
2.6.36-1.fc14.x86_64

I checked dmesg, because system become quite "not responsible" - samba and ssh

[85329.717297] ===================================================
[85329.717338] [ INFO: suspicious rcu_dereference_check() usage. ]
[85329.717362] ---------------------------------------------------
[85329.717386] kernel/sched.c:618 invoked rcu_dereference_check() without protection!
[85329.717413]
[85329.717415] other info that might help us debug this:
[85329.717418]
[85329.717448]
[85329.717450] rcu_scheduler_active = 1, debug_locks = 0
[85329.717477] 1 lock held by sh/20519:
[85329.717494]  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff8104583a>] task_fork_fair+0x57/0x109
[85329.717624]
[85329.717626] stack backtrace:
[85329.717789] Pid: 20519, comm: sh Not tainted 2.6.36-1.fc14.x86_64 #1
[85329.717884] Call Trace:
[85329.717980]  [<ffffffff8107d7ec>] lockdep_rcu_dereference+0xaa/0xb3
[85329.718099]  [<ffffffff81040334>] task_group+0x80/0x8f
[85329.718196]  [<ffffffff8104035a>] set_task_rq+0x17/0x73
[85329.718226]  [<ffffffff8104585c>] task_fork_fair+0x79/0x109
[85329.718226]  [<ffffffff8104a390>] sched_fork+0x106/0x18a
[85329.718226]  [<ffffffff8104f8d5>] copy_process+0x6bb/0x1208
[85329.718226]  [<ffffffff8107cc47>] ? lock_release_holdtime+0x54/0x5b
[85329.718226]  [<ffffffff810505e8>] do_fork+0x18b/0x382
[85329.718226]  [<ffffffff810fb174>] ? might_fault+0x5c/0xac
[85329.718226]  [<ffffffff8107ff01>] ? lock_release+0x19a/0x1a6
[85329.718226]  [<ffffffff81009d2a>] ? sysret_check+0x2e/0x69
[85329.718226]  [<ffffffff81011873>] sys_clone+0x28/0x2a
[85329.718226]  [<ffffffff8100a093>] stub_clone+0x13/0x20
[85329.718226]  [<ffffffff81009cf2>] ? system_call_fastpath+0x16/0x1b

Comment 21 Dave Jones 2012-02-06 17:38:50 UTC

this is a pretty old bug now.. Is it still happening with the latest kernels ?

Comment 22 Tom London 2012-02-06 18:26:11 UTC

I haven't seen this is quite a long time.

Note You need to log in before you can comment on or make changes to this bug.