Bug 1026470

Summary: intermittent "suspicious RCU usage" warnings implicating kprobes/irq-handlers while running systemtap testsuite
Product: [Fedora] Fedora Reporter: Frank Ch. Eigler <fche>
Component: kernelAssignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 22CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, masami.hiramatsu.pt, srostedt
Target Milestone: ---Flags: jforbes: needinfo?
Target Release: ---   
Hardware: i686   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-11-23 17:20:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Frank Ch. Eigler 2013-11-04 17:49:40 UTC
[ 3703.267825] stap_92b44e5a05375eeb3bfc3fe92be3b28_28739: systemtap: 2.4/0.157, base: f813f000, memory: 17data/24text/3ctx/2058net/34al
loc kb, probes: 3
[ 3723.625404] stap_dba1fa5fe1d005323bddb960291d3436_28996: systemtap: 2.4/0.157, base: f83c9000, memory: 1662data/36text/8ctx/2058net/33alloc kb, probes: 91

[ 3723.632030] ===============================
[ 3723.632030] [ INFO: suspicious RCU usage. ]
[ 3723.632030] 3.12.0-0.rc7.git2.1.fc21.i686+PAE #1 Tainted: GF       W  O
[ 3723.632030] -------------------------------
[ 3723.632030] include/linux/rcupdate.h:775 rcu_read_lock() used illegally while idle!
[ 3723.632030] 
other info that might help us debug this:

[ 3723.632030] 
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
[ 3723.632030] RCU used illegally from extended quiescent state!
[ 3723.632030] 1 lock held by swapper/1/0:
[ 3723.632030]  #0:  (rcu_read_lock){.+.+..}, at: [<c0a81790>] atomic_notifier_call_chain+0x0/0x110
[ 3723.632030] 
stack backtrace:
[ 3723.632030] CPU: 1 PID: 0 Comm: swapper/1 Tainted: GF       W  O 3.12.0-0.rc7.git2.1.fc21.i686+PAE #1
[ 3723.632030] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 3723.632030]  00000000 00000000 f4d41e40 c0a74cb9 00000001 f4d41e68 c04cc956 c0c3b254
[ 3723.632030]  c0c2ef2c 00000001 00000000 f4d3ec00 00000002 f4d41e94 c0d2e7e0 f4d41e8c
[ 3723.632030]  c0a8186c 00000002 00000001 00000000 c0a81790 f4d41ecc 00000000 00000001
[ 3723.632030] Call Trace:
[ 3723.632030]  [<c0a74cb9>] dump_stack+0x4b/0x66
[ 3723.632030]  [<c04cc956>] lockdep_rcu_suspicious+0xd6/0x110
[ 3723.632030]  [<c0a8186c>] atomic_notifier_call_chain+0xdc/0x110
[ 3723.632030]  [<c0a81790>] ? __atomic_notifier_call_chain+0x110/0x110
[ 3723.632030]  [<c0a818cd>] notify_die+0x2d/0x30
[ 3723.632030]  [<c0a7e53f>] do_int3+0x7f/0x100
[ 3723.632030]  [<c0a7e258>] int3+0x38/0x40
[ 3723.632030]  [<c04900e0>] ? ftrace_raw_event_sched_process_fork+0x90/0x100
[ 3723.632030]  [<c0465fd1>] ? irq_enter+0x1/0x80
[ 3723.632030]  [<c0442eee>] ? smp_apic_timer_interrupt+0x2e/0x50
[ 3723.632030]  [<c0a7dc95>] apic_timer_interrupt+0x39/0x40
[ 3723.632030]  [<c044e8f5>] ? native_safe_halt+0x5/0x10
[ 3723.632030]  [<c041d6b1>] default_idle+0x21/0x220
[ 3723.632030]  [<c041e056>] arch_cpu_idle+0x36/0x40
[ 3723.632030]  [<c04b8e3b>] cpu_startup_entry+0x9b/0x350
[ 3723.632030]  [<c044279b>] ? setup_APIC_timer+0xab/0x130
[ 3723.632030]  [<c0440fcd>] start_secondary+0x21d/0x2e0


The trigger appears to be the systemtap pr14546.stp test case:

https://www.sourceware.org/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.base/pr14546.exp
https://www.sourceware.org/git/gitweb.cgi?p=systemtap.git;a=blob;f=testsuite/systemtap.base/pr14546.stp

wherein kprobes are placed on all permitted kernel.function("irq_*").call
functions, and backtraces generated.  These tests appear to run correctly,
as far as systemtap is concerned.  For for this particular kernel, those
irq_* probes expand to these functions:

irq_disable@arch/x86/lguest/boot.c:242
irq_ctx_init@arch/x86/kernel/irq_32.c:122
irq_fpu_usable@arch/x86/kernel/i387.c:64
irq_force_complete_move@arch/x86/kernel/apic/io_apic.c:2272
irq_complete_move@arch/x86/kernel/apic/io_apic.c:2267
irq_polarity@arch/x86/kernel/apic/io_apic.c:851
irq_trigger@arch/x86/kernel/apic/io_apic.c:893
irq_handler@arch/x86/kernel/vm86_32.c:742
irq_exit@kernel/softirq.c:362
irq_enter@kernel/softirq.c:311
irq_to_desc@kernel/irq/irqdesc.c:111
irq_get_next_irq@kernel/irq/irqdesc.c:422
irq_reserve_irqs@kernel/irq/irqdesc.c:398
irq_set_percpu_devid@kernel/irq/irqdesc.c:458
irq_free_descs@kernel/irq/irqdesc.c:326
irq_nested_primary_handler@kernel/irq/manage.c:650
irq_default_primary_handler@kernel/irq/manage.c:641
irq_set_irq_wake@kernel/irq/manage.c:507
irq_set_affinity@kernel/irq/manage.c:196
irq_thread_fn@kernel/irq/manage.c:793
irq_set_affinity_notifier@kernel/irq/manage.c:261
irq_thread@kernel/irq/manage.c:840
irq_select_affinity_usr@kernel/irq/manage.c:340
irq_forced_thread_fn@kernel/irq/manage.c:777
irq_can_set_affinity@kernel/irq/manage.c:87
irq_finalize_oneshot@kernel/irq/manage.c:679
irq_thread_dtor@kernel/irq/manage.c:810
irq_thread_check_affinity@kernel/irq/manage.c:733
irq_set_thread_affinity@kernel/irq/manage.c:107
irq_affinity_notify@kernel/irq/manage.c:224
irq_set_affinity_hint@kernel/irq/manage.c:211
irq_do_set_affinity@kernel/irq/manage.c:146
irq_wait_for_poll@kernel/irq/spurious.c:38
irq_get_irq_data@kernel/irq/chip.c:147
irq_percpu_enable@kernel/irq/chip.c:238
irq_modify_status@kernel/irq/chip.c:704
irq_set_chip_and_handler_name@kernel/irq/chip.c:696
irq_enable@kernel/irq/chip.c:206
irq_set_irq_type@kernel/irq/chip.c:56
irq_percpu_disable@kernel/irq/chip.c:247
irq_set_msi_desc_off@kernel/irq/chip.c:100
irq_set_chip@kernel/irq/chip.c:28
irq_set_handler_data@kernel/irq/chip.c:79
irq_disable@kernel/irq/chip.c:229
irq_cpu_online@kernel/irq/chip.c:736
irq_shutdown@kernel/irq/chip.c:193
irq_set_msi_desc@kernel/irq/chip.c:122
irq_startup@kernel/irq/chip.c:175
irq_set_chip_data@kernel/irq/chip.c:134
irq_cpu_offline@kernel/irq/chip.c:766
irq_domain_associate_many@kernel/irq/irqdomain.c:324
irq_create_strict_mappings@kernel/irq/irqdomain.c:453
irq_find_host@kernel/irq/irqdomain.c:189
irq_domain_associate@kernel/irq/irqdomain.c:269
irq_domain_remove@kernel/irq/irqdomain.c:76
irq_dispose_mapping@kernel/irq/irqdomain.c:509
irq_domain_xlate_twocell@kernel/irq/irqdomain.c:672
irq_create_mapping@kernel/irq/irqdomain.c:387
irq_set_default_host@kernel/irq/irqdomain.c:225
irq_domain_add_legacy@kernel/irq/irqdomain.c:165
irq_find_mapping@kernel/irq/irqdomain.c:531
irq_create_of_mapping@kernel/irq/irqdomain.c:468
irq_create_direct_mapping@kernel/irq/irqdomain.c:348
irq_domain_xlate_onecell@kernel/irq/irqdomain.c:653
irq_domain_xlate_onetwocell@kernel/irq/irqdomain.c:695
irq_domain_add_simple@kernel/irq/irqdomain.c:122
irq_affinity_proc_show@kernel/irq/proc.c:65
irq_spurious_proc_open@kernel/irq/proc.c:257
irq_affinity_list_proc_open@kernel/irq/proc.c:137
irq_node_proc_show@kernel/irq/proc.c:226
irq_affinity_proc_open@kernel/irq/proc.c:132
irq_node_proc_open@kernel/irq/proc.c:234
irq_affinity_hint_proc_show@kernel/irq/proc.c:39
irq_affinity_list_proc_show@kernel/irq/proc.c:70
irq_affinity_proc_write@kernel/irq/proc.c:120
irq_spurious_proc_show@kernel/irq/proc.c:247
irq_affinity_hint_proc_open@kernel/irq/proc.c:142
irq_affinity_list_proc_write@kernel/irq/proc.c:126
irq_move_irq@kernel/irq/migration.c:51
irq_move_masked_irq@kernel/irq/migration.c:7
irq_pm_syscore_resume@kernel/irq/pm.c:67
irq_work_queue@kernel/irq_work.c:64
irq_work_needs_cpu@kernel/irq_work.c:89
irq_work_run@kernel/irq_work.c:153
irq_work_cpu_notify@kernel/irq_work.c:174
irq_work_sync@kernel/irq_work.c:164
irq_cpu_rmap_notify@lib/cpu_rmap.c:253
irq_cpu_rmap_release@lib/cpu_rmap.c:268
irq_cpu_rmap_add@lib/cpu_rmap.c:288
irq_show@drivers/pci/pci-sysfs.c:53
irq_from_evtchn@drivers/xen/events.c:241

From the dmesg entry, it appears that the post-int3 or kprobes-on-ftrace
handling for irq_entry() was broken.  Note that "perf probe" appears
uncooperative:

sudo perf probe -a irq_entry 
Probe point 'irq_entry' not found.
  Error: Failed to add events. (-2)

Comment 1 Jaroslav Reznik 2015-03-03 15:11:15 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

Comment 2 Justin M. Forbes 2015-10-20 19:34:33 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs.

Fedora 22 has now been rebased to 4.2.3-200.fc22.  Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23.

If you experience different issues, please open a new bug report for those.

Comment 3 Fedora Kernel Team 2015-11-23 17:20:10 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.