Bug 715418

Summary: ftrace causing large latency when inserting module
Product: Red Hat Enterprise MRG Reporter: Steven Rostedt <srostedt>
Component: realtime-kernelAssignee: Luis Claudio R. Goncalves <lgoncalv>
Status: CLOSED ERRATA QA Contact: David Sommerseth <davids>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 2.0CC: bhu, jkodak, lgoncalv, ovasik, williams
Target Milestone: 2.0.2Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: Interrupts are disabled before ftrace performs the code modification of mcounts into nops. Consequence: Large latencies (up to 1ms) could be observed when inserting modules during an ftrace session. Fix: The interrupts only need to be disabled on start up around this code, not when modules are being loaded. Result: These ftrace induced latencies have been removed.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-08-22 05:57:04 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
Do not disable interrupts for modules in mcount update none

Description Steven Rostedt 2011-06-22 20:32:31 UTC
Created attachment 506081 [details]
Do not disable interrupts for modules in mcount update

Description of problem:
When I mounted an NFS directory, it caused several modules to be loaded. At the time I was running the preemptirqsoff tracer, and it showed the following output:

# tracer: preemptirqsoff
#
# preemptirqsoff latency trace v1.1.5 on 2.6.33.9-rt30-mrg-test
# --------------------------------------------------------------------
# latency: 1177 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
#    -----------------
#    | task: modprobe-19370 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#  => started at: ftrace_module_notify
#  => ended at:   ftrace_module_notify
#
#
#                  _------=> CPU#            
#                 / _-----=> irqs-off        
#                | / _----=> need-resched    
#                || / _---=> hardirq/softirq 
#                ||| / _--=> preempt-depth   
#                |||| /_--=> lock-depth       
#                |||||/     delay             
#  cmd     pid   |||||| time  |   caller      
#     \   /      ||||||   \   |   /           
modprobe-19370   3d....    0us!: ftrace_process_locs <-ftrace_module_notify
modprobe-19370   3d.... 1176us : ftrace_process_locs <-ftrace_module_notify
modprobe-19370   3d.... 1178us : trace_hardirqs_on <-ftrace_module_notify
modprobe-19370   3d.... 1178us : <stack trace>
 => ftrace_process_locs
 => ftrace_module_notify
 => notifier_call_chain
 => __blocking_notifier_call_chain
 => blocking_notifier_call_chain
 => sys_init_module
 => system_call_fastpath

That's over 1ms that interrupts are disabled on a Real-Time kernel!

Looking at the cause (being the ftrace author helped), I found that the interrupts are disabled before the code modification of mcounts into nops. The interrupts only need to be disabled on start up around this code, not when modules are being loaded.

Comment 2 Luis Claudio R. Goncalves 2011-06-27 22:16:19 UTC
There was a confusion in the patch queue. This patch is indeed needed and has been added to kernel-rt-2.6.33.9-rt31.69.

Comment 3 Luis Claudio R. Goncalves 2011-07-18 16:44:25 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: Interrupts are disabled before ftrace performs the code modification of mcounts into nops.
Consequence: Large latencies (up to 1ms) could be observed when inserting modules during an ftrace session. 
Fix: The interrupts only need to be disabled on start up around this code, not when modules are being loaded.
Result: These ftrace induced latencies have been removed.

Comment 4 David Sommerseth 2011-08-11 16:51:55 UTC
Verified by running a simple reproducing routine.

* Boot kernel-rt-trace
* mount -t debugfs none /sys/kernel/debug/
* echo preemptirqsoff > /sys/kernel/debug/tracing/current_tracer
* modprobe nfs
* cat /sys/kernel/debug/tracing/trace

Tried this routine on both 2.6.33.9-rt31.66.el6rt and 2.6.33.9-rt31.73.el6rt.  On the former kernel, a long latency lasting over a second was observed.  On the new kernel this latency was not found.

Comment 5 errata-xmlrpc 2011-08-22 05:57:04 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1192.html