Bug 173304

Summary: Fix for SystemTap bugzilla #1345 - return probe on do_execve
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Jim Keniston <kenistoj>
Assignee: Dave Anderson <anderson>
QA Contact: Jay Turner <jturner>
CC: jbaron, lwang, srevivo
Fixed In Version: RHSA-2006-0132
Doc Type: Bug Fix
Last Closed: 2006-03-07 20:48:16 UTC
Bug Blocks: 168429
Attachments (description; flags):
  Sample module to illustrate the bug; none
  This patch fixes the bug on all architectures.; none
  Patch for RHEL4 U3; none
  fixed version of rpfix-rhel4u3.patch; none

Description Jim Keniston 2005-11-16 06:24:21 UTC

Description of problem:
Kprobes has a bug described in SystemTap bugzilla #1345.  I have a patch for this bug, and would like to see the fix included in RHEL4 U3.  Elena Zannoni advised me to provide the patch via a new bugzilla entry.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Using the kprobes API, build a module that registers a return probe on sys_execve, do_execve, load_*_binary, flush_old_exec, or flush_thread.  Sample module will be attached.
2. insmod the module.
3. Run one or more binary executables -- e.g., "touch x".
  

Actual Results:  /var/log/messages shows a BUG report: "... kernel BUG at arch/i386/kernel/kprobes.c:310!"

The system may even crash or hang.

Expected Results:  No BUG report.  The system should continue normal operation.

Additional info:

Comment 1 Jim Keniston 2005-11-16 06:33:33 UTC
Created attachment 121110 [details]
Sample module to illustrate the bug

Compile and insmod this module to demonstrate the bug.

If the bug is fixed, /var/log/messages should show output such as the
following:
Nov 15 22:01:17 xxx kernel: Registering probes for sys_execve
Nov 15 22:01:17 xxx kernel: Registering probes for do_execve
Nov 15 22:01:17 xxx kernel: Registering probes for load_elf_binary
Nov 15 22:01:17 xxx kernel: Registering probes for flush_old_exec
[The following lines are displayed after you rmmod the module.  The number of
calls and returns depends on how many commands you run between insmod and
rmmod.]
Nov 15 22:01:45 xxx kernel: sys_execve: 21 calls, 21 returns, 0 missed
Nov 15 22:01:45 xxx kernel: do_execve: 21 calls, 21 returns, 0 missed
Nov 15 22:01:45 xxx kernel: load_elf_binary: 21 calls, 21 returns, 0 missed
Nov 15 22:01:45 xxx kernel: flush_old_exec: 21 calls, 21 returns, 0 missed

Comment 2 Jim Keniston 2005-11-16 06:37:37 UTC
Created attachment 121111 [details]
This patch fixes the bug on all architectures.

This patch has been tested on i386 and ppc64.  It will be tested on ia64 and
x86_64 by Nov. 16.

Comment 3 Jim Keniston 2005-11-17 01:45:47 UTC
Created attachment 121155 [details]
Patch for RHEL4 U3

This patch applies to RHEL4 U3.  The previously provided patch applies to the
upstream kernel, v2.6.15-rc1.

Comment 5 Dave Anderson 2005-11-18 18:42:57 UTC
> Patch for RHEL4 U3
>
> This patch applies to RHEL4 U3.  The previously provided patch applies to the
> upstream kernel, v2.6.15-rc1.

This patch no longer applies to the current RHEL4 tree.  Please provide
a fixed version:

$ patch -p1 --dry-run < $HOME/rpfix-rhel4u3.patch
patching file arch/i386/kernel/process.c
Hunk #1 succeeded at 337 (offset 2 lines).
patching file arch/ia64/kernel/process.c
Hunk #1 FAILED at 25.
Hunk #2 succeeded at 696 (offset 1 line).
1 out of 2 hunks FAILED -- saving rejects to file
arch/ia64/kernel/process.c.rej
patching file arch/ppc64/kernel/process.c
patching file arch/x86_64/kernel/process.c
$

...as well as posting ia64 and x86_64 test results.  Please also provide
a short explanation of what the original problem actually is, and how
removing the kprobe_flush_task() call from the 3 processor-specific
flush_thread() functions addresses the issue.

Sorry -- I have no experience or insight into kprobes/SystemTap...

Comment 6 Dave Anderson 2005-11-18 21:19:03 UTC
Created attachment 121255 [details]
fixed version of rpfix-rhel4u3.patch


The attached patch applies cleanly to the current RHEL4-U3 tree.

Comment 7 Jim Keniston 2005-11-19 00:43:51 UTC
In response to Comment #5:

Test results for x86_64 and ia64:
ia64 (as tested by anil.s.keshavamurthy) and x86_64 (as tested by me)
produce the desired results as described in Comment #1.

Problem description (sorry, it's not short):

From Documentation/kprobes.txt in the mainline kernel:
-----
1.3 How Does a Return Probe Work?
                                                                                
When you call register_kretprobe(), Kprobes establishes a kprobe at
the entry to the function.  When the probed function is called and this
probe is hit, Kprobes saves a copy of the return address, and replaces
the return address with the address of a "trampoline."  The trampoline
is an arbitrary piece of code -- typically just a nop instruction.
At boot time, Kprobes registers a kprobe at the trampoline.
                                                                                
When the probed function executes its return instruction, control
passes to the trampoline and that probe is hit.  Kprobes' trampoline
handler calls the user-specified handler associated with the kretprobe,
then sets the saved instruction pointer to the saved return address,
and that's where execution resumes upon return from the trap.
                                                                                
While the probed function is executing, its return address is
stored in an object of type kretprobe_instance.  Before calling
register_kretprobe(), the user sets the maxactive field of the
kretprobe struct to specify how many instances of the specified
function can be probed simultaneously.  register_kretprobe()
pre-allocates the indicated number of kretprobe_instance objects.
-----
If a return-probed function never returns, the kretprobe_instance object will
never be recycled, and you'll quickly run out.  So when a program image is going
away (e.g., via do_exit()), we call kprobe_flush_task() to recycle all that
task's  kretprobe_instance objects.
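
The recycling argument above can be illustrated with a small user-space model.
This is only a sketch, not kernel code: struct instance, struct pool,
pool_get(), pool_flush_task() and missed_probes() are hypothetical names
invented for the illustration, and MAXACTIVE is an arbitrary stand-in for the
maxactive field.  It models ten tasks that each enter a probed function which
never returns, with and without a flush when the task goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAXACTIVE 4   /* illustrative; stands in for the kretprobe maxactive field */

/* Simplified stand-in for kretprobe_instance: one per in-flight probed call. */
struct instance {
    bool in_use;
    int task;                /* hypothetical id of the owning task */
    unsigned long ret_addr;  /* saved return address */
};

struct pool {
    struct instance slots[MAXACTIVE];
    int nmissed;             /* entries that found no free instance */
};

/* Probed function entered: claim a free instance, or record a miss. */
static struct instance *pool_get(struct pool *p, int task, unsigned long ret_addr)
{
    for (size_t i = 0; i < MAXACTIVE; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i] = (struct instance){ true, task, ret_addr };
            return &p->slots[i];
        }
    }
    p->nmissed++;
    return NULL;
}

/* Model of a per-task flush: recycle every instance the task still holds. */
static int pool_flush_task(struct pool *p, int task)
{
    int recycled = 0;
    for (size_t i = 0; i < MAXACTIVE; i++) {
        if (p->slots[i].in_use && p->slots[i].task == task) {
            p->slots[i].in_use = false;
            recycled++;
        }
    }
    return recycled;
}

/* Ten tasks each enter a probed function that never returns, then go away.
 * Returns how many entries were missed for lack of a free instance. */
int missed_probes(bool flush_on_exit)
{
    struct pool p = {0};
    for (int task = 0; task < 10; task++) {
        struct instance *ri = pool_get(&p, task, 0x1000UL + (unsigned long)task);
        if (ri != NULL)
            assert(ri->task == task);
        if (flush_on_exit)
            pool_flush_task(&p, task);   /* the task's image is discarded */
    }
    return p.nmissed;
}
```

In this model, missed_probes(false) returns 6 (only the first four entries
find a free instance) and missed_probes(true) returns 0, which is the effect
the flush at task exit is meant to have.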

do_execve() also discards the program image, so our original implementation also
called kprobe_flush_task() from flush_thread() (which is called from
do_execve()).  This was a mistake, since do_execve() retains the stack and does
indeed return.

For reasons I won't go into, this worked fine on the original architectures
(i386 and x86_64) but not on ppc64 and ia64.  The architectures got out of sync
(and RHEL4 got out of sync with the mainline kernel, apparently), and a
subsequent change to Kprobes changed this harmless mistake to a fatal one.

Correct practice is to call kprobe_flush_task() from do_exit() (via
exit_thread()), but not from do_execve() (via flush_thread()).  The patch
associated with comment #2 fixes this in the mainline kernel, and the patch in
#6 (but not #3, apparently) fixes this in RHEL4 U3.
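
To make the failure mode concrete, here is a user-space sketch of the sequence
described above.  It is a model under stated assumptions, not the kernel
implementation: on_entry(), flush_task(), on_trampoline(),
execve_returns_cleanly() and the TRAMPOLINE value are hypothetical, and the
model simply treats "trampoline hit with no saved instance" as the fatal case
without claiming that this is the exact check behind the reported BUG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4
#define TRAMPOLINE 0xdeadUL   /* placeholder trampoline address, illustrative only */

/* Simplified stand-in for kretprobe_instance. */
struct instance { bool in_use; int task; unsigned long ret_addr; };
struct pool { struct instance slots[NSLOTS]; };

/* Function entry: save the real return address, substitute the trampoline. */
static void on_entry(struct pool *p, int task, unsigned long *ret_slot)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        if (!p->slots[i].in_use) {
            p->slots[i] = (struct instance){ true, task, *ret_slot };
            *ret_slot = TRAMPOLINE;
            return;
        }
    }
    assert(0 && "no free instance");  /* not expected in this small model */
}

/* Model of a per-task flush: drop every instance the task holds. */
static void flush_task(struct pool *p, int task)
{
    for (size_t i = 0; i < NSLOTS; i++)
        if (p->slots[i].in_use && p->slots[i].task == task)
            p->slots[i].in_use = false;
}

/* Trampoline hit on return: restore the saved address and recycle the
 * instance.  Returns false if the task has no instance left. */
static bool on_trampoline(struct pool *p, int task, unsigned long *ret_slot)
{
    for (size_t i = 0; i < NSLOTS; i++) {
        if (p->slots[i].in_use && p->slots[i].task == task) {
            *ret_slot = p->slots[i].ret_addr;
            p->slots[i].in_use = false;
            return true;
        }
    }
    return false;
}

/* One return-probed do_execve()-style call.  The flag selects whether the
 * flush happens in the middle of the call (the mistake described above). */
bool execve_returns_cleanly(bool flush_in_flush_thread)
{
    struct pool p = {0};
    const unsigned long caller = 0x1234UL;  /* made-up return address */
    unsigned long ret_slot = caller;

    on_entry(&p, 1, &ret_slot);             /* entry probe fires */
    if (flush_in_flush_thread)
        flush_task(&p, 1);                  /* instance recycled too early */
    /* The function still returns, so the trampoline is still hit. */
    return on_trampoline(&p, 1, &ret_slot) && ret_slot == caller;
}
```

With the mid-call flush the return slot still points at the trampoline but the
saved address is gone, so execve_returns_cleanly(true) is false; without it,
execve_returns_cleanly(false) is true and the original return address is
restored.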

Comment 8 Dave Anderson 2005-11-21 14:37:08 UTC
> The patch associated with comment #2 fixes this in the mainline kernel,
> and the patch in #6 (but not #3, apparently) fixes this in RHEL4 U3.

Ok -- before this can be proposed for RHEL4, we need absolute test
results running i686, x86_64, ppc64 and ia64 RHEL4 kernels.  

In this location:

  http://people.redhat.com/anderson/BZ_173304

are the following binary rpms:

  kernel-smp-2.6.9-22.20.EL.bz173304.i686.rpm
  kernel-hugemem-2.6.9-22.20.EL.bz173304.i686.rpm
  kernel-smp-2.6.9-22.20.EL.bz173304.x86_64.rpm
  kernel-2.6.9-22.20.EL.bz173304.ia64.rpm
  kernel-2.6.9-22.20.EL.bz173304.ppc64.rpm

and the associated src.rpm used to build them:

  kernel-2.6.9-22.20.EL.bz173304.src.rpm

and the patch in the src.rpm that was applied to 2.6.9-22.20.EL:

  linux-kernel-test.patch

Note that there are both smp and hugemem i686 kernels.  We need test
results from both i686 kernels, due to a major screw-up with the original
kprobes patch that caused the use of gdb breakpoints on a user application
to crash the hugemem kernel. 


Comment 9 Jim Keniston 2005-11-22 01:06:58 UTC
Testing is in progress using the kernels you provided.  What constitutes
"absolute" test results?

Comment 10 Dave Anderson 2005-11-22 13:39:11 UTC
Thanks Jim -- we appreciate the extra effort -- and just a "thumbs up" on
the 5 kernels provided will be fine.

It's just that we can't simply go with testing on upstream
kernels only, and we're a little gun-shy after getting burnt by
the kprobes/gdb/hugemem fiasco...


Comment 11 Jim Keniston 2005-11-22 23:00:50 UTC
We have installed and tested the kernels you provided on the appropriate
architectures:
i686 smp - kevinrs.com
i686 hugemem - kevinrs.com
ppc64 - hien.com
ia64 - anil.s.keshavamurthy
x86_64 - jkenisto.com

The rp.c test (from Comment #1) passed on all architectures.

Comment 12 Dave Anderson 2005-11-23 19:21:13 UTC
Thanks Jim.  I've posted the patch internally for review.


Comment 17 Red Hat Bugzilla 2006-03-07 20:48:16 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html