Bug 655908 - hang/crash on massive kprobing
Summary: hang/crash on massive kprobing
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Petr Oros
QA Contact: Ziqian SUN (Zamir)
URL:
Whiteboard:
Duplicates: 985734
Depends On: 655904
Blocks: 831833 846704 1270638 1359574 1496722
 
Reported: 2010-11-22 16:58 UTC by Frank Ch. Eigler
Modified: 2017-12-06 10:34 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 655904
Clones: 831833
Environment:
Last Closed: 2017-12-06 10:34:44 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 985734 1 None None None 2021-01-20 06:05:38 UTC

Internal Links: 985734

Description Frank Ch. Eigler 2010-11-22 16:58:33 UTC
+++ This bug was initially created as a clone of Bug #655904 +++
(This applies to all kernel versions > 2.6.9 we've ever seen.)

A longstanding problem in the linux kernel has been its failure to protect
itself against massive kprobe sessions, for example with systemtap scripts like:

   probe kernel.function("*") {}

The important thing to note is that systemtap is not required to show
this problem.  "perf probe" can do it, as can the following recipe, which
builds an absolutely minimal kprobes-using kernel module, and applies it
to function entry points (as gleaned from /proc/kallsyms).  (With systemtap,
we can easily place probes into the bodies of functions too, and of course
that crashes even "harder", but let's leave that till later.)

  git clone git://sourceware.org/git/systemtap.git
  cd systemtap/scripts/kprobes_test
  sh gen_code_all.sh
  insmod kprobe_module.ko
  <bang>

There may be multiple causes, such as inadequate __kprobes markup, or
exception handling, or unknown factors.

See also

http://sourceware.org/PR275
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=604453

Comment 2 Masami Hiramatsu 2010-12-16 13:09:43 UTC
>   probe kernel.function("*") {}

Does this systemtap script include all inlined functions too?
If not, we can test this on a kernel that supports the ftrace kprobe-tracer.
The following recipe should cause a kernel panic.

# sort /proc/kallsyms | egrep '[0-9a-f]+ [Tt] [^[]*$' | cut -f 3 -d" " > syms.list
# for i in `cat syms.list`; do echo "p $i" >> /sys/kernel/debug/tracing/kprobe_events ;done
# echo 1 >  /sys/kernel/debug/tracing/events/enable
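
The recipe above can also be split into two steps, so that the symbol list can be inspected before any probe is registered. This is only a sketch: the helper names `list_text_syms` and `gen_kprobe_cmds`, and the optional file argument, are assumptions made for illustration and are not part of the original recipe. Registering and enabling probes on every symbol is expected to hang or panic an affected kernel, so this should only be tried on a disposable test machine.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Print the names of text symbols (type T or t) from a kallsyms-style file.
# Module symbols carry a trailing "[module]" field and are skipped,
# matching the intent of the egrep pattern in the recipe above.
list_text_syms() {
    awk '($2 == "T" || $2 == "t") && NF == 3 { print $3 }' "${1:-/proc/kallsyms}" |
        LC_ALL=C sort -u
}

# Turn a symbol list on stdin into kprobe_events definitions ("p <symbol>").
# Nothing is written to debugfs here; the output is only printed.
gen_kprobe_cmds() {
    while read -r sym; do
        printf 'p %s\n' "$sym"
    done
}
```

For example, `list_text_syms > syms.list` followed by `gen_kprobe_cmds < syms.list` prints the definitions that the loop above appends, one line at a time, to /sys/kernel/debug/tracing/kprobe_events.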

Comment 3 RHEL Program Management 2011-01-07 04:22:07 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 4 Suzanne Logcher 2011-01-07 16:19:21 UTC
This request was erroneously denied for the current release of Red Hat
Enterprise Linux.  The error has been fixed and this request has been
re-proposed for the current release.

Comment 5 RHEL Program Management 2011-02-01 05:53:07 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 6 RHEL Program Management 2011-02-01 18:53:20 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 10 RHEL Program Management 2011-10-07 15:18:10 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 12 Masami Hiramatsu 2015-06-16 10:40:29 UTC
Please check my slide at the last year's LinuxCon Japan.

http://events.linuxfoundation.jp/sites/events/files/slides/Handling%20the%20Massive%20Multiple%20Kprobes%20v2_1.pdf

Unfortunately, that enhancement has not been accepted upstream yet...

Comment 13 Pratyush Anand 2015-07-16 06:35:26 UTC
(In reply to Masami Hiramatsu from comment #12)
> Please check my slide at the last year's LinuxCon Japan.
> 
> http://events.linuxfoundation.jp/sites/events/files/slides/Handling%20the%20Massive%20Multiple%20Kprobes%20v2_1.pdf
> 

Thanks a lot for the pointer. So basically there are two aspects of this bug:

a) When massive kprobe is enabled, system crashes.
b) When massive kprobe is enabled, system becomes extremely slow. 

'a' is mostly arch-specific and requires blacklisting all symbols that are not kprobable (mainly entry routines and subroutines that lie in the path of the exceptions used for breakpoint and single-step handling).

> Unfortunately, that enhancement has not been accepted upstream yet...

and 'b' should be resolved to a large extent by your enhancement. Thanks for revisiting this patch series. I have rebased the patches onto the latest fedora-arm64 kernel [2] and tested them on my ARM64 board.

I was not subscribed to the systemtap mailing list (I have just subscribed). However, I noticed your reply [1].

[1] https://www.sourceware.org/ml/systemtap/2015-q3/msg00039.html
[2] https://github.com/pratyushanand/linux.git:fedora_arm64_uprobe_devel (880df93e2dac)
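
As a user-space approximation of the blacklisting described for aspect 'a' in the comment above, a probe candidate list can be filtered against the kernel's own kprobe blacklist before any probes are registered. This is a sketch under stated assumptions: it assumes a kernel that exposes /sys/kernel/debug/kprobes/blacklist (newer upstream kernels do, the RHEL 6 kernel in this bug may not), it assumes each blacklist line ends with the symbol name, and the helper name `drop_blacklisted` is hypothetical.

```shell
#!/bin/sh
# Sketch only: drop_blacklisted is a hypothetical helper.

# drop_blacklisted SYMS BLACKLIST
# Print every symbol from SYMS (one per line) whose name does not appear
# as the last field of a line in BLACKLIST.
drop_blacklisted() {
    # ARGV[1] is the blacklist; lines from it only populate the lookup table.
    awk 'FILENAME == ARGV[1] { bad[$NF] = 1; next }
         !($1 in bad)' "$2" "$1"
}
```

For example, `drop_blacklisted syms.list /sys/kernel/debug/kprobes/blacklist > safe.list` would leave only the candidates that the kernel has not marked as unprobable. This does not cover symbols that are unsafe to probe but missing from the blacklist, which is the gap this bug is about.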

Comment 14 Pratyush Anand 2015-11-06 12:34:45 UTC
Sorry for coming back to this late. I worked on it a bit over the last few days, specifically on the possibility of analyzing hot spots when massive kprobes are instrumented on the ARM64 platform (as requested by Masami in [1]).
We still had some crash issues on ARM64 (point 'a' in comment 13), which have now been resolved to some extent. However, `perf report` did not show any load for the arm64 kprobe_breakpoint_handler when massive kprobes were instrumented, because the current arm64 implementation does not support interrupt generation (and hence PMU events) while a debug exception is being handled.

Having said that, there could be ways to patch the kernel to support profiling of the kprobe handler on ARM64. I can work on that item, but it will take some time. However, that work is not directly related to the resolution of this bug.

Masami, since the above work may take some time, can I help in some way on top of the work [1, 2] that you have already done, so that upstreaming of the patches that resolve this BZ goes a bit faster?

[1] https://www.sourceware.org/ml/systemtap/2015-q3/msg00039.html
[2] https://lkml.org/lkml/2015/7/16/70

Comment 15 Ziqian SUN (Zamir) 2016-08-12 01:37:41 UTC
*** Bug 985734 has been marked as a duplicate of this bug. ***

Comment 26 Jan Kurik 2017-12-06 10:34:44 UTC
Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/

