Bug 643866 - stap script generates errors when using clone() with various namespace flags
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: systemtap
Version: 6.1
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Frank Ch. Eigler
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-10-18 10:41 UTC by Daniel Berrangé
Modified: 2011-05-19 13:54 UTC (History)

Fixed In Version: systemtap-1.4-1.el6
Doc Type: Bug Fix
Doc Text:
Clone Of: 634242
Environment:
Last Closed: 2011-05-19 13:54:46 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2011:0651
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: systemtap bug fix and enhancement update
Last Updated: 2011-05-19 09:37:25 UTC

Description Daniel Berrangé 2010-10-18 10:41:09 UTC
+++ This bug was initially created as a clone of Bug #634242 +++

Description of problem:
While testing the 'client.stp' script that is part of the libvirtd dtrace/systemtap support in

  https://bugzilla.redhat.com/show_bug.cgi?id=552387

the script prints many probe warnings when libvirtd starts up:

WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x415b6f)' addr 0000000000415b6f rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x414d65)' addr 0000000000414d65 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x415b78)' addr 0000000000415b78 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x414249)' addr 0000000000414249 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x414259)' addr 0000000000414259 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x413ff2)' addr 0000000000413ff2 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x415b8e)' addr 0000000000415b8e rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x41634e)' addr 000000000041634e rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x422a20)' addr 0000000000422a20 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x4235fe)' addr 00000000004235fe rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x4239ce)' addr 00000000004239ce rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x4229e0)' addr 00000000004229e0 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x42358b)' addr 000000000042358b rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x42395b)' addr 000000000042395b rc -3


By inserting some fprintf + sleep calls into libvirtd source, I determined that it seems to be the LXC driver clone() calls that trigger these problems.

Specifically this code in libvirt appears to make stap unhappy:



static int lxcContainerDummyChild(void *argv ATTRIBUTE_UNUSED)
{
    _exit(0);
}

int lxcContainerAvailable(int features)
{
    int flags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|
        CLONE_NEWIPC|SIGCHLD;
    int cpid;
    char *childStack;
    char *stack;
    int childStatus;

    if (features & LXC_CONTAINER_FEATURE_USER)
        flags |= CLONE_NEWUSER;

    if (features & LXC_CONTAINER_FEATURE_NET)
        flags |= CLONE_NEWNET;

    if (VIR_ALLOC_N(stack, getpagesize() * 4) < 0) {
        DEBUG0("Unable to allocate stack");
        return -1;
    }

    childStack = stack + (getpagesize() * 4);

    cpid = clone(lxcContainerDummyChild, childStack, flags, NULL);
    VIR_FREE(stack);
    if (cpid < 0) {
        char ebuf[1024];
        DEBUG("clone call returned %s, container support is not enabled",
              virStrerror(errno, ebuf, sizeof ebuf));
        return -1;
    } else {
        waitpid(cpid, &childStatus, 0);
    }

    return 0;
}


I expect one of those CLONE_* flags is causing violation of some assumption that systemtap has.


Version-Release number of selected component (if applicable):
systemtap-1.3-2.fc13.x86_64
kernel 2.6.34.6-54.fc13.x86_64
libvirt GIT latest + patches from BZ above.

How reproducible:
Always, if libvirtd has its 'LXC' driver enabled.

Steps to Reproduce:
1. In one shell run 'stap client.stp' from bz 552387
2. In another shell run /usr/sbin/libvirtd as root, again with patches from bz 552387
  
Actual results:
Missing probe warnings

Expected results:
No probe warnings

Additional info:

--- Additional comment from roland on 2010-09-15 13:56:08 EDT ---

I'm sure it's CLONE_NEWPID.  The uprobes internals do lookups by pid, which is fundamentally broken.  They need to be fixed to use task_struct or struct pid as keys, or something like that.

--- Additional comment from fche on 2010-09-17 10:37:46 EDT ---

roland advises this may be sufficient.  There are two other places in stap
where find_task_by_pid is used; it would be good to check whether those
need changing too.

diff --git a/runtime/uprobes/uprobes.c b/runtime/uprobes/uprobes.c
index 403de18..3f76ec6 100644
--- a/runtime/uprobes/uprobes.c
+++ b/runtime/uprobes/uprobes.c
@@ -876,7 +876,7 @@ static struct task_struct *uprobe_get_task(pid_t pid)
 {
 	struct task_struct *p;
 	rcu_read_lock();
-	p = find_task_by_pid(pid);
+	p = find_task_by_pid_ns(pid, &init_pid_ns);
 	if (p)
 		get_task_struct(p);
 	rcu_read_unlock();

--- Additional comment from berrange on 2010-09-17 13:17:29 EDT ---

I made that change to /usr/share/systemtap/runtime/uprobes/uprobes.c and deleted the cached kernel module so it re-built, but nothing appears to change. Is that file really still used ?  AFAICT, only the files in runtime/uprobes2/ are actually being compiled on my current host.

--- Additional comment from mjw on 2010-09-19 10:23:16 EDT ---

(In reply to comment #3)
> I made that change to /usr/share/systemtap/runtime/uprobes/uprobes.c and
> deleted the cached kernel module so it re-built, but nothing appears to change.
> Is that file really still used ?  AFAICT, only the files in runtime/uprobes2/
> are actually being compiled on my current host.

Yes, you are right: on modern kernels only uprobes2 is used. It also seems not to be fully namespace aware (for example, it uses find_vpid, which I believe isn't namespace aware), but I am not sure what changes are necessary to make it so.
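
As background for the namespace discussion above: a task inside a nested pid namespace has one pid per namespace level, so a bare pid number is only meaningful together with the namespace it is looked up in. The following is a minimal userspace sketch, not part of systemtap or uprobes. It assumes a kernel that exposes the "NSpid:" field in /proc/self/status (Linux 4.1 or later, which is newer than the kernels discussed in this bug), and the helper name read_nspid is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Read the "NSpid:" line of /proc/self/status and store up to max pids
 * in pids[]. The leftmost value is the pid as seen from the namespace of
 * the procfs mount; later values belong to successively nested namespaces.
 * Returns the number of pids stored, or -1 if the file or the field is
 * not available (assumption: the field exists only on Linux >= 4.1).
 */
int read_nspid(pid_t *pids, int max)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    char line[256];
    int count = -1;

    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "NSpid:", 6) != 0)
            continue;

        /* Parse the whitespace-separated numbers after the label. */
        count = 0;
        char *p = line + 6;
        while (count < max) {
            char *end;
            long v = strtol(p, &end, 10);
            if (end == p)        /* no more numbers on this line */
                break;
            pids[count++] = (pid_t)v;
            p = end;
        }
        break;
    }

    fclose(f);
    return count;
}
```

A process that is not inside a nested pid namespace gets a single value back; a process started with CLONE_NEWPID would be expected to get two or more, one per level.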

--- Additional comment from dsmith on 2010-09-20 14:42:53 EDT ---

Here's a small update.  I've duplicated this problem.  I've also discovered that specifying 'CLONE_NEWPID' (as was suspected) is certainly the problem.  Without 'CLONE_NEWPID', the error doesn't occur.

--- Additional comment from dsmith on 2010-10-06 15:23:12 EDT ---

I've fixed this upstream in several commits.

86229a5 fixes the problem for current kernels:

<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=patch;h=86229a5533de13b6ac6eeb34d9ea24e7cfb64faa>

e5a338c fixed the problem for rhel5-era kernels:

<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=patch;h=e5a338c3a2aeb1d5dfa27f4d30dd04bfd8c61ce4>

0ac3dce added a test case that tests those CLONE_* flags.

<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=patch;h=0ac3dce18d9bcadff5f2f5f9274a7b40889d1d1a>

There were actually 2 related problems here:

- When CLONE_NEWPID was used, systemtap was looking for the pid in the private pid namespace, not the public one.

- When CLONE_VM was used, uprobe probes got removed in the newly cloned process.
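
The first problem can be illustrated from userspace. This is a minimal sketch, not taken from the systemtap fix or its test case: it clones a child, has the child report its own view of its pid through a pipe, and compares that with the pid clone() returned to the parent. The names pid_pair, report_pid, clone_and_compare and CHILD_STACK_SIZE are hypothetical. CLONE_NEWPID needs CAP_SYS_ADMIN, so the call is expected to fail for unprivileged users.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (64 * 1024)

struct pid_pair {
    pid_t outer;  /* child pid as returned to the caller of clone() */
    pid_t inner;  /* child pid as the child itself sees it */
};

/* Child entry point: write our own view of our pid to the pipe. */
int report_pid(void *arg)
{
    int wfd = *(int *)arg;
    /* Raw syscall, so no libc-level pid caching can interfere. */
    pid_t self = (pid_t)syscall(SYS_getpid);
    ssize_t n = write(wfd, &self, sizeof self);
    _exit(n == (ssize_t)sizeof self ? 0 : 1);
}

/*
 * Clone a child with extra_flags (plus SIGCHLD) and record both views
 * of its pid in *out. Returns 0 on success, -1 on failure with errno set.
 */
int clone_and_compare(int extra_flags, struct pid_pair *out)
{
    assert(out != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL) {
        close(fds[0]);
        close(fds[1]);
        errno = ENOMEM;
        return -1;
    }

    /* The stack grows downward here, so pass the top of the buffer. */
    pid_t cpid = clone(report_pid, stack + CHILD_STACK_SIZE,
                       extra_flags | SIGCHLD, &fds[1]);
    int saved_errno = errno;
    close(fds[1]);  /* parent keeps only the read end */

    if (cpid < 0) {
        close(fds[0]);
        free(stack);
        errno = saved_errno;
        return -1;
    }

    pid_t inner = 0;
    ssize_t n = read(fds[0], &inner, sizeof inner);
    close(fds[0]);
    waitpid(cpid, NULL, 0);
    free(stack);  /* no CLONE_VM: the child ran on its own copy */

    if (n != (ssize_t)sizeof inner) {
        errno = EIO;
        return -1;
    }
    out->outer = cpid;
    out->inner = inner;
    return 0;
}
```

Called with no extra flags, both values should match. Called as root with CLONE_NEWPID, the child should see itself as pid 1 while the parent sees an ordinary pid, which is the mismatch a pid-keyed lookup has to account for.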

--- Additional comment from berrange on 2010-10-18 06:39:23 EDT ---

I have confirmed that changeset 86229a5 applied to the current F13 RPM fixes the problem I see with libvirt + LXC.

Comment 1 Daniel Berrangé 2010-10-18 10:41:56 UTC
Cloned this bug from Fedora, because in RHEL-6.1 we intend to introduce systemtap/dtrace support in libvirt, and thus will hit this bug in systemtap for RHEL.

Comment 3 Frank Ch. Eigler 2010-10-18 11:55:17 UTC
Patch in hand; it should be possible either to backport it to the rhel6.0 version of stap or to pick it up in a future rebase as per bug #634995.

Comment 6 errata-xmlrpc 2011-05-19 13:54:46 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0651.html

