Description of problem:

I was testing the 'client.stp' script that is part of the libvirtd dtrace/systemtap support in

  https://bugzilla.redhat.com/show_bug.cgi?id=552387

When libvirtd starts up, the script prints out many errors:

WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x415b6f)' addr 0000000000415b6f rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x414d65)' addr 0000000000414d65 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x415b78)' addr 0000000000415b78 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x414249)' addr 0000000000414249 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x414259)' addr 0000000000414259 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x413ff2)' addr 0000000000413ff2 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x415b8e)' addr 0000000000415b8e rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x41634e)' addr 000000000041634e rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x422a20)' addr 0000000000422a20 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x4235fe)' addr 00000000004235fe rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x4239ce)' addr 00000000004239ce rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x4229e0)' addr 00000000004229e0 rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x42358b)' addr 000000000042358b rc -3
WARNING: u*probe failed libvirtd[8155] 'process("/usr/sbin/libvirtd").statement(0x42395b)' addr 000000000042395b rc -3

By inserting some fprintf + sleep calls into the libvirtd source, I determined that it seems to be the LXC driver clone() calls that trigger these problems. Specifically, this code in libvirt appears to make stap unhappy:

  static int lxcContainerDummyChild(void *argv ATTRIBUTE_UNUSED)
  {
      _exit(0);
  }

  int lxcContainerAvailable(int features)
  {
      int flags = CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|
          CLONE_NEWIPC|SIGCHLD;
      int cpid;
      char *childStack;
      char *stack;
      int childStatus;

      if (features & LXC_CONTAINER_FEATURE_USER)
          flags |= CLONE_NEWUSER;

      if (features & LXC_CONTAINER_FEATURE_NET)
          flags |= CLONE_NEWNET;

      if (VIR_ALLOC_N(stack, getpagesize() * 4) < 0) {
          DEBUG0("Unable to allocate stack");
          return -1;
      }

      childStack = stack + (getpagesize() * 4);

      cpid = clone(lxcContainerDummyChild, childStack, flags, NULL);
      VIR_FREE(stack);
      if (cpid < 0) {
          char ebuf[1024];
          DEBUG("clone call returned %s, container support is not enabled",
                virStrerror(errno, ebuf, sizeof ebuf));
          return -1;
      } else {
          waitpid(cpid, &childStatus, 0);
      }

      return 0;
  }

I expect one of those CLONE_* flags is violating some assumption that systemtap makes.

Version-Release number of selected component (if applicable):

systemtap-1.3-2.fc13.x86_64
kernel 2.6.34.6-54.fc13.x86_64
libvirt GIT latest + patches from BZ above.

How reproducible:

Always, if libvirtd has its 'LXC' driver enabled.

Steps to Reproduce:
1. In one shell run 'stap client.stp' from bz 552387
2. In another shell run /usr/sbin/libvirtd as root, again with patches from bz 552387

Actual results:

Missing probe warnings

Expected results:

No probe warnings
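For anyone who wants to poke at this without building libvirt, here is a minimal sketch that isolates just the CLONE_NEWPID part of the clone() call above. The names demo_child, demo_clone_newpid and DEMO_STACK_SIZE are made up for this example and are not from libvirt. It assumes Linux with glibc, and CLONE_NEWPID normally needs root (CAP_SYS_ADMIN), so the function returns -1 when clone() is refused. Whether this flag alone is enough to provoke the stap warnings is an assumption at this point in the report.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_STACK_SIZE (64 * 1024)   /* hypothetical size, plenty for a child that only exits */

/* Child entry point: exit status 0 if the child sees itself as pid 1. */
static int demo_child(void *arg)
{
    (void)arg;
    _exit(getpid() == 1 ? 0 : 1);
}

/*
 * Clone a child into a new pid namespace.
 * Returns  1 if the child saw itself as pid 1,
 *          0 if it saw some other pid,
 *         -1 if allocation, clone() or waitpid() failed (e.g. not root).
 * On success *outer_pid holds the child's pid as seen by the caller.
 */
static int demo_clone_newpid(pid_t *outer_pid)
{
    char *stack;
    pid_t cpid;
    int status;

    assert(outer_pid != NULL);

    stack = malloc(DEMO_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on x86, so pass the top of the buffer. */
    cpid = clone(demo_child, stack + DEMO_STACK_SIZE,
                 CLONE_NEWPID | SIGCHLD, NULL);
    if (cpid < 0) {
        free(stack);
        return -1;
    }
    *outer_pid = cpid;

    if (waitpid(cpid, &status, 0) < 0) {
        free(stack);
        return -1;
    }
    free(stack);

    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Calling demo_clone_newpid(&pid) from a small main() as root should return 1 while pid holds an ordinary, much larger number: the same task is known by two different pids depending on which namespace you ask from.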
I'm sure it's CLONE_NEWPID. The uprobes internals do lookups by pid number, which is fundamentally broken. They need to be fixed to use task_struct or struct pid as keys, or something like that.
roland advises this may be sufficient. There are two other places in stap where find_task_by_pid is used; it would be good to check whether those need changing too.

diff --git a/runtime/uprobes/uprobes.c b/runtime/uprobes/uprobes.c
index 403de18..3f76ec6 100644
--- a/runtime/uprobes/uprobes.c
+++ b/runtime/uprobes/uprobes.c
@@ -876,7 +876,7 @@ static struct task_struct *uprobe_get_task(pid_t pid)
 {
 	struct task_struct *p;
 	rcu_read_lock();
-	p = find_task_by_pid(pid);
+	p = find_task_by_pid_ns(pid, &init_pid_ns);
 	if (p)
 		get_task_struct(p);
 	rcu_read_unlock();
I made that change to /usr/share/systemtap/runtime/uprobes/uprobes.c and deleted the cached kernel module so it was rebuilt, but nothing appears to change. Is that file really still used? AFAICT, only the files in runtime/uprobes2/ are actually being compiled on my current host.
(In reply to comment #3)
> I made that change to /usr/share/systemtap/runtime/uprobes/uprobes.c and
> deleted the cached kernel module so it re-built, but nothing appears to change.
> Is that file really still used ? AFAICT, only the files in runtime/uprobes2/
> are actually being compiled on my current host.

Yes, you are right, on modern kernels only uprobes2 is used. It seems this is also not fully namespace aware (it uses find_vpid, for example, which I believe isn't namespace aware). But I am not sure what all the necessary changes are to make it so.
Here's a small update. I've reproduced this problem. I've also confirmed that specifying 'CLONE_NEWPID' (as was suspected) is the trigger. Without 'CLONE_NEWPID', the error doesn't occur.
I've fixed this upstream in several commits.

86229a5 fixes the problem for current kernels:
<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=patch;h=86229a5533de13b6ac6eeb34d9ea24e7cfb64faa>

e5a338c fixed the problem for rhel5-era kernels:
<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=patch;h=e5a338c3a2aeb1d5dfa27f4d30dd04bfd8c61ce4>

0ac3dce added a test case that tests those CLONE_* flags:
<http://sources.redhat.com/git/gitweb.cgi?p=systemtap.git;a=patch;h=0ac3dce18d9bcadff5f2f5f9274a7b40889d1d1a>

There were actually 2 related problems here:

- When CLONE_NEWPID was used, systemtap was looking for the pid in the private pid namespace, not the public one.
- When CLONE_VM was used, uprobe probes got removed in the newly cloned process.
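The first of the two problems above can be illustrated with a toy userspace model. This is not kernel or systemtap code: struct toy_task, toy_find_task and the sample table are invented for this sketch, it models only one nested namespace, and pid 8156 for the cloned child is a made-up value (8155 is the libvirtd pid from the warnings). The point is only that the same number resolves to a task in one namespace and to nothing in the other.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NS_LEVELS 2   /* level 0 = initial ("public") namespace, level 1 = nested ("private") one */

/* Toy stand-in for a task: one pid number per namespace level it is visible in. */
struct toy_task {
    const char *comm;
    int level;                 /* deepest namespace level the task lives in */
    int nr[TOY_NS_LEVELS];     /* nr[0] = global pid, nr[1] = pid inside the nested namespace */
};

/* Sample table: libvirtd in the initial namespace, its CLONE_NEWPID child in a nested one. */
static const struct toy_task toy_tasks[] = {
    { "libvirtd",  0, { 8155, 0 } },
    { "lxc-child", 1, { 8156, 1 } },   /* 8156 is hypothetical */
};
#define TOY_NTASKS (sizeof(toy_tasks) / sizeof(toy_tasks[0]))

/*
 * Look up a task by pid number as seen from the given namespace level.
 * Returns NULL when no task visible at that level has that number.
 */
static const struct toy_task *toy_find_task(const struct toy_task *tasks,
                                            size_t ntasks, int nr, int level)
{
    size_t i;

    assert(level >= 0 && level < TOY_NS_LEVELS);
    for (i = 0; i < ntasks; i++) {
        /* A task is only visible at levels up to the one it lives in. */
        if (tasks[i].level >= level && tasks[i].nr[level] == nr)
            return &tasks[i];
    }
    return NULL;
}
```

With this table, toy_find_task(toy_tasks, TOY_NTASKS, 8156, 0) finds the child, while the same number looked up at level 1 returns NULL because the child is pid 1 there. That mirrors the shape of the bug: a global pid number handed to a lookup that searches the private namespace finds nothing.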
I have confirmed that changeset 86229a5 applied to the current F13 RPM fixes the problem I see with libvirt + LXC.
systemtap 1.4 includes the above fixes, and is available in rawhide and in updates-testing for earlier Fedora releases.