Bug 872634
| Summary: | core_pattern helper not executed in same namespaces as crashing program | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Daniel Berrangé <berrange> |
| Component: | kernel | Assignee: | Neil Horman <nhorman> |
| Status: | CLOSED NEXTRELEASE | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.0 | CC: | dwalsh, nhorman |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-12-19 16:21:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Attachments: | |||
Description
Daniel Berrangé
2012-11-02 16:03:16 UTC
Created attachment 646534 [details]
[PATCH] coredump: move pipe helper into the namespace of crashing task
If coredump_pattern is set to pipe data to a user space helper, we need to reset
the namespace of the coredump pipe reader to that of the crashing process, lest
we dump core to a filesystem location or other resource not within the same
container
Signed-off-by: Neil Horman <nhorman>
---
fs/exec.c | 8 ++++++++
include/linux/binfmts.h | 1 +
2 files changed, 9 insertions(+)
Created attachment 646535 [details]
[PATCH] coredump: move pipe helper into the namespace of crashing task
Please try the above patch in your test environment and confirm that it solves the problem at hand.
Ping Daniel, any feedback here?
Sorry, the patch looks sane, but I've not had time to try to do a kernel RPM build with it in. Will update as soon as I can.
Copy that, thanks!
Hmm, I did a kernel build with this patch applied, but it doesn't appear to have had any effect on the core dump behaviour. The core helper process still sees the host's /tmp and not the container's. Did it work in your own testing?
I didn't have time to test it yet, that's why I gave you the patch. I'll take a look at it though and see if I can figure out what's going on.
Created attachment 655182 [details]
tar.gz of a busybox based root
Ok, here's an easier way to test the scenario without using the full libvirt LXC stack.
Extract the attached tar.gz into /root/, so you end up with /root/mycontainer. Then run
# systemd-nspawn -D /root/mycontainer /bin/sh
You should get a shell prompt inside the container and be able to see that your root filesystem and /tmp are different from the host's.
Now, on the host, set up the core helper as I describe in the first comment.
Then run 'sleep 10000' in the container prompt
And kill -ABRT `pgrep sleep` in the host.
The core ought to end up in /root/mycontainer/tmp, rather than /tmp
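
One way to check the outcome of the steps above is to look for the helper's output file in both places. This is only a sketch: it assumes the helper writes /tmp/core.<pid>, as the core.sh helper quoted in this bug does, and the function name `where_did_core_land` and its arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical check for the test scenario: report where the helper's
# output file ended up. Assumes the helper writes /tmp/core.<pid>.
# Usage: where_did_core_land CONTAINER_ROOT HOST_TMP PID
where_did_core_land() {
    container_root=$1
    host_tmp=$2
    pid=$3
    if [ -e "$container_root/tmp/core.$pid" ]; then
        echo container
    elif [ -e "$host_tmp/core.$pid" ]; then
        echo host
    else
        echo missing
    fi
}
```

For this scenario it would be called as `where_did_core_land /root/mycontainer /tmp <pid>`, where `<pid>` is the value the helper received as %p. That value may differ from what pgrep prints on the host if the container has its own PID namespace.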
Ok, I'm sorry, this is different than what I thought your description of the problem meant. I thought you were missing the namespace container in the helper task (which may be the case regardless); setting the fs root using chroot is something different, and should already be working.
Given the test case above, what are you setting /proc/sys/kernel/core_pattern to? I ask because this may be an abrt bug, as it uses its core_helper to connect to a daemon that may lose the chroot fs base (and IIRC it does so purposely). If you set the core_pattern to something like "|cat >/tmp/file.core" it should work just fine. Let me know what you're using as the core helper, and I'll let you know if it's all working properly or not.
This is my core helper
# cat > /core.sh <<'EOF'
#!/bin/sh
PID=$1
printenv > /tmp/core.$PID
EOF
# chmod +x /core.sh
# sysctl -w kernel.core_pattern="| /core.sh %p"
This isn't merely a chroot. What systemd-nspawn does is set up a private mount namespace (+ other namespaces) and then chroot to a new root filesystem inside that namespace. Attaching the core helper to the mount namespace should make the core helper see the new /tmp inside the container.
Ok, I've got it reproduced now, thanks.
Crud, I just remembered something. In fixing this up, I recall having a conversation about this very subject a few years ago. I still think it's a valid feature to have, but lots of people wanted the core helper to run outside of the chroot context, as it afforded more flexibility in how cores were handled. The argument was that, if you wanted to have the pipe helper write to a given child's chroot space, you could write to /proc/<pid>/root/tmp, rather than to just /tmp. I've got a working patch for this, but I'll have to augment it to allow it to be opt in (as we can't break the current behavior). Do you still want to pursue this, or is writing via the root symlink in /proc above sufficient for your needs?
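
The /proc/<pid>/root approach mentioned above could look roughly like the following from a helper running on the host. This is a sketch under assumptions: the function names `core_dest` and `save_core` are invented here, the core.<pid> file name just follows the helper quoted earlier, and the PID handed to the helper has to be one that is valid in the helper's own /proc.

```shell
#!/bin/sh
# Hypothetical helper fragment: build the path of the core file inside the
# crashing task's root, as seen from the host via /proc/<pid>/root.
# Usage: core_dest PID [PROC_DIR]   (PROC_DIR defaults to /proc)
core_dest() {
    pid=$1
    proc_dir=${2:-/proc}
    printf '%s/%s/root/tmp/core.%s\n' "$proc_dir" "$pid" "$pid"
}

# Copy the core image arriving on stdin to that location.
# Usage: save_core PID [PROC_DIR]
save_core() {
    cat > "$(core_dest "$@")"
}
```

A helper script built on this would end with `save_core "$1"`. The optional PROC_DIR argument exists only so the path logic can be tried against a scratch directory instead of the real /proc.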
What you say about being able to write coredumps via the /proc/pid/root symlink is correct, but that only takes account of the filesystem namespace. I believe we need to consider semantics for other namespaces too. In particular with the user namespace, if the core helper is running in the host OS's namespace, then it is quite possible that the user inside the container will be unable to read the resulting corefile, because UID 5 inside the container will not be equal to UID 5 in the host. If the container has separate networking and wants to send the core file over the network to a crash server, this again requires core helpers to run in container context. I can see why you came to the conclusion you did for coredump wrt traditional chroot() syscall usage, but I believe containers are a much more advanced beast, which suggests a different answer. Longer term, I believe that the kernel.core_pattern tunable will actually end up having to be made private to containers, so that the container admin has full control over core dump handling. It is hard to say which namespace this privatization would need to be associated with though. It could well be that we need to raise this whole issue on the containers mailing list for broader discussion before deciding on a solution for this bz.
Well, I'm actually with you on this. I personally think that the core_pattern sysctl should be chroot private exclusively - that is to say that the path of the core pattern helper should always be taken relative to the container root fs. If that path doesn't exist in a given container, then the core dump should fail. If you want to write to a location outside your chroot you need to set up communications to a globally running daemon that has access to the appropriate location. I think the solution I have is one that we can make work that way.
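
To make the UID mismatch raised above concrete: for user namespaces the kernel exposes the ID mapping in /proc/<pid>/uid_map, one range per line as "<first ID inside> <first ID outside> <count>". When the file is read from the host's user namespace, the second column holds host IDs. The sketch below translates a container UID to the host UID from such a file. The function name `map_uid` and the sample mapping are illustrative only.

```shell
#!/bin/sh
# Hypothetical translation of a container UID to the host UID, given a file
# in the /proc/<pid>/uid_map format: "<inside-start> <outside-start> <count>".
# Usage: map_uid UID MAPFILE   (exits non-zero if UID is not mapped)
map_uid() {
    awk -v uid="$1" '
        uid + 0 >= $1 + 0 && uid + 0 < $1 + $3 {
            # Offset within the range, applied to the outside start.
            printf "%d\n", $2 + (uid - $1)
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    ' "$2"
}
```

With a mapping line of `0 100000 65536`, `map_uid 5 <mapfile>` prints 100005. A corefile written by a host-side helper as host UID 5 would therefore not belong to the container's UID 5.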
I think what I'll do is augment the patch to create an additional sysctl that directs us to either make the core_pattern sysctl global or chroot relative (so as to maintain backward compatibility). I'll post it as an RFC, CC-ing the containers list and lkml to solicit opinions on it.
Ok, I've posted the RFC patch to lkml and the containers list (CC'ed you), let's see what feedback we get on it.
http://marc.info/?l=linux-kernel&m=135465554510510&w=2
For anyone wanting to follow
http://marc.info/?l=linux-kernel&m=135525604319158&w=2
New post, as akpm asked me to repost after 3.7 was released. Daniel, if you want to chime in and ack it, that would be great :)
Looks like a new version of this has landed in the -mm tree, I'll pull that back shortly.
Ugh, oleg has found some additional issues to go over. This will have to wait on that.
Ok, after some more discussion, I think we (hopefully) have no more work to do here. Eric Biederman just merged this changeset into Linus' tree:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=6a2b60b17b3e48a418695a94bd2420f6ab32e519
And that should be available in 3.8, which I think will make RHEL7 (if not we can backport it). These changes complete the work needed to allow a userspace process to switch all of the namespaces it is in via the setns call (not just the ipc, net and uts namespaces). So we can effectively use a global core_pattern to handle core pipes for all containers. It's just the responsibility of the core_pattern application to make the switch to the namespaces of the crashed process that it's reading from. The only work (arguably) left to do is to make core_pattern a per-namespace tunable, which is a much more tricky proposition, as there's not a particular namespace that core_pattern naturally belongs to.
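
Before a core_pattern application switches namespaces as described above, it can work out which ones actually differ from its own. A rough sketch follows, assuming a kernel where the /proc/<pid>/ns/* entries are symbolic links of the form type:[inode] (3.8 and later). The function name `differing_namespaces` and its optional arguments are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the namespaces in which process PID differs from
# the calling process, by comparing the /proc/<pid>/ns/* symlink targets.
# Usage: differing_namespaces PID [PROC_DIR] [SELF]
differing_namespaces() {
    pid=$1
    proc_dir=${2:-/proc}
    self=${3:-self}
    for link in "$proc_dir/$pid"/ns/*; do
        [ -L "$link" ] || continue
        name=${link##*/}
        theirs=$(readlink "$link")
        # A missing entry on our side counts as "different".
        ours=$(readlink "$proc_dir/$self/ns/$name" 2>/dev/null) || ours=
        [ "$theirs" = "$ours" ] || echo "$name"
    done
}
```

Run as root on the host with the crashed task's PID, it prints one namespace name per line (for example mnt or net). An empty result means the helper already shares every namespace with the crashed process. The PROC_DIR and SELF arguments are only there so the comparison can be tried against a scratch directory.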
I'll continue monitoring this, and either use this bug to backport the above changeset to RHEL7 or close it when/if we merge 3.8.
I wonder if there is a 'setns' command line utility that allows you to launch a process attached to a namespace, e.g. so you could set core_pattern to
setns %p /my/helper/program
I know there is an 'unshare' command in util-linux, but I have not found a 'setns' command yet. If one doesn't already exist, we might want to treat that as an RFE for util-linux to make this easier.
Oh, and perhaps extend the 'core' man page to mention how to make the core helper run in a namespace.
Currently there isn't any such command line utility, but there easily could be. It would work in the same vein that taskset or numactl does, in which it would run, switch namespaces as specified, and then exec the requested program.
Created attachment 666220 [details]
sample program to run setns from the command line
Here's a quick and dirty (read: untested) setns command line program. You still need root privs to run it (although that will go away in RHEL7). But it allows you to specify the ipc/net/uts namespace of the process you want to migrate into. That, coupled with the /proc/pid/root symlink, should let you migrate core-pattern to whatever namespace you want. When Eric's new patch set gets pulled into RHEL7, this utility can be enhanced to migrate us to the pid and mount namespaces as well.
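
For reference, util-linux later gained nsenter(1), which fills the role of the setns command discussed here: it joins the namespaces of a target PID and then execs the given program. The following is a minimal sketch of a wrapper built on it, not the attached program. It assumes nsenter is installed. The function name `ns_helper_cmd` is hypothetical, and /my/helper/program is the placeholder path from the example earlier in this bug.

```shell
#!/bin/sh
# Hypothetical wrapper logic for core_pattern: print the nsenter command that
# would start the real helper inside the crashing process's namespaces.
# Requires nsenter(1) from util-linux and root privileges when actually run.
# Usage: ns_helper_cmd PID HELPER [ARGS...]
ns_helper_cmd() {
    pid=$1
    shift
    # -t PID: target process; -m mount, -u uts, -i ipc, -n net, -p pid
    echo nsenter -t "$pid" -m -u -i -n -p "$@"
}
```

`ns_helper_cmd 1234 /my/helper/program` prints `nsenter -t 1234 -m -u -i -n -p /my/helper/program`. A real wrapper would replace `echo` with `exec`, so the helper inherits the core image on stdin. Since the mount namespace is joined before the exec, the helper path should then be resolved inside the container.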