Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 872634

Summary: core_pattern helper not executed in same namespaces as crashing program
Product: Red Hat Enterprise Linux 7
Component: kernel
Version: 7.0
Status: CLOSED NEXTRELEASE
Severity: unspecified
Priority: unspecified
Reporter: Daniel Berrangé <berrange>
Assignee: Neil Horman <nhorman>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: dwalsh, nhorman
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-12-19 16:21:08 UTC
Attachments (ID, description, flags):
  646534  [PATCH] coredump: move pipe helper into the namespace of crashing task (flags: none)
  646535  [PATCH] coredump: move pipe helper into the namespace of crashing task (flags: none)
  655182  tar.gz of a busybox based root (flags: none)
  666220  sample program to run setns from the command line (flags: none)

Description Daniel Berrangé 2012-11-02 16:03:16 UTC
Description of problem:
When a process inside a container crashes, the core_pattern helper program is not executed in the same namespace(s) as the crashing program. Instead it is executed in the primary host OS namespace(s). This means core files are saved in the wrong root directory, and the helper runs with the wrong user/group IDs/etc.

Version-Release number of selected component (if applicable):
kernel-3.6.0-0.28.el7

How reproducible:
Always

Steps to Reproduce:

1. Create a script in the host filesystem

 # cat > /core.sh <<'EOF'
 #!/bin/sh
 PID=$1
 printenv > /tmp/core.$PID
 EOF
 # chmod +x /core.sh

2. Setup core_pattern

  # sysctl -w kernel.core_pattern="| /core.sh %p"

3. Start a container with a private root filesystem
4. Enter the container console
5. Enable core dumps
  # ulimit -c unlimited
6. Run a program
  # sleep 1000 &
7. Kill the program

  # kill -SEGV %1
  [1] + Segmentation fault (core dumped) sleep 1000

Actual results:
Script output is written to /tmp/core.5 in the *host* /tmp

Expected results:
Script output is written to /tmp/core.5 in the *container* /tmp


Additional info:

Comment 1 Neil Horman 2012-11-16 18:55:59 UTC
Created attachment 646534
[PATCH] coredump: move pipe helper into the namespace of crashing task


If core_pattern is set to pipe data to a user space helper, we need to reset
the namespace of the coredump pipe reader to that of the crashing process, lest
we dump core to a filesystem location or other resource not within the same
container.

Signed-off-by: Neil Horman <nhorman>
---
 fs/exec.c               | 8 ++++++++
 include/linux/binfmts.h | 1 +
 2 files changed, 9 insertions(+)

Comment 2 Neil Horman 2012-11-16 18:58:26 UTC
Created attachment 646535
[PATCH] coredump: move pipe helper into the namespace of crashing task


If core_pattern is set to pipe data to a user space helper, we need to reset
the namespace of the coredump pipe reader to that of the crashing process, lest
we dump core to a filesystem location or other resource not within the same
container.

Signed-off-by: Neil Horman <nhorman>
---
 fs/exec.c               | 8 ++++++++
 include/linux/binfmts.h | 1 +
 2 files changed, 9 insertions(+)

Comment 3 Neil Horman 2012-11-16 19:12:03 UTC
Please try the above patch in your test environment and confirm that it solves the problem at hand.

Comment 4 Neil Horman 2012-11-30 00:54:10 UTC
Ping Daniel, any feedback here?

Comment 5 Daniel Berrangé 2012-11-30 08:15:49 UTC
Sorry, the patch looks sane, but I've not had time to do a kernel RPM build with it applied. Will update as soon as I can.

Comment 6 Neil Horman 2012-11-30 14:26:32 UTC
copy that, thanks!

Comment 7 Daniel Berrangé 2012-11-30 17:33:49 UTC
Hmm, I did a kernel build with this patch applied, but it doesn't appear to have had any effect on the core dump behaviour. The core helper process still sees the host's /tmp and not the container's. Did it work in your own testing ?

Comment 8 Neil Horman 2012-11-30 18:35:19 UTC
I didn't have time to test it yet, that's why I gave you the patch. I'll take a look at it though and see if I can figure out what's going on.

Comment 9 Daniel Berrangé 2012-11-30 18:41:53 UTC
Created attachment 655182
tar.gz of a busybox based root

Ok, here's an easier way to test the scenario without using the full libvirt LXC stack.

Extract the attached tar.gz into /root/, so you end up with /root/mycontainer. Then run

  # systemd-nspawn -D /root/mycontainer /bin/sh

You should get a shell prompt inside the container, and be able to see that your root filesystem and /tmp are different from the host's.

Now, on the host, set up the core helper as I described in the Description above.

Then run 'sleep 10000' at the container prompt.

And run  kill -ABRT `pgrep sleep`  on the host.

The core ought to end up in /root/mycontainer/tmp, rather than /tmp

Comment 10 Neil Horman 2012-11-30 19:05:03 UTC
Ok, I'm sorry, this is different from what I thought your description of the problem meant. I thought you were missing the container's namespaces in the helper task (which may be the case regardless); setting the fs root using chroot is something different, and should already be working.

Given the test case above, what are you setting /proc/sys/kernel/core_pattern to?  I ask because this may be an abrt bug, as it uses its core helper to connect to a daemon that may lose the chroot fs base (and IIRC it does so purposely).  If you set the core_pattern to something like "|cat >/tmp/file.core" it should work just fine.  Let me know what you're using as the core helper, and I'll let you know if it's all working properly or not.

Comment 11 Daniel Berrangé 2012-11-30 19:23:31 UTC
This is my core helper

 # cat > /core.sh <<'EOF'
 #!/bin/sh
 PID=$1
 printenv > /tmp/core.$PID
 EOF
 # chmod +x /core.sh


 # sysctl -w kernel.core_pattern="| /core.sh %p" 

This isn't merely a chroot. What systemd-nspawn does is set up a private mount namespace (plus other namespaces) and then chroot to a new root filesystem inside that namespace.

Attaching the core helper to the mount namespace should make the core helper see the new /tmp inside the container.
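
To confirm whether the helper really shares the crashing task's mount namespace, the helper can compare the namespace links under /proc. The following is only a sketch: it assumes a kernel that exposes /proc/<pid>/ns/mnt (per namespaces(7), Linux 3.8 or later), and the function name same_ns and the log path /tmp/core-ns.<pid> are made up for illustration.

```shell
#!/bin/sh
# Hypothetical diagnostic core_pattern helper; $1 is the crashing PID (%p).

# same_ns <type> <pid-a> <pid-b>
# Succeeds when both tasks are in the same namespace of the given type
# (mnt, net, uts, ipc, ...). Returns 2 if a link cannot be read.
same_ns() {
    a=$(readlink "/proc/$2/ns/$1") || return 2
    b=$(readlink "/proc/$3/ns/$1") || return 2
    [ "$a" = "$b" ]
}

# Only act when invoked with a PID, so the file is safe to source.
if [ "$#" -ge 1 ]; then
    if same_ns mnt self "$1"; then verdict=same; else verdict=different; fi
    echo "helper vs pid $1: $verdict mount namespace" > "/tmp/core-ns.$1"
fi
```

With the behaviour reported in this bug, the log would be expected to say "different" and to land in the host's /tmp.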

Comment 12 Neil Horman 2012-12-03 16:12:50 UTC
ok, I've got it reproduced now, thanks

Comment 13 Neil Horman 2012-12-03 21:04:30 UTC
Crud, I just remembered something.  In fixing this up, I recall having a conversation about this very subject a few years ago.  I still think it's a valid feature to have, but lots of people wanted the core helper to run outside of the chroot context, as it afforded more flexibility in how cores were handled.  The argument was that, if you wanted to have the pipe helper write to a given child's chroot space, you could write to /proc/<pid>/root/tmp, rather than to just /tmp.

I've got a working patch for this, but I'll have to augment it to allow it to be opt in (as we can't break the current behavior).  Do you still want to pursue this, or is writing via the root symlink in /proc above sufficient for your needs?
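
For reference, the /proc/<pid>/root approach could look roughly like the helper below. This is a minimal sketch, not a tested tool: the script name /core-root.sh and the function core_dest are hypothetical, and it assumes the helper runs as root on the host with the crashing PID passed via %p.

```shell
#!/bin/sh
# Hypothetical helper, registered on the host with something like:
#   sysctl -w kernel.core_pattern="|/core-root.sh %p"
# The kernel passes the crashing PID as $1 and the core image on stdin.

# Build the destination path inside the crashing task's root filesystem.
core_dest() {
    printf '/proc/%s/root/tmp/core.%s\n' "$1" "$1"
}

# Only act when invoked with a PID, so the file is safe to source.
if [ "$#" -ge 1 ]; then
    cat > "$(core_dest "$1")"
fi
```

This only redirects the file through the crashing task's root directory; the helper itself still runs with the host's user, network and other namespaces.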

Comment 14 Daniel Berrangé 2012-12-04 09:24:56 UTC
What you say about being able to write coredumps via the /proc/pid/root symlink is correct, but that's only taking account of the filesystem namespace. I believe we need to consider semantics for other namespaces too. In particular with the user namespace, if the core helper is running with the host OS' namespace, then it is quite possible that the user inside the container will be unable to read the resulting corefile, because UID 5 inside the container will not be equal to UID 5 in the host. If the container has separate networking and wants to send the core file over the network to a crash server, this again requires core helpers to run in container context.

I can see why you came to the conclusion you did for coredump wrt traditional chroot() syscall usage, but I believe containers are a much more advanced beast, which suggests a different answer.

Longer term, I believe that the kernel.core_pattern tunable will actually end up having to be made private to containers, so that the container admin has full control over core dump handling. It is hard to say which namespace this privatization would need to be associated with though.

It could well be that we need to raise this whole issue on the containers mailing list for broader discussion before deciding on a solution for this bz.

Comment 15 Neil Horman 2012-12-04 14:58:50 UTC
Well, I'm actually with you on this.  I personally think that the core_pattern sysctl should be chroot private exclusively, that is to say that the path of the core pattern helper should always be taken relative to the container root fs. If that path doesn't exist in a given container, then the core dump should fail.  If you want to write to a location outside your chroot, you need to set up communications to a global running daemon that has access to the appropriate location.  I think the solution I have is one that we can make work that way.  I think what I'll do is augment the patch to create an additional sysctl that directs us to either make the core_pattern sysctl global or chroot relative (so as to maintain backward compatibility).  I'll post it as an RFC, CC-ing the containers list and lkml to solicit opinions on it.

Comment 16 Neil Horman 2012-12-04 21:19:03 UTC
Ok, I've posted the RFC patch to lkml and the containers list (CC'ed you); let's see what feedback we get on it.

Comment 17 Neil Horman 2012-12-04 21:19:34 UTC
http://marc.info/?l=linux-kernel&m=135465554510510&w=2
For anyone wanting to follow

Comment 18 Neil Horman 2012-12-11 20:15:10 UTC
http://marc.info/?l=linux-kernel&m=135525604319158&w=2

New post, as akpm asked me to repost after 3.7 was released.  Daniel, if you want to chime in and ack it, that would be great :)

Comment 19 Neil Horman 2012-12-15 16:38:36 UTC
Looks like a new version of this has landed in the -mm tree; I'll pull that back shortly.

Comment 20 Neil Horman 2012-12-17 15:21:53 UTC
Ugh, Oleg has found some additional issues to go over.  This will have to wait on that.

Comment 21 Neil Horman 2012-12-19 13:29:41 UTC
Ok, after some more discussion, I think we (hopefully) have no more work to do here.

Eric Biederman just merged this changeset into Linus' tree:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=6a2b60b17b3e48a418695a94bd2420f6ab32e519

And that should be available in 3.8, which I think will make RHEL7 (if not, we can backport it).

These changes complete the work needed to allow a userspace process to change all the namespaces it's in via the setns call (not just the net, uts and mount namespaces).  So we can effectively use a global core_pattern to handle core pipes for all containers.  It's just the responsibility of the core_pattern application to make the switch to the namespaces of the crashed process that it's reading from.

The only work (arguably) left to do is to make core_pattern a per-namespace tunable, which is a much more tricky proposition, as there's not a particular namespace that core_pattern naturally belongs to.

I'll continue monitoring this, and either use this bug to backport the above changeset to RHEL7 or close it when/if we merge 3.8.

Comment 22 Daniel Berrangé 2012-12-19 13:38:24 UTC
I wonder if there is a 'setns' command line utility that allows you to launch a process attached to another process's namespaces, e.g. so you could set core_pattern to

   setns %p /my/helper/program

I know there is an 'unshare' command in util-linux, but I've not found a 'setns' command yet. If one doesn't already exist, we might want to treat that as an RFE for util-linux to make this easier. Oh, and perhaps extend the 'core' man page to mention how to make the core helper run in a namespace.

Comment 23 Neil Horman 2012-12-19 14:53:10 UTC
Currently there isn't any such command line utility, but there easily could be.  It would work in the same vein that taskset or numactl does, in which it would run, switch namespaces as specified, and then exec the requested program.
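
For readers finding this later: util-linux subsequently gained an nsenter(1) command that works in this run, switch, exec style. The wrapper below is a hedged sketch of how a helper might use it. The function name run_in_ns and the DRY_RUN switch are made up for illustration, and /my/helper/program is the placeholder from Comment 22; it assumes nsenter is installed and the helper runs as root.

```shell
#!/bin/sh
# run_in_ns <pid> <program> [args...]
# Enter the mount, uts, ipc and net namespaces of <pid>, then exec <program>.
# Setting DRY_RUN to a non-empty value prints the command instead of running it.
run_in_ns() {
    pid=$1
    shift
    ${DRY_RUN:+echo} nsenter --target "$pid" --mount --uts --ipc --net -- "$@"
}

# When used as a core_pattern helper, $1 is the crashing PID (%p) and the
# core image arrives on stdin, which the exec'ed program inherits.
if [ "$#" -ge 1 ]; then
    run_in_ns "$1" /my/helper/program "$1"
fi
```

Which namespaces can be entered depends on the running kernel, so the option list may need trimming on older kernels.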

Comment 24 Neil Horman 2012-12-19 16:20:35 UTC
Created attachment 666220
sample program to run setns from the command line

Here's a quick and dirty (read: untested) setns command line program.  You still need root privs to run it (although that will go away in RHEL7).  But it allows you to specify the ipc/net/uts namespace of the process you want to migrate into.  That, coupled with the /proc/pid/root symlink, should let you migrate the core_pattern helper to whatever namespace you want.  When Eric's new patch set gets pulled into RHEL7, this utility can be enhanced to migrate us to the pid and mount namespaces as well.