566460 – kernel 2.6.33 strips coredump when using pipe in core_pattern

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	13
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Neil Horman
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	F13Alpha, F13AlphaBlocker

Reported:	2010-02-18 14:41 UTC by Jiri Moskovcak
Modified:	2015-02-01 22:51 UTC
CC List:	12 users
Fixed In Version:	kernel-2.6.33-0.47.rc8.git1.fc13
Clone Of:
Environment:
Last Closed:	2010-02-24 19:14:31 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
patch to skip uid check in do_coredump (588 bytes, patch) 2010-02-22 17:53 UTC, Neil Horman

Description Jiri Moskovcak 2010-02-18 14:41:16 UTC

Description of problem:
The new kernel strips the coredump when there is a pipe in core_pattern.

Version-Release number of selected component (if applicable):
2.6.33

How reproducible:
100%

Steps to Reproduce:
1. install a simple hook to core_pattern
2. set core_pipe_limit to != 0 (I use 4)
3. kill some app with SEGV
4. the hook is invoked, but the saved coredump has 0 size
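
A minimal sketch of the kind of hook meant in step 1, assuming a POSIX shell. The function name save_core and the output path are hypothetical and not part of ABRT. The kernel writes the core image to the helper's standard input, so the hook copies stdin to a file and reports how many bytes arrived, which makes the zero-size symptom from step 4 easy to spot:

```shell
#!/bin/sh
# Hypothetical helper: copy the core image arriving on stdin to "$1"
# and print the number of bytes that were received.
save_core() {
    out="$1"
    cat > "$out" || return 1
    # wc -c may pad its output with spaces on some systems; strip them.
    wc -c < "$out" | tr -d ' '
}
```

A wrapper script ending in something like `save_core /tmp/newcore | logger -t core-hook` could then be registered by writing `|/path/to/wrapper` to /proc/sys/kernel/core_pattern; the wrapper path and the logger tag are placeholders.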
  
Actual results:
empty coredump

Expected results:
non-empty coredump

Additional info:
I'm in the middle of testing various kernel versions (plus patches) and will post the results shortly to help narrow this down.

Comment 1 Jiri Moskovcak 2010-02-18 15:13:40 UTC

My test results: packages were taken from Fedora CVS and built in Koji (the .32 kernel in Brew).

== 2.6.32 - without umh-refactor patch: ==

$ cat /proc/sys/kernel/core_pipe_limit 
4

$ cat /proc/sys/kernel/core_pattern 
|/usr/libexec/abrt-hook-ccpp /var/cache/abrt %p %s %u %c

result: hook was able to write the coredump

== 2.6.32 - with umh-refactor patch ==
$ cat /proc/sys/kernel/core_pipe_limit 
4

$ cat /proc/sys/kernel/core_pattern 
|/usr/libexec/abrt-hook-ccpp /var/cache/abrt %p %s %u %c

- no coredump gets to helper
- setting ulimit -c doesn't help

== kernel-2.6.33-0.47.rc8.git1 ==
0 size coredump, ulimit -c doesn't help

== kernel-2.6.33-0.47.rc8.git1 without-umh-refactor ==
works fine when ulimit -c is set
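
When comparing kernels like this, it can help to confirm that the configured pattern really is a pipe pattern before drawing conclusions. A small sketch, assuming a POSIX shell; is_pipe_pattern is a hypothetical helper name. A core_pattern whose first character is `|` is handed to a user-space helper instead of being written out as a file:

```shell
# Hypothetical helper: succeed if the given core_pattern string
# starts with '|', i.e. the core is piped to a helper program.
is_pipe_pattern() {
    case "$1" in
        "|"*) return 0 ;;
        *)    return 1 ;;
    esac
}
```

Typical use would be `is_pipe_pattern "$(cat /proc/sys/kernel/core_pattern)" && echo "piped"`.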

Comment 2 James Laska 2010-02-18 17:04:23 UTC

Adding to the F13Alpha blocker list for review at the next blocker review meeting.  Jiri noted on IRC that this affects only C/C++ program failures.  Python and kerneloops failures will still be caught.

The Alpha release criteria [1] do not explicitly call out that ABRT must be able to capture and report failures to Bugzilla, but a similar criterion exists for the installer.

[1] https://fedoraproject.org/wiki/Fedora_13_Alpha_Release_Criteria

Comment 3 Neil Horman 2010-02-18 18:19:59 UTC

I'm pretty sure this was introduced with Andi Kleen's work that I sucked in with the umh-refactor patch.  I just tried it on the latest -mm and get the same results.

Comment 4 Karel Klíč 2010-02-19 20:23:56 UTC

I looked at umh-refactor.patch. I am just guessing how it might work, so please ignore this if I am completely wrong.

umh_pipe_setup() in exec.c calls create_write_pipe() and create_read_pipe(). Those calls were previously done in "main" thread, but now umh_pipe_setup() is called in ____call_usermodehelper thread (the number of underscores in the name is important here).

__call_usermodehelper() in kmod.c calls kernel_thread(____call_usermodehelper), and in the case of pipes it is called _without_ CLONE_FS and CLONE_FILES flags. Previously that worked, because the pipes were created in the main thread, and the child process inherited a copy of them. Now that does not work, because the pipes are created in the child thread, and that does NOT affect the main thread, which dumps the core. The core is not written to the write side of the pipe.

So I would try to add CLONE_FILES and CLONE_FS flags to the second kernel_thread() call in the __call_usermodehelper() function in kmod.c.
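
The descriptor-sharing argument can be illustrated with a loose user-space analogy. This is a sketch under stated assumptions, not kernel code: a subshell is a forked child with its own copy of the descriptor table, much like a thread created without CLONE_FILES, so a descriptor it opens never appears in the parent. The function name and the choice of descriptor 9 are arbitrary:

```shell
# Hypothetical demo: does a descriptor opened in a child show up in the parent?
child_fd_reaches_parent() {
    exec 9>&-                  # make sure fd 9 starts out closed in the parent
    ( exec 9>/dev/null )       # the child opens fd 9 in its own descriptor table
    # Probe fd 9 from the parent side. The probe runs in a subshell so that
    # a failed redirection cannot terminate the calling shell.
    if ( : >&9 ) 2>/dev/null; then
        echo "visible"
    else
        echo "not visible"
    fi
}
```

Calling `child_fd_reaches_parent` prints `not visible`: whatever the child opened stayed in the child.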

Comment 5 Neil Horman 2010-02-19 20:33:10 UTC

Good analysis, but I'm not sure it's accurate, given that the whole setup works properly as long as we don't set core_pipe_limit to a non-zero value.  I'm not sure what the interaction there is.

Comment 6 Neil Horman 2010-02-19 21:01:39 UTC

Hmm, this is odd. I thought I had re-created the problem upstream, but now that I try it with the latest -mm the problem seems gone.  I'm going to re-install with the latest rawhide kernel and debug from there.

Comment 7 Neil Horman 2010-02-21 18:25:37 UTC

So, I figured out how I reproduced this previously: I was testing with abrt specifically.  I just tried the latest upstream -mm tree and rawhide with a simplified core collector, and everything is working fine:

cat /usr/bin/catch_core
#!/bin/sh
/usr/bin/logger -s "SLEEPING"
sleep 10

/usr/bin/logger -s "CATCHING CORE"

cat >> /tmp/newcore
#### End /usr/bin/catch_core

echo "|/usr/bin/catch_core" > /proc/sys/kernel/core_pattern
echo 4 > /proc/sys/kernel/core_pipe_limit

If I crash a process with this setup, I consistently get a core file in /tmp/newcore that is full sized and recognizable to crash.
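
One rough way to check that the collected file is a real core rather than an empty or truncated one is to look at its leading ELF magic bytes. A sketch, assuming a POSIX shell and that the core was saved to /tmp/newcore as above; looks_like_elf is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the file starts with the ELF magic
# bytes 0x7f 'E' 'L' 'F'. An empty or missing file fails the check.
looks_like_elf() {
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}
```

For example, `looks_like_elf /tmp/newcore && echo "core header looks valid"`. This only inspects the header, so it catches the zero-size case but does not prove the dump is complete.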

This leads me to believe that the problem is in ABRT.

Comment 8 Jiri Moskovcak 2010-02-21 22:55:06 UTC

I tried your script with these results: if I crash something as root I get a full coredump, but if I try it as non-root, I get a zero-size coredump. The same applies to ABRT's hook.
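
A sketch for repeating the root versus non-root comparison, assuming a POSIX shell; crash_child is a hypothetical helper. It starts a throwaway shell that sends SIGSEGV to itself, so the kernel invokes the configured core_pattern helper for that child:

```shell
# Hypothetical helper: crash a child process with SIGSEGV and print the
# exit status that the parent shell observed.
crash_child() {
    # $$ inside the single quotes expands in the child, so it kills itself.
    sh -c 'kill -SEGV $$' 2>/dev/null
    echo "$?"
}
```

Running `crash_child` once from a root shell and once from an unprivileged shell, then comparing the size of the collected core each time, repeats the comparison described above. A printed status above 128 means the child died from a signal.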

J.

Comment 9 Neil Horman 2010-02-22 14:10:55 UTC

Dang it, apparently yes. You're supposed to need to run the core collector as root (i.e. suid), but apparently that's not working now either.

Comment 10 Neil Horman 2010-02-22 17:53:49 UTC

Created attachment 395526
patch to skip uid check in do_coredump

Found the problem.  An additional check in do_coredump tests the value of the process uid against the fsuid to make sure they match.  That's relevant for files (to prevent ownership hacks and stealing of information out of cores), but irrelevant for pipes.  This patch fixes the issue.
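
The logic of the fix can be modelled in a few lines. This is a toy shell model under stated assumptions, not the kernel code or the attached patch: dump_allowed and its three arguments are hypothetical, and the two uid arguments simply stand for the two values the check compares. The comparison only runs when the dump target is a regular file:

```shell
# Toy model (not kernel code): decide whether a dump may proceed.
# $1 and $2 are the two uids being compared, $3 is "file" or "pipe".
dump_allowed() {
    uid="$1"; fsuid="$2"; target="$3"
    # Per the comment above, the uid check matters for files but is
    # irrelevant for pipes, so it is skipped for a pipe target.
    [ "$target" = "pipe" ] && return 0
    [ "$uid" = "$fsuid" ]
}
```

In this model `dump_allowed 0 500 pipe` succeeds while `dump_allowed 0 500 file` fails, which mirrors the behaviour the patch aims for: a uid mismatch no longer blocks a dump that goes to a pipe.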

Comment 11 Neil Horman 2010-02-22 17:57:03 UTC

Committed to rawhide.  I'll need to send this to -mm as well.
