Bug 759213 - Truncated core dumps generated when piped through custom hook
Summary: Truncated core dumps generated when piped through custom hook
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Oleg Nesterov
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 678807 (view as bug list)
Depends On:
Blocks: 730169 773215
TreeView+ depends on / blocked
 
Reported: 2011-12-01 17:12 UTC by Richard Marko
Modified: 2016-02-01 02:22 UTC (History)
14 users (show)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
: 773215 (view as bug list)
Environment:
Last Closed: 2013-05-17 13:58:44 UTC
Type: ---


Attachments (Terms of Use)
ls showing truncated core dumps (394 bytes, text/plain)
2011-12-01 17:14 UTC, Richard Marko
no flags Details
ls showing truncated core dumps (569 bytes, text/plain)
2011-12-01 17:15 UTC, Richard Marko
no flags Details
ls when using core.%p (337 bytes, text/plain)
2011-12-01 17:17 UTC, Richard Marko
no flags Details
patch postgres to cause problem at issue (52 bytes, text/plain)
2011-12-01 17:18 UTC, Richard Marko
no flags Details
patch postgres to cause problem at issue (1.82 KB, text/plain)
2011-12-01 17:19 UTC, Richard Marko
no flags Details
patch to point signal_pending to reader task (1.36 KB, patch)
2011-12-02 01:10 UTC, Neil Horman
no flags Details | Diff
updated patch (1.65 KB, patch)
2011-12-02 13:59 UTC, Neil Horman
no flags Details | Diff
[patch] suppress the signals during the coredump (1.44 KB, patch)
2011-12-02 18:41 UTC, Oleg Nesterov
no flags Details | Diff
[patch v2] suppress the signals during the coredump (1.44 KB, patch)
2011-12-02 18:46 UTC, Oleg Nesterov
no flags Details | Diff

Description Richard Marko 2011-12-01 17:12:44 UTC
Description of problem:
When two or more crashes happen at the same time (within one second) their core dumps are truncated when handled with custom hook.

$ cat /proc/sys/kernel/core_pattern
|/tmp/zzz/simple.sh %s %c %p %u %g %t e

simple.sh:

#!/bin/sh
logger "$0: $3"
exec cat > /tmp/zzz/mycore.$3


Version-Release number of selected component (if applicable):
3.1.1-1.fc16.x86_64

How reproducible:
Always

Steps to Reproduce:
1. install postgresql-9.1rc1
2. patch it to produce concurrent crashes
3. use simple.sh as core_pattern
4. run make installcheck-parallel
  
Actual results:
BFD: Warning: /tmp/zzz/new/mycore.12516 is truncated: expected core file size >= 35057664, found: 15433728.

Expected results:
No truncated cores

Additional info:
 - This doesn't happen when using core.%p as core_pattern
 - I'm unable to reproduce the same thing with smaller binaries (sleep, top)

Comment 1 Richard Marko 2011-12-01 17:14:22 UTC
Created attachment 539296 [details]
ls showing truncated core dumps

Comment 2 Richard Marko 2011-12-01 17:15:55 UTC
Created attachment 539297 [details]
ls showing truncated core dumps

Comment 3 Richard Marko 2011-12-01 17:17:42 UTC
Created attachment 539298 [details]
ls when using core.%p

Comment 4 Richard Marko 2011-12-01 17:18:22 UTC
Created attachment 539299 [details]
patch postgres to cause problem at issue

Comment 5 Richard Marko 2011-12-01 17:19:17 UTC
Created attachment 539300 [details]
patch postgres to cause problem at issue

Comment 6 Richard Marko 2011-12-01 17:23:11 UTC
Further steps for reproducing with postgresql:

/usr/local/pgsql/bin/initdb -D $PGDATA
/usr/local/pgsql/bin/postgres -D $PGDATA > logfile 2>&1
make installcheck-parallel

Comment 8 Josh Boyer 2011-12-01 19:39:47 UTC
That script seems overly simple.  And I'm not sure why one would desire to use something like that anyway, but it's fairly immaterial to the problem I guess.

If you take a look here:

http://www.mail-archive.com/xorg-devel@lists.x.org/msg04573.html

you'll see that signals delivered to the helper process during dump can cause truncated core dumps.  Since the script isn't doing anything with signals at all, it's certainly plausible that a signal is pending and causing this.

You might try adding:

trap '' SIGHUP SIGALRM SIGSEGV SIGABRT SIGINT

etc to the top of the script to ignore the signals that can be ignored and see if that helps.

Comment 9 Josh Boyer 2011-12-01 20:02:59 UTC
Oleg and Neil, both of you have worked in the coredump and coredump helper code before (at least according to git).

Do either of you have any comments on the potential issues with using a pipe helper and getting truncated core files?  The "interrupted by signal" issue is pretty clear in the kernel code, but I don't know if there are other potential causes.

Comment 10 Tom Lane 2011-12-01 23:34:36 UTC
(In reply to comment #8)
> ... signals delivered to the helper process during dump can cause
> truncated core dumps.  Since the script isn't doing anything with signals at
> all, it's certainly plausible that a signal is pending and causing this.

Well, that's plausible I guess, but then the question becomes what's signaling the helper process and why.

It's entirely likely that Postgres would be attempting to send signals to the now-crashed process.  Is it possible that the kernel is delivering those to the coredump helper?  If so, I'd argue that that's a bug.

Comment 11 Neil Horman 2011-12-02 00:17:03 UTC
I can't be certain, but I think there is a bug in the wait_for_dump_helpers code in the do_coredump path.  We're checking signal_pending(current), but IIRC the intent is to make sure that the core dump helper is making forward progress but making sure the the SIGIO we send to it every loop iteration is getting handled.  The problem is that current points to the crashing process, not the reading process, so if the crashing process gets a signal delivered from somewhere, we drop out of that loop before the pipe reader is done doing its work.  I'll write a patch for this in the AM for you to test.

Comment 12 Neil Horman 2011-12-02 01:10:39 UTC
Created attachment 539460 [details]
patch to point signal_pending to reader task

Here you go, untested as of yet, but I'll build it in the morning.

Comment 13 Richard Marko 2011-12-02 12:34:22 UTC
(In reply to comment #8)
> You might try adding:
> 
> trap '' SIGHUP SIGALRM SIGSEGV SIGABRT SIGINT
> 
> etc to the top of the script to ignore the signals that can be ignored and see
> if that helps.

Trapping the signals doesn't help. Same results.

Comment 14 Neil Horman 2011-12-02 13:59:40 UTC
Created attachment 539650 [details]
updated patch

updated patch that fixes a missing hunk in the last version.  Build is running here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3557610

Please test and confirm if this fixes the problem for you.

Comment 15 Oleg Nesterov 2011-12-02 16:31:54 UTC
(In reply to comment #14)
>
> Created attachment 539650 [details]
> updated patch

Doesn't look correct. signal_pending(cp->reader_task) is not
what we want.

This is the known problem, should be fixed upstream. It is not
that trivial. See http://marc.info/?l=linux-kernel&m=131989970411759
and the whole thread.

Comment 16 Josh Boyer 2011-12-02 17:40:58 UTC
(In reply to comment #15)
> (In reply to comment #14)
> >
> > Created attachment 539650 [details]
> > updated patch
> 
> Doesn't look correct. signal_pending(cp->reader_task) is not
> what we want.
> 
> This is the known problem, should be fixed upstream. It is not
> that trivial. See http://marc.info/?l=linux-kernel&m=131989970411759
> and the whole thread.

By "should be fixed upstream" do you mean "it still needs to be fixed upstream"?  I looked upstream but don't see anything that stands out as having fixed this issue.

Oh, and the thread is rather short basically ending with the message you point out.  Was there anything further in another thread?

Comment 17 Neil Horman 2011-12-02 18:00:18 UTC
Oleg, why doesn't it look correct?  IIRC the signal_pending check was there to ensure that the SIGIO we send in the loop gets handled.  The only problem is that we're sending the signal to the reader task, but we're checking for pending on the writer task.

Comment 18 Oleg Nesterov 2011-12-02 18:02:02 UTC
(In reply to comment #16)
>
> > This is the known problem, should be fixed upstream. It is not
> > that trivial. See http://marc.info/?l=linux-kernel&m=131989970411759
> > and the whole thread.
> 
> By "should be fixed upstream" do you mean "it still needs to be fixed
> upstream"?

Yes. I'll try to fix this in 3.3.

> Was there anything further in another thread?

There was another short thread http://marc.info/?t=131983286300002

Plus this was discussed before... But I that message I point
accurately summarises the problems and the possible solutions.
I hope.

Comment 19 Oleg Nesterov 2011-12-02 18:10:36 UTC
(In reply to comment #17)
> IIRC the signal_pending check was there to
> ensure that the SIGIO we send in the loop gets handled.

No, no. From http://marc.info/?l=linux-kernel&m=131989970411759

     here are 2 reasons. if signal_pending() == T then:

	- pipe_wait() is pointless, it won't block. We do not want
	  a busywait loop.

	- And probably even wait_for_dump_helpers() is pointless,
	  it is quite possible that pipe_write() already failed
	  before and the reader doesn't know this.

> The only problem is
> that we're sending the signal to the reader task, but we're checking for
> pending on the writer task.

The reader should take care of signals itself. How (and why?) the
kernel should help?

I'll try to make the debugging patch today...

Comment 20 Oleg Nesterov 2011-12-02 18:20:45 UTC
Forgot to mention...

(In reply to comment #17)
>
> ensure that the SIGIO we send in the loop gets handled.

Please note that we do not actually send SIGIO.

Of course, unless the reader does F_SETOWN/F_SETSIG.

Comment 21 Oleg Nesterov 2011-12-02 18:41:46 UTC
Created attachment 539731 [details]
[patch] suppress the signals during the coredump

Suppress the signals during the coredump.

This is the wrong/incomplete hack, only for testing.

Comment 22 Oleg Nesterov 2011-12-02 18:46:30 UTC
Created attachment 539733 [details]
[patch v2] suppress the signals during the coredump

Suppress the signals during the coredump.

This is the wrong/incomplete hack, only for testing.

v2: check signal_pending() later, after wait_for...

Comment 23 Josh Boyer 2011-12-05 16:34:37 UTC
(In reply to comment #22)
> Created attachment 539733 [details]
> [patch v2] suppress the signals during the coredump
> 
> Suppress the signals during the coredump.
> 
> This is the wrong/incomplete hack, only for testing.
> 
> v2: check signal_pending() later, after wait_for...

Richard, do you need a scratch kernel built to test with this, or can you build that yourself?

Comment 24 Richard Marko 2011-12-05 17:16:40 UTC
(In reply to comment #23)
> Richard, do you need a scratch kernel built to test with this, or can you build
> that yourself?

Please build it for me. Thanks.

Comment 25 Josh Boyer 2011-12-05 17:36:23 UTC
(In reply to comment #24)
> (In reply to comment #23)
> > Richard, do you need a scratch kernel built to test with this, or can you build
> > that yourself?
> 
> Please build it for me. Thanks.

http://koji.fedoraproject.org/koji/taskinfo?taskID=3564158

When the above completes, please test.

Comment 26 Richard Marko 2011-12-08 10:41:48 UTC
(In reply to comment #25)
> http://koji.fedoraproject.org/koji/taskinfo?taskID=3564158
> 
> When the above completes, please test.

This build fixes the issue.

Comment 27 Neil Horman 2011-12-08 12:05:02 UTC
Ok, Oleg, it fixes the problem, how do you want to handle it from here?  Since you've been participating more on this upstream, I'll just reassign this to you.

Comment 28 Oleg Nesterov 2011-12-08 17:18:33 UTC
(In reply to comment #27)
> Ok, Oleg, it fixes the problem, how do you want to handle it from here?

As I said, I am going to send the patch(es) upstream first. Unfortunately,
not immediately, I am a bit busy now. I hope I can do this before the
next 3.3 merge window.

> Since
> you've been participating more on this upstream, I'll just reassign this to
> you.

OK.

Comment 29 Josh Boyer 2011-12-08 17:38:11 UTC
Since this is still broken upstream (and has been for a while), let's move this bug to rawhide for now.  The fix will show up there first and we can look at backporting to various stable releases after that.

Comment 30 Josh Boyer 2012-02-02 17:27:57 UTC
(In reply to comment #28)
> (In reply to comment #27)
> > Ok, Oleg, it fixes the problem, how do you want to handle it from here?
> 
> As I said, I am going to send the patch(es) upstream first. Unfortunately,
> not immediately, I am a bit busy now. I hope I can do this before the
> next 3.3 merge window.

Out of curiosity, did this get fixed for 3.3?

Comment 31 Josh Boyer 2012-02-02 17:29:23 UTC
*** Bug 678807 has been marked as a duplicate of this bug. ***

Comment 32 Oleg Nesterov 2012-02-03 17:34:08 UTC
(In reply to comment #30)
> (In reply to comment #28)
> > (In reply to comment #27)
> > > Ok, Oleg, it fixes the problem, how do you want to handle it from here?
> > 
> > As I said, I am going to send the patch(es) upstream first. Unfortunately,
> > not immediately, I am a bit busy now. I hope I can do this before the
> > next 3.3 merge window.
> 
> Out of curiosity, did this get fixed for 3.3?

No, sorry :/ still on my TODO list.

Comment 33 Josh Boyer 2012-02-24 20:59:56 UTC
Moving this to ASSIGNED so it doesn't keep showing up as NEW.

Comment 37 Jeremy Fitzhardinge 2012-12-21 01:47:45 UTC
Has there been any work on this?  We're running into it.

Thanks, J

Comment 38 Josh Boyer 2013-05-16 20:24:10 UTC
Oleg, I think this should be fixed with 3.10-rc1, correct?  With the following commits?

dc7ee2aac830e5423f41de87d50441f138f648da coredump: change wait_for_dump_helpers() to use wait_event_interruptible()

079148b919d0c58b796f9ae98bdb53028dbcd5e7 coredump: factor out the setting of PF_DUMPCORE

528f827ee0bb508af5c9466562f474675540473e coredump: introduce dump_interrupted()

acdedd99b0f3bff9b4bb2103a6b1268c03d1f963 coredump: sanitize the setting of signal->group_exit_code

6cd8f0acae3420afce37bf51a9ff8c2c20342af5 coredump: ensure that SIGKILL always kills the dumping thread

403bad72b67d8b3f5a0240af5023adfa48132a65 coredump: only SIGKILL should interrupt the coredumping task

Comment 39 Oleg Nesterov 2013-05-17 12:13:30 UTC
Yes.

See also https://bugzilla.redhat.com/show_bug.cgi?id=773215#c10

We do not need all these commits if we only want to protect the
coredumping from non-SIGKILL signals.

Probably this bug should bug can be closed? (I'll backport these
changes for rhel).

Comment 40 Josh Boyer 2013-05-17 13:58:44 UTC
(In reply to comment #39)
> Yes.
> 
> See also https://bugzilla.redhat.com/show_bug.cgi?id=773215#c10
> 
> We do not need all these commits if we only want to protect the
> coredumping from non-SIGKILL signals.

We have them all in rawhide because we just follow Linus' tree ;).

> Probably this bug should bug can be closed? (I'll backport these
> changes for rhel).

Yep, agreed.  Thanks!


Note You need to log in before you can comment on or make changes to this bug.