469684 – Signals not delivered when tracing system calls

Bug 469684 - Signals not delivered when tracing system calls

Summary: Signals not delivered when tracing system calls

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	9
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Roland McGrath
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2008-11-03 14:42 UTC by Magnus Vesterlund
Modified:	2009-07-14 17:16 UTC (History)
CC List:	5 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2009-07-14 17:16:10 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Test case (1.20 KB, text/x-csrc) 2008-11-03 14:42 UTC, Magnus Vesterlund	no flags	Details
signal-loss.c testcase (4.83 KB, text/plain) 2008-11-18 14:56 UTC, Denys Vlasenko	no flags	Details
Testcase without looping. (7.18 KB, text/plain) 2008-11-20 00:30 UTC, Jan Kratochvil	no flags	Details
Testcase without looping #2 (6.84 KB, text/plain) 2008-11-20 16:12 UTC, Denys Vlasenko	no flags	Details
[PATCH] fix utrace_report_syscall_entry()->recalc_sigpending() logic (1.46 KB, patch) 2008-11-30 22:57 UTC, Oleg Nesterov	no flags	Details \| Diff
View All

Description Magnus Vesterlund 2008-11-03 14:42:41 UTC

Created attachment 322306 [details]
Test case

Description of problem:

When I trace system calls using ptrace on the Fedora kernel signals remain pending instead of being delivered to the traced process.

The attached program spawns a child process that sleeps forever in a futex wait. The parent process traces system calls in the child process while sending SIGINT signals to it. However, after getting a notification about a system call entry or exit (SYS_restart_syscall) the program hangs.

...
waitpid: 3439 -> 27f
waitpid: 3439 -> 27f
waitpid: 3439 -> 27f
waitpid: 3439 -> 857f

Looking at the /proc/<pid>/status file of the traced process the SIGINT signal sent to the process remains pending for no apparent reason instead of being delivered to the process:

$ cat /proc/3439/status
...
SigQ:   2/16372
SigPnd: 0000000000000000
ShdPnd: 0000000000000002
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
...

This does not happen if I replace PTRACE_SYSCALL with PTRACE_CONT, and it does not happen with a vanilla Linux kernel.


Version-Release number of selected component (if applicable):
kernel-2.6.26.6-79.fc9.x86_64
The FC10 beta kernels also have the same problem.

How reproducible:
Every time.

Steps to Reproduce:
1. Compile and run the attached program.

Actual results:
Program hangs with undelivered SIGINT signal to the traced process.

Expected results:
Continuous delivery of SIGINT signals to the traced process.

Comment 1 Magnus Vesterlund 2008-11-04 10:04:29 UTC

kernel-2.6.26.6-49.fc8.x86_64 also has this problem.

Comment 2 Denys Vlasenko 2008-11-18 14:56:10 UTC

Looks like a bug. Created "signal-loss" testcase and added it to

http://sourceware.org/systemtap/wiki/utrace/tests

Futexes are not essential for this bug, this:

  if (child == 0)
    while (1)
      pause ();

will also work.

The bug here is that when we enter pause(), kernel doesn't notice that there is a pending SIGINT, and it never returns from the syscall, as if signal is not there. Vanilla (non-utrace) kernels return right away, report syscall exit (SIGTRAP | 0x80) and then report SIGINT being delivered to the tracee.

Will attach testcase now. Build it with "gcc -Os -DTALKATIVE signal-loss.c".

Comment 3 Denys Vlasenko 2008-11-18 14:56:53 UTC

Created attachment 323908 [details]
signal-loss.c testcase

Comment 4 Jan Kratochvil 2008-11-18 15:43:23 UTC

IMO the testcase is racy.

You call there PTRACE_SYSCALL without waiting till the inferior stops.  At that moment you also asynchronously do kill(2) there.  kill(2) may come arbitrarily before or after the inferior stops.

In the case you kill(2) it after the inferior stopped due to PTRACE_SYSCALL two different signals are generated at that moment (SIGINT and SIGTRAP|0x80).  You should pick up both signals by waitpid(2) otherwise one is left pending and it all gets tricky.

Not sure how the testcase should be fixed as it is racy and after removing the race there is not much to test.

Tried to finally write down some notes on the ptrace(2) functionality based on the infrormation provided by Roland McGrath:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/PTRACE?cvsroot=systemtap
It is only a draft, there may be mistakes and it definitely should address more parts of the ptrace(2) functionality, please point out if anything is missing there.

Comment 5 Denys Vlasenko 2008-11-19 13:49:38 UTC

> You call there PTRACE_SYSCALL without waiting till the inferior stops.

I don't think so. As far as I can see, in the testcase ptrace(PTRACE_SYSCALL) is always called after waitpid().

> At that moment you also asynchronously do kill(2) there.  kill(2) may come arbitrarily before or after the inferior stops.

True. This is done deliberately. When you strace a program and it is being signaled by another process, you certainly can't control _when_ it will be signaled (when it is stopped or not).

> In the case you kill(2) it after the inferior stopped due to PTRACE_SYSCALL two
different signals are generated at that moment (SIGINT and SIGTRAP|0x80).  You
should pick up both signals by waitpid(2) otherwise one is left pending and it
all gets tricky.

Yes. And testcase does exactly that. It waitpid's and collects SIGINTs until it finally sees SIGTRAP|0x80. As you correctly point out, it's not determinable how many SIGINTs we will get before we see SIGTRAP|0x80. I saw from none to ~30.

What testcase is testing for is: as soon as we got SIGTRAP|0x80, the very next stop should be SIGTRAP|0x80 too, and next one should be SIGINT.

On buggy kernels, second waitpid (which has to return second SIGTRAP|0x80) does not return, it blocks.

Comment 6 Jan Kratochvil 2008-11-20 00:30:14 UTC

Created attachment 324123 [details]
Testcase without looping.

Thanks for the explanation, understood now.

Proposing this testcase instead - it does not need the loop there and it does not have any undeterministic behavior (ptrace vs. kill races, despite controlled in the testcases before).

Comment 7 Denys Vlasenko 2008-11-20 15:39:28 UTC

I verified that it still fails on Fedora.
But what you do with SA_RESTART needs better explanation. It took me circa 10 minutes to grok it.

  /* Send signal. It will not kill the child because it is being traced,
   * and all signals are reported to tracer first. We never pass back
   * the signal through PTRACE_SYSCALL, thus it never reaches the child.
>   * However, it must make child interrupt the pause() and report SIGINT
>   * to us.
^^^^^^^ delete this sentence, it's superfluous now
   * Still it should immediately report back to us SIGINT as would
   * happen if we would not be inside the pause(2) syscall by the first
   * PTRACE_SYSCALL above.
   */
  rc = kill (child, SIGINT);
  assert (rc == 0);



  /* Prepare an expected one-second timeout as SIGINT should get just enqueued
     when inside a syscall by PTRACE_SYSCALL.  We must clear SA_RESTARt to
     abort waitpid(2) by the SIGALRM handler invocation.  */

I did not understand the above even though I read it several times. I propose something like:

  /* At this point, waitpid should block, because tracee is stopped.
   * Sending signals to it should not wake it up.
   * Check that this is true. (We install NOP handler for SIGALRM
   * and make sure syscall restart is disabled (SA_RESTART is off).
   * waitpid will block for 1 second and SIGALRM will EINTR it). */

  alarm (1);
  memset (&act, 0, sizeof act);
  act.sa_handler = handler_nop;
  /* ACT.SA_FLAGS must have SA_RESTARt unset.  */
  rc = sigemptyset (&act.sa_mask);           <<-- superfluous
  assert (rc == 0);                          <<--
  rc = sigaction (SIGALRM, &act, &oldact);
  assert (rc == 0);
  assert (oldact.sa_handler == handler_fail);



More readable alternative is to setup SIGALRM handler with !SA_RESTART in main (to avoid clutter in testcase code), have this in the handler:

static volatile int timeout_is_a_known_bug;
static volatile int timeout_is_ok;

static void
handler_fail (int signo)
{
  if (signo == SIGALRM) {
    if (timeout_is_ok) return;
    if (timeout_is_a_known_bug) exit (1);
  }
...

Then you can replace entire sigaction dance with just 

  alarm (1)
  timeout_is_ok = 1;
  ...
  ...
  timeout_is_ok = 0;

Comment 8 Jan Kratochvil 2008-11-20 16:08:10 UTC

(In reply to comment #7)
>   memset (&act, 0, sizeof act);
>   act.sa_handler = handler_nop;
>   /* ACT.SA_FLAGS must have SA_RESTARt unset.  */
>   rc = sigemptyset (&act.sa_mask);           <<-- superfluous
>   assert (rc == 0);                          <<--

This is not superfluous as it would violate POSIX if it would be missing there:
http://www.opengroup.org/onlinepubs/009695399/functions/sigemptyset.html
# For such reasons, either sigemptyset() or sigfillset() must be called prior
# to any other use of the signal set, even if such use is read-only

Valid sigset_t may require to have some size field set etc., it is just
a coincidence the current GNU/Linux sigemptyset() implementation just clears it.


The other comment/code changes I find OK.  Also the Comment 5 in the code is obsolete now.  I primarily wanted to remove the PTRACE_SYSCALL vs. kill(2) race there (no matter if it had an effect on the actual result or not).

Comment 9 Denys Vlasenko 2008-11-20 16:12:16 UTC

Created attachment 324191 [details]
Testcase without looping #2

Comment 10 Denys Vlasenko 2008-11-28 14:11:36 UTC

I updated testcase in the utrace_tests cvs. Thanks Jan.

Comment 11 Oleg Nesterov 2008-11-30 22:52:10 UTC

Damn if I know what utrace does, just a wild guess...

The problem is, if we send SIGINT to TASK_TRACED task, the
signal is queued but TIF_SIGPENDING is not set (this is OK).

Then, we are waking the tracee by PTRACE_SYSCALL, it completes
the report_syscall logic, and re-enters sys_pause due to
-ERESTARTNOHAND. And this is not OK: it sleeps in TASK_INTERRUPTIBLE
with SIGINT pending and without signal_pending().

The subsequent kill(INT) can't help due to legacy_queue() check
in send_signal() (we can't queue INT twice).

int main(void)
{
        long res;
        int status;

        pid_t pid = fork();
        if (!pid) {
                pause();
                exit(0);
        }

        res = ptrace(PTRACE_ATTACH, pid, NULL, NULL);
        assert(res == 0);

        while (1) {
                res = waitpid(-1, &status, __WALL);
                assert(WIFSTOPPED(status));
                printf("waitpid: %ld -> %x %d\n", res, status, WSTOPSIG(status));

                res = ptrace(PTRACE_SYSCALL, pid, NULL, NULL);
                assert(res == 0);

                sleep(1);
                kill(pid, SIGINT);
        }
}

this program hangs after it recieves the SIGTRAP notification,
output:

   waitpid: 5137 -> 137f 19
   waitpid: 5137 -> 57f 5

I think this confirms the theory.

I'll send the patch in a minute. utrace_report_syscall_entry()
has a lenghty comment (which I can't parse), but the code
looks buggy: afaics we should recalc_sigpending() in any case.

Comment 12 Oleg Nesterov 2008-11-30 22:57:31 UTC

Created attachment 325158 [details]
[PATCH] fix utrace_report_syscall_entry()->recalc_sigpending() logic

Comment 15 Bug Zapper 2009-06-10 03:09:49 UTC

This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 16 Bug Zapper 2009-07-14 17:16:10 UTC

Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Note You need to log in before you can comment on or make changes to this bug.