Description of problem:

The fork() function is incorrectly clearing the set of pending signals for the child process *after* a signal has been sent to the child process; consequently, the signal is never delivered.

Version-Release number of selected component (if applicable):
glibc 2.17

How reproducible:
Always (on my system) if the `usleep(100000)` is commented out.

Steps to Reproduce:
1. Build and execute the attached program.
2. Watch it hang.

Actual results:
The program hangs due to the following sequence of events:
1. The child process is fork()ed;
2. The SIGTERM is sent to the child process; and
3. The set of pending signals for the child process is cleared.

Expected results:
Program should immediately exit with status 0.

Additional info:
According to the POSIX standard at <http://pubs.opengroup.org/onlinepubs/9699919799/>, the following sequence of events should occur:
1. The child process is fork()ed;
2. The set of pending signals for the child process is cleared; and
3. The SIGTERM is sent to the child process.

In other words, the fork() should be atomic.

My platform is a 6-processor VMware Workstation 14 Player VM running CentOS 7.4.1708 with gcc 4.8.5 and glibc-2.17-196.el7_4.2.x86_64.
Steven, Thank you for the bug report. Could you please attach the example program?
Created attachment 1382626 [details]
Program to demonstrate the problem

Forgot to attach this. Sorry.
Created attachment 1477667 [details]
Reproducer using sigsuspend

Isn't this a bug in the implementation of sigpause? It unblocks the signal (which causes the handler to run if the signal is pending), and then suspends the calling thread.

This also affects sigsuspend, which is implemented in glibc as a direct system call, but the kernel (circa 4.18 sources) appears to have the same issue: first the signal is unblocked (via set_current_blocked), and then the waiting starts (the loop in sigsuspend).

In strace, it looks like this:

8311  clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fdc7101b790) = 8312
8311  rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fdc70aa26e0}, {sa_handler=0x400800, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fdc70aa26e0}, 8) = 0
8311  rt_sigprocmask(SIG_UNBLOCK, [TERM], [], 8) = 0
8311  rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
8311  rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
8311  kill(8312, SIGTERM) = 0
8311  wait4(-1,  <unfinished ...>
8312  --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=8311, si_uid=1000} ---
8312  rt_sigreturn({mask=[]}) = 0
8312  rt_sigsuspend(~[TERM RTMIN RT_1], 8

This race does not happen all the time; something like this is needed to reproduce it:

$ while strace -o log -f ./bz1535670-sigsuspend ; do date; done

I think we need a kernel fix first, and then we can debate whether to fix the obsolete sigpause function.
I submitted a clarification request to the Austin Group:

http://austingroupbugs.net/view.php?id=1201

I think that both kernel and glibc are conforming to the POSIX description as it is worded right now because the description contains the same race condition. It is not entirely clear to me whether that is by accident or deliberate, hence the need for clarification.
(In reply to Florian Weimer from comment #6)
> I submitted a clarification request to the Austin Group:
>
> http://austingroupbugs.net/view.php?id=1201
>
> I think that both kernel and glibc are conforming to the POSIX description
> as it is worded right now because the description contains the same race
> condition. It is not entirely clear to me if that is by accident or
> deliberate, hence the need for clarification.

I think that submitting a clarification to the Austin Group to get an official statement for POSIX is the right way forward. I agree that as of *today* the API as implemented is conforming.

Historically it has been the case that sigpause and sigsuspend do not atomically alter the signal mask and wait; therefore the parent is forced to follow a design that requires sending multiple signals until it knows the child has received at least one of those signals (this is not the same problem as the "unreliable semantics" that were present in the original System V).

The alternative design is to have the child's signal handler finish by exiting (since a sent signal is always caught at least once). This design must establish a happens-before relationship between the child exiting and the parent calling wait() to capture the child's exit status. If the happens-before relationship is not established, the parent may hang waiting for a child that is waiting for a signal.

As it stands, the test case is broken. Fixes include:

(a) Make the signal handler do all that is required and then exit.

static void
handleSigTerm(int sig)
{
  /* Do required shutdown... */
  exit (0);
}

or (b) Have the parent loop. I wrote this up quickly, just to make my point, and I haven't vetted this with anyone from my team yet. Looping involves a synchronized state between parent and child to know when the child reaches the exit() state. Transitions are achieved with signals. You may need one signal to transition, or more than one.
There is the opportunity for states to stack in the pipe, but eventually the parent will confirm the child has reached the exit state. The following example uses only pipe/write/read/close as additional functions, but you might use sigwaitinfo() in the child to do something like this safely with an event loop.

#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
#include <errno.h>

int output[2];
int input[2];

static void
handleSigTerm(int sig)
{
  /* Tell parent we handled the signal. */
  write (input[1], "x", 1);
}

int
main(void)
{
  ssize_t res;
  char buf[1];

  sighold(SIGTERM);
  sigset(SIGTERM, handleSigTerm);

  pipe(input);
  pipe(output);

  pid_t forkPid = fork();
  if (forkPid == 0)
    {
      char buf[1];
      close (input[0]);
      close (output[1]);
      /* Child process. */
      sigpause(SIGTERM);
      /* Tell parent we caught the signal. */
      read (output[0], buf, 1);
      write (input[1], &buf[0], 1);
      close (output[0]);
      close (input[1]);
      exit(0);
    }

  sigset(SIGTERM, SIG_DFL);
  sigrelse(SIGTERM);

  do
    {
      kill(forkPid, SIGTERM);
      res = write (output[1], "1", 1);
      res = read (input[0], buf, 1);
      printf ("read() = %c [%zd]\n", buf[0], res);
    }
  while (buf[0] != '1');

  /* The read of '1' establishes a happens-before that ensures the child
     is going to exit. */
  wait(NULL);
  return 0;
}

Notes:
- https://stackoverflow.com/questions/40592066/sigsuspend-vs-additional-signals-delivered-during-handler-execution/40900621#40900621
(In reply to Carlos O'Donell from comment #7)
> (a) Make the signal handler do all that is required and then exit.
>
> static void
> handleSigTerm(int sig)
> {
>   /* Do required shutdown... */
>   exit (0);
> }

Note that this should be “_exit (0);”; otherwise this will flush stdio streams and run atexit handlers, which can corrupt the state of the parent process.
(In reply to Florian Weimer from comment #6)
> I think that both kernel and glibc are conforming to the POSIX description
> as it is worded right now because the description contains the same race
> condition.

Forgive me, but I don't see how this statement from the POSIX standard for fork(2):

    The set of signals pending for the child process shall be initialized to the empty set.

can be interpreted to mean that the child process is allowed to receive a signal before its fork(2) returns.

I'll be very interested in the response from the Austin Group.
(In reply to Steven Emmerson from comment #9)
> (In reply to Florian Weimer from comment #6)
> > I think that both kernel and glibc are conforming to the POSIX description
> > as it is worded right now because the description contains the same race
> > condition.
>
> Forgive me, but I don't see how this statement from the POSIX standard for
> fork(2):
>
>     The set of signals pending for the child process shall be initialized to
>     the empty set.
>
> can be interpreted to mean that the child process is allowed to receive a
> signal before its fork(2) returns.

The above simply states that the set of pending signals is *not* inherited from the parent process. In the reproducer, the parent process does not have any pending signals, so this does not apply here.
(In reply to Florian Weimer from comment #10)
> > The set of signals pending for the child process shall be initialized to
> > the empty set.
>
> The above simply states that the set of pending signals is *not* inherited
> from the parent process.

I agree that the statement means that the set of pending signals is not inherited from the parent process; however, I interpret the statement to *also* mean that the child process *can rely on the set of pending signals being completely cleared*.

> In the reproducer, the parent process does not have
> any pending signals, so this does not apply here.

I don't see how the set of pending signals in the parent process has any bearing on the set of pending signals in the child process if the child's pending signals are always cleared. In other words, the parent's pending signals are irrelevant to the child.
(In reply to Steven Emmerson from comment #11)
> (In reply to Florian Weimer from comment #10)
> > > The set of signals pending for the child process shall be initialized to
> > > the empty set.
> >
> > The above simply states that the set of pending signals is *not* inherited
> > from the parent process.
>
> I agree that the statement means that the set of pending signals is not
> inherited from the parent process; however, I interpret the statement to
> *also* mean that the child process *can rely on the set of pending signals
> being completely cleared*.

Sure, but this doesn't make a difference here. What appears to happen is that the signal is unblocked, one signal is delivered (by running the handler), and then the process waits for another signal to arrive, which never happens in the reproducer. No pending signals are involved.
(In reply to Florian Weimer from comment #12)
> Sure, but this doesn't make a difference here. What appears to happen is
> that the signal is unblocked, one signal is delivered (by running the
> handler), and then the process waits for another signal to arrive, which
> never happens in the reproducer. No pending signals are involved.

How can a signal be delivered because it was unblocked and yet not have been pending?
(In reply to Steven Emmerson from comment #13)
> (In reply to Florian Weimer from comment #12)
> > Sure, but this doesn't make a difference here. What appears to happen is
> > that the signal is unblocked, one signal is delivered (by running the
> > handler), and then the process waits for another signal to arrive, which
> > never happens in the reproducer. No pending signals are involved.
>
> How can a signal be delivered because it was unblocked and yet not have been
> pending?

If the signal is unblocked, the pending state only exists for exploratory purposes as far as I can see. The signal is delivered immediately. POSIX says this:

“During the time between the generation of a signal and its delivery or acceptance, the signal is said to be "pending". Ordinarily, this interval cannot be detected by an application. However, a signal can be "blocked" from delivery to a thread.”
(In reply to Florian Weimer from comment #14) So, in the fork(2) function, the signal was blocked, then it was delivered, and the standard says that all pending signals are cleared. And you don't see anything wrong with that. I give up.
(In reply to Steven Emmerson from comment #15)
> (In reply to Florian Weimer from comment #14)
>
> So, in the fork(2) function, the signal was blocked, then it was delivered,
> and the standard says that all pending signals are cleared.

Let me try again: The signal necessarily arrives after the fork because it is targeted at the child process. It is delivered when it is unblocked as the first phase of the sigpause call (or it arrives after the unblocking; the difference is not observable). Then the second phase of the sigpause call begins, the waiting phase. At this point, there is no pending signal, so sigpause blocks.

With the usleep left in, the signal is always sent during the second phase of the sigpause call, at which point sigpause observes that the signal was delivered and returns.

I don't think the current non-atomic behavior, as implemented for sigpause in glibc and for sigsuspend in the kernel, is useful, but the POSIX wording allows it.
(In reply to Florian Weimer from comment #16)
> The signal necessarily arrives after the fork because it is targeted at the
> child process. It is delivered when it is unblocked as the first phase of
> the sigpause call (or it arrives after the unblocking; the difference is not
> observable). Then the second phase of the sigpause call begins, the waiting
> phase. At this point, there is no pending signal, so sigpause blocks.

How do you reconcile this sequence with the standard saying, regarding fork(), "The set of signals pending for the child process shall be initialized to the empty set."?

If your sequence is valid, then the standard shouldn't have mentioned this at all, because it makes no difference if pending signals are allowed to be delivered.
(In reply to Steven Emmerson from comment #17)
> (In reply to Florian Weimer from comment #16)
> > The signal necessarily arrives after the fork because it is targeted at the
> > child process. It is delivered when it is unblocked as the first phase of
> > the sigpause call (or it arrives after the unblocking; the difference is not
> > observable). Then the second phase of the sigpause call begins, the waiting
> > phase. At this point, there is no pending signal, so sigpause blocks.
>
> How do you reconcile this sequence with the standard saying that, regarding
> fork(), "The set of signals pending for the child process shall be
> initialized to the empty set."
>
> If your sequence is valid, then the standard shouldn't have mentioned this
> at all because it makes no difference if pending signals are allowed to be
> delivered.

The quoted sentence requires that the set of pending signals is not copied from the parent process, unlike other parts of the process state. In the execution sequence, there are no pending signals right after the fork, as required. But then the parent process sends a signal, and it becomes pending in the subprocess (either before or after the unblock in sigpause). This signal is delivered because sigpause unblocks the signal number. Due to the non-atomic nature of sigpause, the waiting begins after the delivery, but then the delivered signal is gone, so sigpause never returns.
(In reply to Florian Weimer from comment #18)
> This signal is delivered because sigpause unblocks the signal
> number. Due to the non-atomic nature of sigpause, the waiting begins after
> the delivery, but then the delivered signal is gone, so that sigpause never
> returns.

Ah! So the problem is sigpause(2). I didn't realize that. My bad.

I modified the test program to use modern functions (e.g., sigsuspend(2) instead of sigpause(2)) and it worked.
(In reply to Steven Emmerson from comment #19)
> (In reply to Florian Weimer from comment #18)
> > This signal is delivered because sigpause unblocks the signal
> > number. Due to the non-atomic nature of sigpause, the waiting begins after
> > the delivery, but then the delivered signal is gone, so that sigpause never
> > returns.
>
> Ah! So the problem is sigpause(2). I didn't realize that. My bad.

Oh. Yes, it's definitely not a problem in fork.

> I modified the test program to use modern functions (e.g., sigsuspend(2)
> instead of sigpause(2)) and it worked.

Are you sure? It seemed to me that it is just harder to trigger the race.
(In reply to Florian Weimer from comment #20)
> Are you sure? It seemed to me that it is just harder to trigger the race.

Yes. I just confirmed this in a discussion with other colleagues.

In a single-threaded process, the process will enter the kernel and, in an uninterruptible fashion, it will unblock the signal and then wait. Signals in the kernel are only ever handled as the task is returning from a syscall. During the return to userspace, the kernel checks for a signal to deliver and then calls into do_signal(), which eventually calls into the per-arch signal delivery mechanism. Therefore an external process (the parent) sending a signal to the child cannot interrupt the child's unmask+suspend. Only the child itself, returning from a syscall, can be called upon to handle a signal for itself.

Therefore the kernel atomically does unmask+suspend, and so sigsuspend(), which is a direct syscall, should work to solve the problem.

I'm sorry I didn't come up with this earlier. It took a while to reach out to some kernel people I knew to confirm this.
Should we retitle this to "Clarify atomicity semantics for sigpause in glibc?" Since it seems the kernel is atomic?
(In reply to Carlos O'Donell from comment #21)
> In a single-threaded process, the process will enter the kernel and, in an
> uninterruptible fashion, it will unblock the signal and then wait.

Ah. I still see a hang, though. But the reason for that is that the original reproducer and my variant use sigset. sigset unblocks the signal, which is why it is never blocked in the subprocess, so it does not matter whether the glibc function has a race or not, because the race is inherent to the reproducer. Well, it helped us to find a problem in the sigpause implementation.

(In reply to Carlos O'Donell from comment #22)
> Should we retitle this to "Clarify atomicity semantics for sigpause in
> glibc?" Since it seems the kernel is atomic?

I think we can just remove the sigprocmask calls from the sigpause implementation and call sigsuspend directly in the sigpause implementation. I think sigsuspend already does what we are trying to achieve with sigprocmask. We can make this change upstream and close this bug as NOTABUG, I think.
(In reply to Florian Weimer from comment #20)
> Are you sure? It seemed to me that it is just harder to trigger the race.

I tested it a few thousand times and it always worked. Here's the code:

// This file tests the atomicity of sigpause(2)

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void handleSigTerm(int sig)
{
    puts("handleSigTerm() called"); // Commenting-out made no difference
}

#define USE_OBSOLETE_FUNCS 0

int main()
{
#if USE_OBSOLETE_FUNCS
    sighold(SIGTERM);
#else
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, SIGTERM);
    sigprocmask(SIG_BLOCK, &mask, NULL);
#endif

#if USE_OBSOLETE_FUNCS
    sigset(SIGTERM, handleSigTerm);
#else
    struct sigaction sigact = { 0 };
    sigact.sa_handler = handleSigTerm;
    sigaction(SIGTERM, &sigact, NULL);
#endif

    pid_t forkPid = fork();
    if (forkPid == 0) {
        // Child process
#if USE_OBSOLETE_FUNCS
        sigpause(SIGTERM);
#else
        sigdelset(&mask, SIGTERM);
        sigsuspend(&mask);
#endif
        exit(0);
    }

#if USE_OBSOLETE_FUNCS
    //usleep(100000); // Corrects non-atomicity of sigpause()
#endif
    kill(forkPid, SIGTERM);
    wait(NULL); // Hangs here if usleep() commented-out
    return 0;
}
(In reply to Florian Weimer from comment #23) > I think we can just remove the sigprocmask calls from the sigpause > implementation and call sigsuspend directly in the sigpause implementation. > I think sigsuspend already does what we are trying to achieve with > sigprocmask. We can make this change upstream and close this bug as > NOTABUG, I think. In the "Rationale" section of sigpause(2), the POSIX standard does say that sigpause(2) can be replaced with sigsuspend(2) -- so it would seem that the two functions should either work the same or that the glibc documentation for sigpause(2) should indicate that it's not conforming or, at least, doesn't work the same as sigsuspend(2).
(In reply to Florian Weimer from comment #23)
> (In reply to Carlos O'Donell from comment #21)
> > In a single-threaded process, the process will enter the kernel and, in an
> > uninterruptible fashion, it will unblock the signal and then wait.
>
> Ah. I still see a hang, though. But the reason for that is that the
> original reproducer and my variant use sigset. sigset unblocks the signal,
> which is why it is never blocked in the subprocess, so it does not matter if
> the glibc function has a race or not because the race is inherent to the
> reproducer. Well, it helped us to find a problem in the sigpause
> implementation.

Good catch. I missed that. Yes, sigset unblocks the signal.

> (In reply to Carlos O'Donell from comment #22)
> > Should we retitle this to "Clarify atomicity semantics for sigpause in
> > glibc?" Since it seems the kernel is atomic?
>
> I think we can just remove the sigprocmask calls from the sigpause
> implementation and call sigsuspend directly in the sigpause implementation.
> I think sigsuspend already does what we are trying to achieve with
> sigprocmask. We can make this change upstream and close this bug as
> NOTABUG, I think.

I don't know that we can. The old System V interface for sigpause() (__xpg_sigpause), which conflicts with the BSD definition of sigpause(), sets is_sig == 1 and has the semantics that it permanently (as opposed to temporarily, while you wait for the signal) changes the signal mask. The only solution would be a sigpause kernel syscall that does what the old standard requires? I'm not sure that would ever be supported by the upstream kernel.

I think the net result here is going to be:

* Document in POSIX that sigsuspend requires atomicity.
* Document in the glibc manual that sigpause does not atomically change the signal mask and then wait, and that because of that you may need to send multiple signals to the child.
I'm not happy to change the semantics of sigpause() at this point given that existing software may just be working around this issue and expect the permanent process signal mask change. Thoughts?
(In reply to Steven Emmerson from comment #25)
> (In reply to Florian Weimer from comment #23)
> > I think we can just remove the sigprocmask calls from the sigpause
> > implementation and call sigsuspend directly in the sigpause implementation.
> > I think sigsuspend already does what we are trying to achieve with
> > sigprocmask. We can make this change upstream and close this bug as
> > NOTABUG, I think.
>
> In the "Rationale" section of sigpause(2), the POSIX standard does say that
> sigpause(2) can be replaced with sigsuspend(2) -- so it would seem that the
> two functions should either work the same or that the glibc documentation
> for sigpause(2) should indicate that it's not conforming or, at least,
> doesn't work the same as sigsuspend(2).

My reading of the standards is that sigpause() unmasks the signal in question and leaves it unmasked. Contrast that with sigsuspend(), which only unmasks it for the duration of the pause and the handling of the signal, and then restores the signal mask.
I think that the issue is that POSIX changed sigpause here in Issue 5:

~~~
Issue 5

Moved from X/OPEN UNIX extension to BASE. The DESCRIPTION is updated to indicate that the sigpause() function restores the signal mask of the process to its original state before returning.
~~~

Where it moved into BASE, the text was updated to say it restores the original state before returning. The legacy glibc XPG implementation instead follows the original SUSv2 specification:

http://pubs.opengroup.org/onlinepubs/7908799/xsh/sigpause.html

So I think that you just have to be careful which version of sigpause() you request when building your code.
In summary:

- The standard isn't clear on whether sigpause and sigsuspend act atomically on the signal-mask change and suspend operations. A clarification was filed: http://austingroupbugs.net/view.php?id=1201
- The workaround is to use sigsuspend with a mask; since this is implemented in the kernel directly, it can atomically unmask the signal and wait without a race.
- The legacy XPG/SUSv2 implementation of sigpause in glibc, which takes a signal as the argument, is not atomic with respect to the unmask and suspend, so it does permit signals to arrive before the suspend, and that is apparently OK by the standards as they are written today. Workarounds include loop-and-retry patterns as explained in comment #7, or _exit from the signal handler.
- One solution is to implement sigpause in the kernel, but it's not likely there will be traction for fixing a legacy and obsolete API.

Does that clarify everything?
Red Hat Enterprise Linux 7 is entering Maintenance Phase Support 1 this year and only Urgent priority bug fixes will be considered. This issue is not urgent, so we are moving it to Red Hat Enterprise Linux 8 for further consideration.

Note that the Austin Group clarified the wording:

http://austingroupbugs.net/view.php?id=1201

It is now required that the operations are carried out atomically:

~~~
The sigsuspend( ) function shall atomically both replace the current signal mask of the calling thread with the set of signals pointed to by sigmask and suspend the thread until delivery of a signal whose action is either to execute a signal-catching function or to terminate the process.
~~~

~~~
The sigpause( ) function shall atomically both remove sig from the signal mask of the calling process and suspend the calling process until a signal is delivered whose action is either to execute a signal-catching function or to terminate the process.
~~~

Which means we should fix this upstream.
We are tracking this upstream now. I have assigned the upstream bug to myself and will try to drive this towards a conclusion in a reasonable timeframe.