Bug 1018072 - setcontext broken on ppc32
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glibc
Version: 7.0
Hardware: ppc Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Carlos O'Donell
QA Contact: Arjun Shankar
Depends On:
Blocks: 1017704
Reported: 2013-10-11 02:57 EDT by Jakub Jelinek
Modified: 2016-11-24 07:35 EST
Fixed In Version: glibc-2.17-34
Doc Type: Known Issue
Doc Text:
The 32-bit PowerPC runtimes, specifically the C library setcontext(), getcontext(), makecontext(), and swapcontext() functions, fail if an application uses the Vector-Scalar Extension (VSX) registers in any way that alters VSX state. The defect itself is a problem in the Linux kernel swapcontext() system call. To avoid this problem, do not compile with VSX enabled.
Last Closed: 2014-06-13 06:07:31 EDT
Type: Bug
Description Jakub Jelinek 2013-10-11 02:57:24 EDT
#include <pthread.h>
#include <stdlib.h>
#include <ucontext.h>
#include <unistd.h>
#include <stdio.h>

__thread int tls;

static char stack[10 * 1024 * 1024];
static ucontext_t c;

/* Called via makecontext/setcontext.  */

static void
cfn (void)
{
  exit (tls);
}

/* Called via pthread_create.  */

static void *
tfn (void *dummy)
{
  /* The thread should still see this value after calling
     setcontext.  */
  tls = 0;

  setcontext (&c);
  fprintf (stderr, "setcontext returned %m\n");
  /* The call to setcontext should not return.  */
  abort ();
}

int
main ()
{
  pthread_t tid;

  /* The thread should not see this value.  */
  tls = 1;

  if (getcontext (&c) < 0)
    abort ();

  c.uc_stack.ss_sp = stack;
#ifdef MAKECONTEXT_STACK_TOP
  c.uc_stack.ss_sp += sizeof stack;
#endif
  c.uc_stack.ss_flags = 0;
  c.uc_stack.ss_size = sizeof stack;
  c.uc_link = NULL;
  makecontext (&c, cfn, 0);

  if (pthread_create (&tid, NULL, tfn, NULL) != 0)
    abort ();

  if (pthread_join (tid, NULL) != 0)
    abort ();

  /* The thread should have called exit.  */
  abort ();
}

The test above started failing on ppc32 somewhere between glibc-2.17-29.el7.ppc and glibc-2.17-32.el7.ppc.  It is distilled from gcc configury and causes a gcc bootstrap error.  The problem is that the setcontext call fails with -1/EINVAL.

This is a blocker for building a fixed gcc with the asm goto fix.
Comment 3 Siddhesh Poyarekar 2013-10-11 03:16:19 EDT
This is a known problem and upstream thinks that the kernel should not reject the syscall (as it currently does) when the MSR_VSX bit is set.

https://sourceware.org/ml/libc-alpha/2013-10/msg00117.html

I don't know if there has been additional discussion around this on lkml or elsewhere.  Maybe Carlos has more information since he was tracking this.
Comment 4 Jakub Jelinek 2013-10-11 03:31:50 EDT
Can we revert the power7 change in glibc until this is resolved?  This is really a blocker for me.
Is it really a kernel problem?  I mean, if the 1184 size is smaller than what the kernel requires for saving/restoring the VSX state, then either we need a new symbol version of the *context family for ppc32 (what about ppc64? does that include the VSX state?), or, if the upper halves of the VSX registers (is that what is missing in *context?) are supposed to be call-clobbered, then perhaps the *context family of functions should just explicitly clear the bit in the context that indicates the need to save or restore VSX state.
Or perhaps *context@GLIBC_2.3.4 should always clear that bit, and new *context entry points with a larger struct ucontext would be the only ones to guarantee preservation of the upper part.
Comment 5 Jakub Jelinek 2013-10-11 04:23:35 EDT
I've added a suggestion to BZ#6816, though thinking about it more: while it will work for getcontext/setcontext, swapcontext doesn't actually return to a place where it could clear the MSR_VSX bit in gregs[PT_MSR].  So presumably it should be setcontext/swapcontext that clear the bit in the new context (but I guess they can't clear it in place, because they could be called with a ucontext from a signal handler).  So perhaps getcontext should clear the bit upon success, and setcontext/swapcontext should check the bit in the new structure and, if not set, go to a slower path that copies the structure to a buffer on the stack, clears the bit there, and passes the address of the stack buffer to the syscall instead.
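
(Editorial illustration, not part of the original comment.)  The copy-and-clear slow path described above could be sketched roughly as follows.  Everything here is a hypothetical stand-in: struct sketch_ctx replaces the real ppc32 ucontext_t, and the values used for PT_MSR (33), MSR_VSX (bit 23), and the register count (48) are assumptions that should be checked against the kernel and glibc headers.  The sketch also assumes the slow path is taken when the bit is found set in the caller's structure, since that is the case in which there is something to clear.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values; verify against the powerpc kernel/glibc headers.  */
#define SKETCH_PT_MSR   33
#define SKETCH_MSR_VSX  (1UL << 23)
#define SKETCH_NGREGS   48

/* Hypothetical stand-in for the ppc32 ucontext_t; only the general
   register array matters for this illustration.  */
struct sketch_ctx
{
  unsigned long gregs[SKETCH_NGREGS];
};

/* What getcontext would do after a successful save under the proposal:
   drop the MSR_VSX bit from the saved MSR.  */
static void
sketch_getcontext_fixup (struct sketch_ctx *ctx)
{
  assert (ctx != NULL);
  ctx->gregs[SKETCH_PT_MSR] &= ~SKETCH_MSR_VSX;
}

/* Choose the context pointer to hand to the syscall.  If the bit is
   clear, the caller's structure is passed through untouched (fast path).
   Otherwise the structure is copied into SCRATCH, which the caller is
   expected to place on its own stack, and the bit is cleared in the
   copy only, so a ucontext from a signal frame is never modified.  */
static const struct sketch_ctx *
sketch_ctx_for_syscall (const struct sketch_ctx *newctx,
                        struct sketch_ctx *scratch)
{
  assert (newctx != NULL && scratch != NULL);

  if ((newctx->gregs[SKETCH_PT_MSR] & SKETCH_MSR_VSX) == 0)
    return newctx;

  memcpy (scratch, newctx, sizeof *scratch);
  scratch->gregs[SKETCH_PT_MSR] &= ~SKETCH_MSR_VSX;
  return scratch;
}
```

A caller would declare a struct sketch_ctx on its stack, pass its address as the scratch argument, and give the returned pointer to the syscall in place of the original one.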

Or, if we are going to change the kernel, perhaps the kernel should clear the MSR_VSX bit when writing the old context in the swapcontext syscall with a size not big enough to store the state, and, if the MSR_VSX bit is set in the new context, silently assume a larger context size for the new context (but not the old context).  Then the only way to get a new context with the MSR_VSX bit set there would be if it is passed the ucontext from the signal frame, in which case it would be best to restore the VSX state.
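
(Editorial illustration, not part of the original comment.)  The kernel-side alternative amounts to two small decisions, which might look like the sketch below.  This is not kernel code: the function names are hypothetical, the MSR_VSX bit position (23) is an assumption, and the size a context needs in order to hold VSX state is left as a parameter because the thread does not give it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit position; verify against the powerpc kernel headers.  */
#define SKETCH_MSR_VSX (1UL << 23)

/* MSR value to store when writing the OLD context.  If the
   caller-supplied size cannot hold VSX state, the bit is cleared so the
   saved context never claims state it does not contain.  */
static unsigned long
sketch_old_ctx_msr (unsigned long msr, size_t ctx_size, size_t vsx_ctx_size)
{
  assert (vsx_ctx_size > 0);
  if (ctx_size < vsx_ctx_size)
    return msr & ~SKETCH_MSR_VSX;
  return msr;
}

/* Size to assume when reading the NEW context.  If the bit is set but
   the declared size is too small, silently treat the context as large
   enough for VSX state; otherwise trust the declared size.  */
static size_t
sketch_new_ctx_size (unsigned long msr, size_t ctx_size, size_t vsx_ctx_size)
{
  assert (vsx_ctx_size > 0);
  if ((msr & SKETCH_MSR_VSX) != 0 && ctx_size < vsx_ctx_size)
    return vsx_ctx_size;
  return ctx_size;
}
```

With the 1184 size mentioned in comment 4 as the declared size and any larger value as the VSX-capable size, the first function strips the bit from the saved MSR, and the second returns the larger size only when the incoming MSR has the bit set.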
Comment 13 Carlos O'Donell 2013-10-18 11:09:04 EDT
This is now fixed by using POWER6 code generation for the POWER7 and POWER8 32-bit runtimes. We still tune for POWER7 and POWER8 respectively. Once bug 1019549 is fixed we will revert this change and go back to POWER7 code generation for both runtimes.
Comment 18 Ludek Smid 2014-06-13 06:07:31 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
