#include <pthread.h>
#include <stdlib.h>
#include <ucontext.h>
#include <unistd.h>
#include <stdio.h>

__thread int tls;
static char stack[10 * 1024 * 1024];
static ucontext_t c;

/* Called via makecontext/setcontext. */
static void
cfn (void)
{
  exit (tls);
}

/* Called via pthread_create. */
static void *
tfn (void *dummy)
{
  /* The thread should still see this value after calling setcontext. */
  tls = 0;
  setcontext (&c);
  fprintf (stderr, "setcontext returned %m\n");
  /* The call to setcontext should not return. */
  abort ();
}

int
main ()
{
  pthread_t tid;

  /* The thread should not see this value. */
  tls = 1;
  if (getcontext (&c) < 0)
    abort ();
  c.uc_stack.ss_sp = stack;
#ifdef MAKECONTEXT_STACK_TOP
  c.uc_stack.ss_sp += sizeof stack;
#endif
  c.uc_stack.ss_flags = 0;
  c.uc_stack.ss_size = sizeof stack;
  c.uc_link = NULL;
  makecontext (&c, cfn, 0);
  if (pthread_create (&tid, NULL, tfn, NULL) != 0)
    abort ();
  if (pthread_join (tid, NULL) != 0)
    abort ();
  /* The thread should have called exit. */
  abort ();
}

The above test started failing on ppc32 somewhere between glibc-2.17-29.el7.ppc and glibc-2.17-32.el7.ppc. It is distilled from gcc configury, and the failure causes a gcc bootstrap error. The problem is that the setcontext call fails with -1/EINVAL. This is a blocker for building a fixed gcc with the asm goto fix.
This is a known problem and upstream thinks that the kernel should not reject the syscall (as it currently does) when the MSR_VSX bit is set. https://sourceware.org/ml/libc-alpha/2013-10/msg00117.html I don't know if there has been additional discussion around this on lkml or elsewhere. Maybe Carlos has more information since he was tracking this.
Can we revert the power7 change in glibc until this is resolved? This is really a blocker for me.

Is it really a kernel problem? I mean, if the 1184 size is smaller than what the kernel requires for saving/restoring the VSX state, then there are a few options:

- We need a new symbol version of the *context family for ppc32 (what about ppc64, does its context include the VSX state?).
- If the upper halves of the VSX registers (is that what is missing in *context?) are supposed to be call clobbered, then perhaps the *context family of functions should just explicitly clear the bit in the context that indicates the need to save or restore the VSX state.
- Perhaps *context.4 should always clear that bit, and new *context entry points with a larger struct ucontext would be the only ones to guarantee preservation of the upper part.
I've added a suggestion to BZ#6816. Thinking about it more, though: while it will work for getcontext/setcontext, swapcontext doesn't actually return to a place where it could clear the MSR_VSX bit in gregs[PT_MSR]. So presumably it should be setcontext/swapcontext that clear the bit in the new context. I guess they can't clear it in place, though, because they could be called with a ucontext from a signal handler. So perhaps getcontext should clear the bit upon success, and setcontext/swapcontext should check the bit in the new structure and, if it is set, take a slower path that copies the structure to a buffer on the stack, clears the bit there, and passes the address of the stack buffer to the syscall instead.

Or, if we are going to change the kernel, perhaps the kernel should clear the MSR_VSX bit when writing the old context in the swapcontext syscall with a size not big enough to store the state, and, if the MSR_VSX bit is set in the new context, silently assume the larger context size for the new context (but not for the old context). Then the only way to get a new context with the MSR_VSX bit set would be to pass in the ucontext from the signal frame, in which case it would be best to restore the VSX state.
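To make the "slower path" idea for setcontext/swapcontext concrete, here is a minimal sketch in portable C. It is not glibc code: `struct sketch_ucontext`, `sketch_prepare_context`, and the `SKETCH_*` constants are hypothetical stand-ins, and the values used for PT_MSR and the MSR_VSX bit are assumptions for illustration that would have to be taken from the real kernel headers.

```c
#include <string.h>

/* Hypothetical stand-ins for illustration only.  The real register
   index and MSR bit come from the kernel headers; the values here are
   assumptions, not verified definitions.  */
#define SKETCH_NGREG   48
#define SKETCH_PT_MSR  33
#define SKETCH_MSR_VSX (1UL << 23)

/* Simplified model of the part of the context that matters here.  */
struct sketch_ucontext
{
  unsigned long gregs[SKETCH_NGREG];
};

/* Return the context that should be handed to the syscall.
   Fast path: the VSX bit is already clear, so use the caller's
   structure directly.
   Slow path: the bit is set (for example, the context came from a
   signal frame), so copy it into caller-provided scratch space,
   normally a buffer on the stack, and clear the bit in the copy.
   The caller's structure is never modified.  */
static const struct sketch_ucontext *
sketch_prepare_context (const struct sketch_ucontext *ucp,
                        struct sketch_ucontext *scratch)
{
  if ((ucp->gregs[SKETCH_PT_MSR] & SKETCH_MSR_VSX) == 0)
    return ucp;

  memcpy (scratch, ucp, sizeof *scratch);
  scratch->gregs[SKETCH_PT_MSR] &= ~SKETCH_MSR_VSX;
  return scratch;
}
```

The point of copying rather than clearing in place is the signal handler case mentioned above: the caller's ucontext stays untouched, and only the copy passed to the syscall has the bit cleared.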
This is now fixed by using POWER6 code generation for the POWER7 and POWER8 32-bit runtimes. We still tune for POWER7 and POWER8 respectively. Once bug 1019549 is fixed we will revert this change and go back to POWER7 code generation for both runtimes.
This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.