We are experiencing a problem that is unique to Linux S390 systems running RedHat (xSeries systems running RedHat are OK). When the workloads start they issue a number of pthread_create calls (the number varies by the type of workload started). Each thread runs until it is terminated by the user or an error is encountered. When attempting to create the 103rd thread, pthread_create fails with a return code of 12 (ENOMEM). I've attached a C program that reproduces the failure. The function that is invoked by pthread_create is not long-lived (just a simple loop). I compile it with:

    gcc -lpthread -g -o mythread mythread.c

    #ifndef _OPEN_THREADS
    #define _OPEN_THREADS
    #endif
    #define _POSIX_SOURCE
    #define _POSIX_C_SOURCE 3
    #define __EXTENSIONS__
    #include <pthread.h>
    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct my_struct {
        pthread_t run_id;
    } my_struct;

    void *runit(void *arg);

    int main(int argc, char **argv)
    {
        int rc = 0;
        register unsigned int lp_cnt;
        my_struct parm;
        pthread_attr_t attr;
        pthread_t thread_id = pthread_self();

        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        for(lp_cnt = 1; lp_cnt < 0xffff; lp_cnt++) {
            parm.run_id = lp_cnt;
            rc = pthread_create(&thread_id, &attr, runit, (void *)&parm);
            if (rc != 0) {
                printf("bad rc %d from pthread_create on run_id %d\n", rc, lp_cnt);
                exit(-1);
            }
        }
    }

    void *runit(void *arg)
    {
        my_struct *parm = (my_struct *)arg;
        register unsigned int lp_cnt;

        printf("runit started for run_id %d\n", parm->run_id);
        for(lp_cnt = 0; lp_cnt < 0xffffffff; lp_cnt++);
        printf("runit ended for run_id %d\n", parm->run_id);
        fflush(stdout);
    }
The testcase is terribly buggy; e.g. parm is a single variable shared by all threads, while you are clearly relying on it being thread-local. Also, you don't tweak the thread stack size in any way (there is no pthread_attr_setstacksize call), so every thread gets the default stack size, which is huge. On s390, which has only 2GB of address space, the thread stack mappings quickly fill up the whole address space.
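
For illustration, here is a minimal sketch of how both points could be addressed. It is not the reporter's program: the names worker_arg, default_stack_size and spawn_workers are hypothetical, and the stack size is left to the caller (a value such as 256 KiB is only an assumption and has to be sized for whatever the real thread function does). Each thread gets its own heap-allocated argument instead of a pointer to one shared variable, and the stack size is set explicitly with pthread_attr_setstacksize before any thread is created.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-thread argument: one private copy per thread. */
typedef struct worker_arg {
    unsigned int run_id;
} worker_arg;

/* Thread function: uses its private argument, then frees it. */
static void *runit(void *p)
{
    worker_arg *arg = p;

    printf("runit started for run_id %u\n", arg->run_id);
    free(arg);
    return NULL;
}

/* Report the stack size a default-initialized attribute object carries,
 * or 0 if it cannot be queried. */
size_t default_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Create `count` detached threads, each with a stack of `stack_size`
 * bytes. Returns the number of threads actually created. */
int spawn_workers(unsigned int count, size_t stack_size)
{
    pthread_attr_t attr;
    pthread_t tid;
    unsigned int i;
    int rc;
    int created = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    /* Fails with EINVAL if stack_size is below the system minimum. */
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc != 0) {
        fprintf(stderr, "pthread_attr_setstacksize failed: %d\n", rc);
        pthread_attr_destroy(&attr);
        return 0;
    }

    for (i = 1; i <= count; i++) {
        /* Private copy, so a later loop iteration cannot overwrite it. */
        worker_arg *arg = malloc(sizeof *arg);
        if (arg == NULL)
            break;
        arg->run_id = i;

        rc = pthread_create(&tid, &attr, runit, arg);
        if (rc != 0) {
            fprintf(stderr, "bad rc %d from pthread_create on run_id %u\n",
                    rc, i);
            free(arg); /* the thread never started, so free it here */
            break;
        }
        created++;
    }

    pthread_attr_destroy(&attr);
    return created;
}
```

Calling default_stack_size() on the affected machine and multiplying the result by the number of threads gives a rough idea of how much address space the stacks alone reserve. A call such as spawn_workers(200, 256 * 1024) then shows whether a smaller stack gets past the point where thread creation previously failed.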