Bug 54229 - Call to pthread_create() occasionally hangs, creating a defunct thread
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Jakub Jelinek
QA Contact: Aaron Brown
Reported: 2001-10-01 20:49 EDT by Need Real Name
Modified: 2007-04-18 12:37 EDT (History)
Last Closed: 2002-12-15 12:50:42 EST
Description Need Real Name 2001-10-01 20:49:51 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)

Description of problem:
The test case creates 30 threads, each of which prints a short message 
together with the value returned by pthread_self() (example output below). 
Every once in a while the program hangs, and when it does, it is always in 
the very first call to pthread_create. ps shows two threads: the main 
thread and a second, defunct one whose creation never completed because 
the call to pthread_create is hanging. Attaching gdb to the main thread's 
process, bt shows:

#0  0x400d58a5 in __sigsuspend (set=0xbffff650) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x4009c0d9 in __pthread_wait_for_restart_signal (self=0x400a4e80) at pthread.c:934
#2  0x4009c19f in __pthread_create_2_1 (thread=0xbffff7d0, attr=0x0,
    start_routine=0x8048d00 <routine>, arg=0x0) at restart.h:34
#3  0x08048e7f in main () at lock.cpp:64
#4  0x400c4177 in __libc_start_main (main=0x8048e60 <main>, argc=1, ubp_av=0xbffff86c,
    init=0x80489fc <_init>, fini=0x8049530 <_fini>, rtld_fini=0x4000e184 <_dl_fini>,
    stack_end=0xbffff85c) at ../sysdeps/generic/libc-start.c:129

Version-Release number of selected component (if applicable):
glibc-2.2.2-10

How reproducible:
Sometimes

Steps to Reproduce:
1. Make the executable from lt.cpp (below):
   g++ -g -pthread -o lt lt.cpp
(The problem occurs with and without debug information, and at every 
optimization level, including none.)

2. Run the program in a loop until it hangs. I usually use:
   while [ 1 ]; do ./lt; echo okay...; done

3. In another xterm session do a ps. You should see two threads for the lt 
executable. One of them is defunct.
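
The shell loop above stops silently when a run hangs. As a rough sketch 
(not part of the original report), a small watchdog can run the test 
binary with a time limit and report the hang explicitly. The helper name 
run_with_timeout, its return codes and the 10 ms polling interval are all 
assumptions made for illustration. It kills the hung child on timeout; to 
attach gdb as described in this report, one would leave the child running 
instead.

```cpp
// Hypothetical watchdog helper; names and return codes are illustrative.
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Monotonic clock reading in seconds.
static double now_sec()
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return ts.tv_sec + ts.tv_nsec / 1e9;
}

// Runs argv[0] with the given NULL-terminated argument vector.
// Returns the exit status if the child exits normally within timeout_sec,
// -1 if the child was still running at the deadline (it is then killed),
// and -2 on any other failure (fork error, death by signal, ...).
int run_with_timeout(char* const argv[], int timeout_sec)
{
  assert(argv != 0 && argv[0] != 0);

  pid_t pid = fork();
  if (pid < 0)
    return -2;
  if (pid == 0) {
    execvp(argv[0], argv);
    _exit(127);  // only reached if exec failed
  }

  const struct timespec tick = {0, 10 * 1000 * 1000};  // poll every 10 ms
  const double deadline = now_sec() + timeout_sec;
  for (;;) {
    int status = 0;
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == pid)
      return WIFEXITED(status) ? WEXITSTATUS(status) : -2;
    if (r < 0)
      return -2;
    if (now_sec() >= deadline) {
      kill(pid, SIGKILL);        // assumed policy: do not leave hung runs behind
      waitpid(pid, &status, 0);  // reap the killed child
      return -1;
    }
    nanosleep(&tick, 0);
  }
}
```

A caller would build an argument vector such as { "./lt", 0 }, call 
run_with_timeout in a loop, and stop at the first -1.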
	
Actual Results:  It may take several hundred runs for the problem to 
occur. When it does, the program hangs before producing any output. 
Attaching gdb to the hung program and examining its data shows that it is 
stuck in the very first call to pthread_create.

Expected Results:  This is the normal output:

Hello from routine, thread: 1026
Hello from routine, thread: 2051
Hello from routine, thread: 3076
Hello from routine, thread: 4101
Hello from routine, thread: 5126
Hello from routine, thread: 6151
Hello from routine, thread: 7176
Hello from routine, thread: 8201
Hello from routine, thread: 9226
Hello from routine, thread: 10251
Hello from routine, thread: 11276
Hello from routine, thread: 12301
Hello from routine, thread: 13326
Hello from routine, thread: 14351
Hello from routine, thread: 15376
Hello from routine, thread: 16401
Hello from routine, thread: 17426
Hello from routine, thread: 18451
Hello from routine, thread: 19476
Hello from routine, thread: 20501
Hello from routine, thread: 21526
Hello from routine, thread: 22551
Hello from routine, thread: 23576
Hello from routine, thread: 24601
Hello from routine, thread: 25626
Hello from routine, thread: 26651
Hello from routine, thread: 27676
Hello from routine, thread: 28701
Hello from routine, thread: 29726
Hello from routine, thread: 30751 

Additional info:

The test case is very short:

//------------- lt.cpp -------------
#include <pthread.h>
#include <stdio.h>

extern "C" void* routine(void*)
{
  printf("Hello from routine, thread: %lu\n", (unsigned long) pthread_self());
  return 0;
}

int main(void)
{
  const int threads = 30;
  pthread_t thread[threads];

  for(int i = 0; i < threads; ++i)
    pthread_create(&thread[i], 0, routine, 0);

  for(int i = 0; i < threads; ++i)
    pthread_join(thread[i], 0);
  
  return 0;
}
//------------- lt.cpp -------------
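
The test case ignores the return values of pthread_create and 
pthread_join. The variant below is a sketch, not the reporter's code, that 
checks them so a failed creation can be told apart from a hang. It cannot 
prevent the hang itself, which occurs inside pthread_create before it 
returns. The names checked_routine and run_checked, the limit of 64 
threads, and the cast of pthread_t to unsigned long (valid on Linux, not 
guaranteed elsewhere) are assumptions.

```cpp
// Hypothetical checked variant of lt.cpp; names are illustrative.
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

extern "C" void* checked_routine(void*)
{
  // Assumes pthread_t is an integer type, as it is on Linux.
  printf("Hello from routine, thread: %lu\n", (unsigned long) pthread_self());
  return 0;
}

// Creates `count` threads, joins every thread that was created, and
// returns how many were both created and joined successfully.
int run_checked(int count)
{
  const int max_threads = 64;
  assert(count > 0 && count <= max_threads);

  pthread_t thread[max_threads];
  bool created[max_threads] = {false};

  for (int i = 0; i < count; ++i) {
    // pthread_create returns an error number; it does not set errno.
    int err = pthread_create(&thread[i], 0, checked_routine, 0);
    if (err != 0)
      fprintf(stderr, "pthread_create #%d failed: %s\n", i, strerror(err));
    else
      created[i] = true;
  }

  int ok = 0;
  for (int i = 0; i < count; ++i) {
    if (!created[i])
      continue;  // never join a thread that was not created
    int err = pthread_join(thread[i], 0);
    if (err != 0)
      fprintf(stderr, "pthread_join #%d failed: %s\n", i, strerror(err));
    else
      ++ok;
  }
  return ok;
}
```

Calling run_checked(30) from main mirrors the original test; a return 
value below 30 indicates an error reported by the library rather than a 
hang.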


Platform:
$> uname -a
Linux macaroni 2.4.2-2smp #1 SMP Sun Apr 8 20:21:34 EDT 2001 i686 unknown

$> g++ -v
Reading specs from /package/1/compilers/gcc-2.96-81/bin/../lib/gcc-lib/i686-pc-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)

$> rpm -qa | grep glibc
glibc-2.2.2-10
glibc-devel-2.2.2-10
glibc-profile-2.2.2-10
glibc-common-2.2.2-10
compat-glibc-6.2-2.1.3.2

$> cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 10
model name	: Pentium III (Cascades)
stepping	: 1
cpu MHz	: 699.331
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug	        : no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1395.91

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 10
model name	: Pentium III (Cascades)
stepping	: 1
cpu MHz         : 699.331
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug	        : no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1395.91
Comment 1 Jakub Jelinek 2001-10-04 13:56:33 EDT
Can you reproduce it with
ftp://people.redhat.com/jakub/glibc/2.2.4-18/
? I cannot. My guess is this is similar to bug #43742.
Comment 2 Need Real Name 2001-10-08 14:20:35 EDT
I can still reproduce it using the suggested version of glibc:

$> rpm -qa | grep glibc
glibc-devel-2.2.4-18
glibc-common-2.2.4-18
compat-glibc-6.2-2.1.3.2
glibc-2.2.4-18
glibc-profile-2.2.4-18

The program still hangs intermittently. After attaching gdb to it, bt 
produces output that is essentially the same as before:

(gdb) bt
#0  0x40070b75 in __sigsuspend (set=0xbffff700) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x400361c9 in __pthread_wait_for_restart_signal (self=0x4003ef40) at pthread.c:969
#2  0x4003629c in __pthread_create_2_1 (thread=0xbffff880, attr=0x0,
    start_routine=0x8048560 <routine(void *)>, arg=0x0) at restart.h:34
#3  0x080485d7 in main () at lt.cpp:17
#4  0x4005e617 in __libc_start_main (main=0x804858c <main>, argc=1, ubp_av=0xbffff984,
    init=0x80483b0 <_init>, fini=0x8048680 <_fini>, rtld_fini=0x4000dcc4 <_dl_fini>,
    stack_end=0xbffff97c) at ../sysdeps/generic/libc-start.c:129

(gdb) frame 3
#3  0x080485d7 in main () at lt.cpp:17
19          pthread_create(&thread[i], 0, routine, 0);
Current language:  auto; currently c++

In addition, I have not been able to reproduce the problem on a single 
processor machine, only on a dual processor one.
Comment 3 Jakub Jelinek 2001-10-09 03:55:15 EDT
Ok, another guess: IA-32 SMP LDT bug in the kernel.
This was fixed in Linus' 2.4.8, or e.g. in Red Hat 2.4.7-10.
Can you please try it? Note that if this is the cause, the hangs should go
away even on 2.4.2 when running with LD_ASSUME_KERNEL=2.2.5.
