Red Hat Bugzilla – Bug 104800
pthread_cond_wait CPU loop on SMP i686 with NPTL
Last modified: 2016-11-24 10:28:38 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703
Description of problem:
The attached test program (pthread_cond_wait.c) is a timing test for
pthread_cond_wait(). When run on a dual processor i686 it hangs in a
CPU loop after about 600,000 iterations. I have seen the problem in
Severn: glibc-2.3.2-57, kernel 2.4.21-20.1.2024.2.1.nptlsmp, and in Severn
with some Rawhide: glibc-2.3.2-82, kernel 2.4.22-1.2040.nptlsmp.
Severn plus Rawhide fails less frequently than Severn; the program
sometimes emerges from its CPU loop after a delay of many seconds.
The program worked fine when run on a single processor i686, and when
run on the SMP i686 with LD_ASSUME_KERNEL=2.2.5.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. gcc -lpthread -o pthread_cond_wait pthread_cond_wait.c
2. ./pthread_cond_wait 1000000
Actual Results: Step 2 hangs in a CPU loop, as observed by top, with about 87%
system CPU and 12% user CPU.
Expected Results: Step 2 should run to completion, taking about 3 microseconds
per iteration on a dual processor 1 GHz Pentium III computer.
Created attachment 94623
This is Rainer Toebbicke's pthread_cond_wait() timing test program.
Compile and link with:
gcc -lpthread -o pthread_cond_wait pthread_cond_wait.c
Run with:
./pthread_cond_wait 1000000
where 1000000 is the number of iterations to be performed.
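The attachment itself is not reproduced in this report. For readers without access to it, the following is a minimal sketch of the kind of test described: two threads hand a turn back and forth through one mutex and one condition variable, so each iteration exercises pthread_cond_wait() and pthread_cond_signal() on both sides. This is not Rainer Toebbicke's program; the names run_ping_pong, partner, turn and handoffs are hypothetical and chosen only for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/time.h>

/* Shared state. `turn` says which thread may proceed next. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int turn;          /* 0 = driver thread, 1 = partner thread */
static long iterations;   /* number of round trips requested */

/* Partner thread: wait for its turn, then hand the turn back. */
static void *partner(void *arg)
{
    long i;

    (void)arg;
    pthread_mutex_lock(&lock);
    for (i = 0; i < iterations; i++) {
        while (turn != 1)                 /* loop guards against spurious wakeups */
            pthread_cond_wait(&cond, &lock);
        turn = 0;
        pthread_cond_signal(&cond);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/*
 * Hypothetical helper: perform n round trips and return how many completed,
 * or -1 if the partner thread could not be created. If usec_per_iter is not
 * NULL, it receives the average wall-clock time per round trip in microseconds.
 */
long run_ping_pong(long n, double *usec_per_iter)
{
    pthread_t tid;
    struct timeval start, end;
    long i, handoffs = 0;

    iterations = n;
    turn = 0;
    if (pthread_create(&tid, NULL, partner, NULL) != 0)
        return -1;

    gettimeofday(&start, NULL);
    pthread_mutex_lock(&lock);
    for (i = 0; i < n; i++) {
        turn = 1;                         /* give the turn to the partner */
        pthread_cond_signal(&cond);
        while (turn != 0)                 /* wait until it is handed back */
            pthread_cond_wait(&cond, &lock);
        handoffs++;
    }
    pthread_mutex_unlock(&lock);
    gettimeofday(&end, NULL);
    pthread_join(tid, NULL);

    if (usec_per_iter != NULL) {
        double usec = (end.tv_sec - start.tv_sec) * 1e6
                    + (end.tv_usec - start.tv_usec);
        *usec_per_iter = (n > 0) ? usec / n : 0.0;
    }
    return handoffs;
}
```

To use the sketch as a standalone test, call run_ping_pong() from a main() that parses the iteration count from argv[1] and prints the returned time per iteration, then build with gcc -pthread. On an affected system the call would presumably fail to return, as described above; on a working one it returns the requested count.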
I could reproduce this with glibc-2.3.2-71 on my box, but cannot with
glibc-2.3.2-91 (-90 includes NPTL locking changes).
I have retried the pthread_cond_wait test on an i686 SMP computer with
glibc-2.3.2-91 and kernel 2.4.22-1.2061.nptlsmp from Rawhide. The test now
hangs the system, in such a way that ping to the computer fails. (The test
computer is in the computer centre. I use ssh to login to it.)
That kernel is known to have an SMP scheduler bug.
Can you retry on a fixed kernel (or a Taroon one)?
I have retried the pthread_cond_wait test under Fedora Core test 0.94 with some
Rawhide: kernel 2.4.22-1.2082.nptlsmp, glibc-2.3.2-97. The test runs correctly.