Bug 104800 - pthread_cond_wait CPU loop on SMP i686 with NPTL
Status: CLOSED RAWHIDE
Product: Red Hat Linux Beta
Classification: Retired
Component: glibc
Version: beta1
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Jakub Jelinek
QA Contact: Brian Brock
Reported: 2003-09-22 05:57 EDT by Julian Blake
Modified: 2016-11-24 10:28 EST
Fixed In Version: 2.3.2-90
Doc Type: Bug Fix
Last Closed: 2003-10-03 07:09:57 EDT

Attachments
pthread_cond_wait.c (2.62 KB, text/plain)
2003-09-22 06:11 EDT, Julian Blake

Description Julian Blake 2003-09-22 05:57:55 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

Description of problem:
The attached test program (pthread_cond_wait.c) is a timing test for 
pthread_cond_wait().  When run on a dual processor i686 it hangs in a 
CPU loop after about 600,000 iterations.  I have seen the problem in
Severn: glibc-2.3.2-57, kernel 2.4.21-20.1.2024.2.1.nptlsmp, and in Severn 
with some Rawhide: glibc-2.3.2-82, kernel 2.4.22-1.2040.nptlsmp.  
Severn plus Rawhide fails less frequently than Severn; the program 
sometimes emerges from its CPU loop after a delay of many seconds.

The program worked fine when run on a single processor i686, and when 
run on the SMP i686 with LD_ASSUME_KERNEL=2.2.5.

Version-Release number of selected component (if applicable):
glibc-2.3.2-57

How reproducible:
Sometimes

Steps to Reproduce:
1. gcc -lpthread -o pthread_cond_wait pthread_cond_wait.c
2. ./pthread_cond_wait 1000000
    

Actual Results:  Step 2 hangs in a CPU loop, as observed by top, with about 87%
system CPU and 12% user CPU.

Expected Results:  Step 2 should run to completion, taking about 3 microseconds
per iteration on a dual processor 1 GHz Pentium III computer.

Additional info:
Comment 1 Julian Blake 2003-09-22 06:11:55 EDT
Created attachment 94623
pthread_cond_wait.c

This is Rainer Toebbicke's pthread_cond_wait() timing test program.

Compile and link with:
gcc -lpthread -o pthread_cond_wait pthread_cond_wait.c

Run with:
./pthread_cond_wait 1000000
where 1000000 is the number of iterations to be performed.
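
For readers without access to the attachment, the following is a minimal sketch of what a pthread_cond_wait() handoff test of this kind might look like. It is not Rainer Toebbicke's program: the struct handoff type, the waiter() thread function and run_handoffs() are hypothetical names chosen for illustration. The sketch assumes two threads that pass a turn flag back and forth under one mutex and one condition variable, and that clock_gettime() with CLOCK_MONOTONIC is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical shared state; not taken from the attached test program. */
struct handoff {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             turn;       /* 0: main thread's turn, 1: waiter's turn */
    long            iterations; /* number of round trips to perform */
    long            received;   /* handoffs observed by the waiter thread */
};

static void *waiter(void *arg)
{
    struct handoff *h = arg;

    for (long i = 0; i < h->iterations; i++) {
        pthread_mutex_lock(&h->lock);
        while (h->turn != 1)    /* re-check the predicate after every wakeup */
            pthread_cond_wait(&h->cond, &h->lock);
        h->received++;
        h->turn = 0;            /* hand the turn back to the main thread */
        pthread_cond_signal(&h->cond);
        pthread_mutex_unlock(&h->lock);
    }
    return NULL;
}

/* Performs `iterations` round trips between two threads.
 * Returns the number of handoffs the waiter saw, or -1 on setup failure.
 * If elapsed_sec is non-NULL, the elapsed wall time in seconds is stored. */
long run_handoffs(long iterations, double *elapsed_sec)
{
    struct handoff h;
    struct timespec t0, t1;
    pthread_t tid;

    h.turn = 0;
    h.iterations = iterations;
    h.received = 0;
    if (pthread_mutex_init(&h.lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&h.cond, NULL) != 0) {
        pthread_mutex_destroy(&h.lock);
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (pthread_create(&tid, NULL, waiter, &h) != 0) {
        pthread_cond_destroy(&h.cond);
        pthread_mutex_destroy(&h.lock);
        return -1;
    }
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&h.lock);
        while (h.turn != 0)     /* wait until the waiter has handed back */
            pthread_cond_wait(&h.cond, &h.lock);
        h.turn = 1;             /* give the turn to the waiter */
        pthread_cond_signal(&h.cond);
        pthread_mutex_unlock(&h.lock);
    }
    pthread_join(tid, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (elapsed_sec != NULL)
        *elapsed_sec = (double)(t1.tv_sec - t0.tv_sec)
                     + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

    pthread_cond_destroy(&h.cond);
    pthread_mutex_destroy(&h.lock);
    return h.received;
}
```

A driver would call run_handoffs() with the iteration count from the command line and divide the elapsed time by that count to get a per-iteration figure. A run that never returns from run_handoffs() would correspond to the hang described in this report, although this sketch has not been checked against the affected glibc and kernel versions.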
Comment 2 Jakub Jelinek 2003-09-23 11:58:40 EDT
Could reproduce this with glibc-2.3.2-71 on my box, cannot with glibc-2.3.2-91
(-90 features NPTL locking changes).
Comment 3 Julian Blake 2003-09-25 06:15:18 EDT
I have retried the pthread_cond_wait test on an i686 SMP computer with
glibc-2.3.2-91 and kernel 2.4.22-1.2061.nptlsmp from Rawhide.  The test now
hangs the system, in such a way that ping to the computer fails.  (The test
computer is in the computer centre.  I use ssh to login to it.)
Comment 4 Jakub Jelinek 2003-09-26 10:08:43 EDT
That kernel is known to have an SMP scheduler bug.
Can you retry on a fixed kernel (or a Taroon one)?
Comment 5 Julian Blake 2003-10-03 06:33:00 EDT
I have retried the pthread_cond_wait test under Fedora Core test 0.94 with some
Rawhide: kernel 2.4.22-1.2082.nptlsmp, glibc-2.3.2-97.  The test runs correctly.
