Bug 104800 - pthread_cond_wait CPU loop on SMP i686 with NPTL
Summary: pthread_cond_wait CPU loop on SMP i686 with NPTL
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Red Hat Linux Beta
Classification: Retired
Component: glibc
Version: beta1
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-09-22 09:57 UTC by Julian Blake
Modified: 2016-11-24 15:28 UTC (History)

Fixed In Version: 2.3.2-90
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-10-03 11:09:57 UTC
Embargoed:


Attachments
pthread_cond_wait.c (2.62 KB, text/plain)
2003-09-22 10:11 UTC, Julian Blake

Description Julian Blake 2003-09-22 09:57:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

Description of problem:
The attached test program (pthread_cond_wait.c) is a timing test for 
pthread_cond_wait().  When run on a dual processor i686 it hangs in a 
CPU loop after about 600,000 iterations.  I have seen the problem in
Severn: glibc-2.3.2-57, kernel 2.4.21-20.1.2024.2.1.nptlsmp, and in Severn 
with some Rawhide: glibc-2.3.2-82, kernel 2.4.22-1.2040.nptlsmp.  
Severn plus Rawhide fails less frequently than Severn; the program 
sometimes emerges from its CPU loop after a delay of many seconds.

The program worked fine when run on a single processor i686, and when 
run on the SMP i686 with LD_ASSUME_KERNEL=2.2.5.

Version-Release number of selected component (if applicable):
glibc-2.3.2-57

How reproducible:
Sometimes

Steps to Reproduce:
1. gcc -lpthread -o pthread_cond_wait pthread_cond_wait.c
2. ./pthread_cond_wait 1000000
    

Actual Results:  Step 2 hangs in a CPU loop, as observed by top, with about 87%
system CPU and 12% user CPU.

Expected Results:  Step 2 should run to completion, taking about 3 microseconds
per iteration on a dual processor 1 GHz Pentium III computer.

Additional info:

Comment 1 Julian Blake 2003-09-22 10:11:55 UTC
Created attachment 94623
pthread_cond_wait.c

This is Rainer Toebbicke's pthread_cond_wait() timing test program.

Compile and link with:
gcc -lpthread -o pthread_cond_wait pthread_cond_wait.c

Run with:
./pthread_cond_wait 1000000
where 1000000 is the number of iterations to be performed.
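
The attachment itself is not reproduced in this report. As a rough
illustration only, a condition-variable timing test of this kind might look
like the sketch below. It is not Rainer Toebbicke's program: the struct
pingpong layout, the partner() thread function and run_pingpong() are all
hypothetical names, and the single mutex / single condition variable
ping-pong is an assumed design, not a description of the attached code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical shared state for a two-thread ping-pong. */
struct pingpong {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             turn;        /* 0 = driver's turn, 1 = partner's turn */
    long            iterations;  /* number of round trips to perform */
};

/* Partner thread: waits for its turn, then hands the turn back. */
static void *partner(void *arg)
{
    struct pingpong *pp = arg;

    pthread_mutex_lock(&pp->lock);
    for (long i = 0; i < pp->iterations; i++) {
        /* Re-check the predicate in a loop: wakeups may be spurious. */
        while (pp->turn != 1)
            pthread_cond_wait(&pp->cond, &pp->lock);
        pp->turn = 0;
        pthread_cond_signal(&pp->cond);
    }
    pthread_mutex_unlock(&pp->lock);
    return NULL;
}

/* Runs `iterations` round trips and returns the elapsed time in seconds. */
double run_pingpong(long iterations)
{
    struct pingpong pp;
    struct timespec start, end;
    pthread_t tid;
    int rc;

    pp.turn = 0;
    pp.iterations = iterations;
    rc = pthread_mutex_init(&pp.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&pp.cond, NULL);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, partner, &pp);
    assert(rc == 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    pthread_mutex_lock(&pp.lock);
    for (long i = 0; i < iterations; i++) {
        /* Hand the turn to the partner, then wait until it comes back. */
        pp.turn = 1;
        pthread_cond_signal(&pp.cond);
        while (pp.turn != 0)
            pthread_cond_wait(&pp.cond, &pp.lock);
    }
    pthread_mutex_unlock(&pp.lock);
    clock_gettime(CLOCK_MONOTONIC, &end);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&pp.cond);
    pthread_mutex_destroy(&pp.lock);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

A caller could take the iteration count from the command line and report
run_pingpong(n) / n * 1e6 as microseconds per round trip. A hang of the kind
described in this bug would show up as run_pingpong() never returning.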

Comment 2 Jakub Jelinek 2003-09-23 15:58:40 UTC
Could reproduce this with glibc-2.3.2-71 on my box, cannot with glibc-2.3.2-91
(-90 features NPTL locking changes).

Comment 3 Julian Blake 2003-09-25 10:15:18 UTC
I have retried the pthread_cond_wait test on an i686 SMP computer with
glibc-2.3.2-91 and kernel 2.4.22-1.2061.nptlsmp from Rawhide.  The test now
hangs the system, in such a way that ping to the computer fails.  (The test
computer is in the computer centre.  I use ssh to login to it.)

Comment 4 Jakub Jelinek 2003-09-26 14:08:43 UTC
That kernel is known to have an SMP scheduler bug.
Can you retry on a fixed kernel (or a Taroon one)?

Comment 5 Julian Blake 2003-10-03 10:33:00 UTC
I have retried the pthread_cond_wait test under Fedora Core test 0.94 with some
Rawhide: kernel 2.4.22-1.2082.nptlsmp, glibc-2.3.2-97.  The test runs correctly.

