Bug 116192 - Unexplained sleeps in Berkeley DB
Summary: Unexplained sleeps in Berkeley DB
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: perl
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jason Vas Dias
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-02-18 21:09 UTC by DIanne Skoll
Modified: 2007-11-30 22:07 UTC (History)
0 users

Fixed In Version: perl-5.8.0-89.10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-11-10 19:04:26 UTC
Target Upstream Version:
Embargoed:


Description DIanne Skoll 2004-02-18 21:09:52 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030707

Description of problem:
There are unexplained sleeps in the Berkeley DB code, either in
perl-DB_File or in db4 itself.  I've attached a Perl script that takes
3 seconds to run on Red Hat 8, and over 3 minutes on Red Hat
Enterprise Linux 3 on the same hardware!


Version-Release number of selected component (if applicable):
db4-4.1.25-8

How reproducible:
Always

Steps to Reproduce:
Run the attached script on Red Hat 8 and Red Hat Enterprise Linux 3. 
Note the dramatic differences in speed.  Then do an strace on RHEL3
and notice unexplained select(0, NULL, NULL, NULL, {1, 0}) calls all
over the place.
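
The strace check can be sketched as a small shell helper that counts the one-second sleeps in a saved trace. This is only a sketch: the file names td.pl and td.trace and the function name count_sleeps are assumptions, not part of the original report, and newer strace versions may print the timeout argument in a different form than the one matched here.

```shell
# count_sleeps FILE: print how many one-second select() sleeps
# appear in an strace log (pattern as quoted in this report).
count_sleeps() {
    grep -c -F 'select(0, NULL, NULL, NULL, {1, 0})' "$1"
}

# Assumed usage on the affected machine (td.pl is a hypothetical
# name for the test script from this report):
#   time perl td.pl
#   strace -o td.trace -e trace=select perl td.pl
#   count_sleeps td.trace
```

A count close to the number of extra seconds in the wall-clock time would suggest that these sleeps account for the slowdown.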


Additional info:

Here's the Perl script:

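#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

# Insert 50000 key/value pairs into a fresh Berkeley DB hash file.
sub test {
    my %hash;
    tie %hash, 'DB_File', 'test.db'
        or die "Cannot tie test.db: $!";
    for (my $i = 0; $i < 50000; $i++) {
        $hash{$i} = 98765 - $i;
    }
    untie %hash;
}

unlink("test.db");   # start from an empty database
test();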

Comment 1 DIanne Skoll 2004-02-18 21:21:05 UTC
More useful info:  According to
http://www.sleepycat.com/update/4.2.52/if.4.2.52.html, the 4.2 release
fixed the following bug (search in the page):

10. Fix a bug where contention in the buffer pool could cause the
buffer allocation algorithm to unnecessarily sleep waiting for buffers
to be freed. [#7572]

Comment 2 Jeff Johnson 2004-06-10 01:35:36 UTC
db-4.2.52 is in fc2 and will be in RHEL4. RHL 8 is no longer supported.

Comment 3 DIanne Skoll 2004-06-10 11:26:13 UTC
So you won't fix it for RHEL3?  That's simply unacceptable.  We have
many customers who are paying for RHEL3 because they "get support",
and refusal to fix a serious bug is very bad.

Comment 4 Jeff Johnson 2004-06-10 21:44:09 UTC
Look, there is a promise of ABI compliance for the
entire RHEL3 lifetime. Changing from db-4.1.25 to db-4.2.52
just ain't going to happen because there are way too many
system (and RHEL3 customer) packages relying on that ABI guarantee.

I also have no direct evidence that db-4.2.52 fixes your problem.
Supplying a report that db-4.2.52 does fix your problem
will only help escalate a "fix".

Does perl linked with db-4.2.52 on RHEL3 solve your problem
or not? If so, I will attempt to escalate a "fix" into RHEL3.

Comment 5 DIanne Skoll 2004-06-11 01:22:21 UTC
I don't really care if you stick with db-4.1.25, as long as this
particular bug is fixed.

Run the script on RH8.  It works fine.  Run it on RHEL3.  It takes
forever.

Something is broken that needs fixing.  I mentioned db 4.2.52 because
I did some searches and thought that the info might help you narrow
down the problem.

The simple fact is that db-4.1.25 on a platform that Red Hat accepts
lots of money to support is broken.  I don't really care how it is
fixed, as long as it *is* fixed.

Regards,

David.


Comment 6 Jeff Johnson 2004-06-11 05:04:12 UTC
Here's fc2+ timings, 2.4GHz hyperthreaded p4, 2.6.5 kernel, db-4.2.52:
$ time ./td
 
real    0m3.771s
user    0m2.849s
sys     0m0.544s

Here's ~fc1 (closest I have to RHEL3 atm) timings, 700 MHz Dell,
2.4.22 kernel, db-4.1.25:
$ time ./td
 
real    3m12.770s
user    0m7.940s
sys     0m2.840s

Mmmm pugly performance hit. Setting LD_ASSUME_KERNEL=2.4.19 makes no difference.

Off to perl in case it's a known perl problem ... that's the
first step to get a fix prioritized.

Comment 7 Jason Vas Dias 2005-11-10 19:04:26 UTC
Very sorry for the long delay in processing this bug report.

I'm not sure what fixed it - we still use db4-4.1.25 - but this problem is
definitely gone in RHEL-3-U6 (kernel-2.4.21-27.EL, db4-4.1.25-8.1, 
perl-5.8.0-89.10, glibc-2.3.2-95.37), as shown by timed runs of your
test program (116192.pl):

$ time ./116192.pl

real    0m3.068s
user    0m2.740s
sys     0m0.280s
$ time ./116192.pl

real    0m3.069s
user    0m2.740s
sys     0m0.290s
$ time ./116192.pl

real    0m3.075s
user    0m2.720s
sys     0m0.290s

