From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030707

Description of problem:
There are unexplained sleeps in the Berkeley DB code, either in perl-DB_File or in db4 itself. I've attached a Perl script that takes 3 seconds to run on Red Hat 8, and over 3 minutes on Red Hat Enterprise Linux 3 on the same hardware!

Version-Release number of selected component (if applicable):
db4-4.1.25-8

How reproducible:
Always

Steps to Reproduce:
Run the attached script on Red Hat 8 and Red Hat Enterprise Linux 3. Note the dramatic differences in speed. Then do an strace on RHEL3 and notice unexplained select(0, NULL, NULL, NULL, {1, 0}) calls all over the place.

Additional info:
Here's the Perl script:

    #!/usr/bin/perl
    use DB_File;

    sub test {
        my %hash;
        tie %hash, 'DB_File', 'test.db';
        my $i;
        for ($i=0; $i<50000; $i++) {
            $hash{$i} = 98765-$i;
        }
        untie %hash;
    }

    unlink("test.db");
    test();
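A select() call with no file descriptors and a {1, 0} timeout is simply a one-second sleep, so each such call should show up as a single store that takes about a second. The following is a minimal sketch, not part of the original report, that times every store and counts the slow ones, which gives a rough check without strace. The helper name count_slow_stores and the 0.5 second threshold are assumptions chosen for illustration, and the script falls back to a plain in-memory hash if DB_File is not installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);   # sub-second wall-clock timestamps

# Hypothetical helper (not from the original report): store $count keys
# into the hash behind $h and count the stores that take longer than
# $threshold seconds of wall-clock time.
sub count_slow_stores {
    my ($h, $count, $threshold) = @_;
    my $slow  = 0;
    my $start = time();
    for my $i (0 .. $count - 1) {
        my $t0 = time();
        $h->{$i} = 98765 - $i;          # same values as the original script
        $slow++ if time() - $t0 > $threshold;
    }
    return ($slow, time() - $start);
}

my $file = 'test.db';
unlink $file;

my %hash;
# Tie to DB_File when it is available; otherwise %hash stays a plain hash.
my $tied = eval { require DB_File; tie %hash, 'DB_File', $file };

my ($slow, $elapsed) = count_slow_stores(\%hash, 50000, 0.5);
untie %hash if $tied;
unlink $file;

printf "%d slow stores, %.2f s total\n", $slow, $elapsed;
```

If the slowdown really comes from one-second sleeps, the slow-store count on an affected system should be close to the total elapsed time in seconds, and zero on an unaffected one.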
More useful info: According to http://www.sleepycat.com/update/4.2.52/if.4.2.52.html, the 4.2 release fixed the following bug (search in the page): 10. Fix a bug where contention in the buffer pool could cause the buffer allocation algorithm to unnecessarily sleep waiting for buffers to be freed. [#7572]
db-4.2.52 is in fc2, will be in RHEL4. RHL 8 no longer supported.
So you won't fix it for RHEL3? That's simply unacceptable. We have many customers who are paying for RHEL3 because they "get support", and refusal to fix a serious bug is very bad.
Look, there is a promise of ABI compliance for the entire RHEL3 lifetime. Changing from db-4.1.25 to db-4.2.52 just ain't going to happen because there are way too many system (and RHEL3 customer) packages relying on that ABI guarantee. I also have no direct evidence that db-4.2.52 fixes your problem. Supplying a report that db-4.2.52 does fix your problem will only help escalate a "fix". Does perl linked with db-4.2.52 on RHEL3 solve your problem or not? If so, I will attempt to escalate a "fix" into RHEL3.
I don't really care if you stick with db-4.1.25, as long as this particular bug is fixed. Run the script on RH8. It works fine. Run it on RHEL3. It takes forever. Something is broken that needs fixing. I mentioned db 4.2.52 because I did some searches and thought that the info might help you narrow down the problem. The simple fact is that db-4.1.25 on a platform that Red Hat accepts lots of money to support is broken. I don't really care how it is fixed, as long as it *is* fixed. Regards, David.
Here's fc2+ timings, 2.4GHz hyperthreaded p4, 2.6.5 kernel, db-4.2.52:

    $ time ./td
    real 0m3.771s
    user 0m2.849s
    sys 0m0.544s

Here's ~fc1 (closest I have to RHEL3 atm) timings, 700 MHz Dell, 2.4.22 kernel, db-4.1.25:

    $ time ./td
    real 3m12.770s
    user 0m7.940s
    sys 0m2.840s

Mmmm pugly performance hit. LD_ASSUME_KERNEL=2.4.19, no change. Off to perl in case it's a known perl problem ... that's the first step to get a fix prioritized.
Very sorry for the long delay in processing this bug report. I'm not sure what fixed it - we still use db4-4.1.25 - but this problem is definitely gone in RHEL-3-U6 (kernel-2.4.21-27.EL, db4-4.1.25-8.1, perl-5.8.0-89.10, glibc-2.3.2-95.37), as shown by timed runs of your test program (116192.pl):

    $ time ./116192.pl
    real 0m3.068s
    user 0m2.740s
    sys 0m0.280s

    $ time ./116192.pl
    real 0m3.069s
    user 0m2.740s
    sys 0m0.290s

    $ time ./116192.pl
    real 0m3.075s
    user 0m2.720s
    sys 0m0.290s