Bug 129726 - hidden error in DB_File::untie causes file descriptor leak
Alias: None
Product: Fedora
Classification: Fedora
Component: perl   
Version: rawhide
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Warren Togami
Reported: 2004-08-12 07:08 UTC by Alexandre Oliva
Modified: 2007-11-30 22:10 UTC (History)

Doc Type: Bug Fix
Last Closed: 2005-08-22 05:00:51 UTC

Attachments:
spamd debugging output (32.78 KB, application/x-bzip2), 2004-08-15 20:17 UTC, Alexandre Oliva

Description Alexandre Oliva 2004-08-12 07:08:50 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2)

Description of problem:
[root@livre ~]# ps 3617
 3617 ?        S      2:17 spamd child
[root@livre ~]# lsof -p 3617
spamd   3617 root   87u   REG      253,0 10821632 2852859
spamd   3617 root   88u   REG      253,0 10821632 2852859
spamd   3617 root   89u   REG      253,0 10821632 2852859
spamd   3617 root   90u   REG      253,0 10821632 2852859
spamd   3617 root   91u   REG      253,0 10821632 2852859

Same for all other spamd processes, without any e-mail delivery having
taken place for the past several minutes.

I don't think it should be keeping this file open, especially not so
many times.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Feed a lot of email for yourself to spamc
2. Run lsof on the spamd processes

Actual Results:  Lots of descriptors pointing to bayes_toks.

Expected Results:  None, unless mail is being delivered.

Additional info:

I suspect it might eventually exhaust the number of file descriptors
available and start failing to check for spam, especially because when
spamd fails spamc doesn't fall back to spamassassin as I'd hope.

Comment 1 Justin Mason 2004-08-12 16:50:03 UTC
does the number of open fds increase as more messages are scanned?
also, could you attach the output of spamd with the -D switch?

Comment 2 Alexandre Oliva 2004-08-15 18:23:13 UTC
It grows as more messages are scanned, yes.  After being left running
overnight, which probably amounts to 2-3k messages, I had 451 open
descriptors pointing to bayes_toks (surely excessive even if it was
just for caching; more than one per process is absolutely pointless IMHO).

I checked that the local mail queue was empty and restarted the
spamassassin service.  At this point, I had 0 bayes_toks file
descriptors open.  Then, I got fetchmail running again, and it brought
in 51 messages.  After they had all been delivered, I had 41 bayes_toks
descriptors open among spamd processes.  I'll look into how to get spamd started
with the -D flag.
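
The growth described above can be tracked without reading through full
lsof output.  A minimal sketch, assuming a Linux /proc filesystem and
GNU readlink; count_fds_for is a hypothetical helper name, not part of
spamd or lsof:

```shell
# Print how many descriptors of process $1 point at the file $2.
# Hypothetical helper; relies on Linux /proc and GNU readlink -f.
count_fds_for() {
    pid=$1
    # Canonicalize the path so it compares equal to the /proc symlink targets.
    target=$(readlink -f -- "$2") || return 1
    n=0
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip entries we cannot read.
        link=$(readlink -- "$fd" 2>/dev/null) || continue
        if [ "$link" = "$target" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Run as root against each spamd PID (for example the PIDs printed by
pgrep spamd), passing the full path to your bayes_toks file.  A count
that keeps rising while the mail queue is empty indicates the leak.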

Comment 3 Justin Mason 2004-08-15 18:45:30 UTC
ok, that sounds bad.  also, could you check the versions of the
following packages:


Comment 4 Justin Mason 2004-08-15 18:47:50 UTC
btw, this probably exists as a bug upstream.  it might be better to
open an issue on http://bugzilla.spamassassin.org/ accordingly.

Comment 5 Theo Van Dinter 2004-08-15 18:53:10 UTC
I commented on this on 8-12, but my comment apparently never made it
into the ticket.  :(   What I wrote was:

This looks exactly like
http://bugzilla.spamassassin.org/show_bug.cgi?id=3326 and here's my
post explaining what I found:


In short, there's apparently a bug in DB_File/libdb which causes
untie() to fail internally, not throwing an error and also not closing
the fd.  Doing a "db_upgrade" or "db_dump|db_load" to upgrade the file
to the latest DB version fixed the issue for me.
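
The db_dump|db_load rebuild mentioned above could be scripted roughly
as follows.  This is a sketch only: rebuild_db is a hypothetical helper
name, the Berkeley DB utilities may be installed under versioned names
on some distributions, and spamd should be stopped first so nothing
writes to the file during the rebuild.  It dumps to an intermediate
file instead of a pipe so that a failed db_dump is noticed before
db_load runs.

```shell
# Rebuild a Berkeley DB file in place via dump and reload.
# Hypothetical helper; db_dump and db_load are the Berkeley DB utilities.
rebuild_db() {
    db=$1
    [ -f "$db" ] || { echo "no such file: $db" >&2; return 1; }
    # Keep an untouched copy of the original in case the reload is unusable.
    cp -p -- "$db" "$db.bak" || return 1
    # Dump every record to a flat text file, then load it into a fresh database.
    db_dump -f "$db.dump" "$db" || return 1
    db_load -f "$db.dump" "$db.new" || return 1
    # Swap the rebuilt file into place and drop the intermediate dump.
    mv -- "$db.new" "$db" && rm -f -- "$db.dump"
}
```

The original stays available as FILE.bak; check the ownership and
permissions of the rebuilt file before restarting spamd.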

Comment 6 Justin Mason 2004-08-15 18:58:52 UTC
So running "db_verify bayes_toks" may be interesting, based on what
Theo had seen in bug 3326:

<felicity> db_verify: Page 3981: non-empty page in unused hash bucket 3333
<felicity> db_verify: Page 0: page 1273 encountered a second time on
free list
<felicity> db_verify: DB->verify: bayes_seen: DB_VERIFY_BAD: Database
verification failed
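
Checking several database files in one pass might look like this.  A
sketch, assuming db_verify from the Berkeley DB utilities is on PATH
and exits nonzero when verification fails; verify_dbs is a hypothetical
helper name:

```shell
# Run db_verify on each file given; list the failures and
# return nonzero if any file did not verify.
verify_dbs() {
    command -v db_verify >/dev/null 2>&1 || { echo "db_verify not found" >&2; return 2; }
    bad=0
    for f in "$@"; do
        # -q suppresses the per-page error text; only the exit status is used.
        if ! db_verify -q "$f" >/dev/null 2>&1; then
            echo "FAILED: $f"
            bad=1
        fi
    done
    return "$bad"
}
```

Pass it both bayes_toks and bayes_seen; drop -q to see the individual
page errors for a file that fails.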

Comment 7 Alexandre Oliva 2004-08-15 20:17:42 UTC
Created attachment 102750
spamd debugging output

This is a log of the delivery of 15 e-mails, with spamd started with the
following arguments: -D -c -m1 -H

At the end, there were 9 file descriptors associated with my bayes_toks file.

Comment 8 Justin Mason 2004-08-15 20:42:33 UTC
looks a lot like what Theo found, then, since all the "untie-ing
db_toks" lines are there, valid, and do not indicate any errors from

could you try the db_verify operation?

Comment 9 Alexandre Oliva 2004-08-15 20:52:07 UTC
db_verify failed.  This may explain it.  I'm running sa-learn --import
to recreate the databases, then I'll restart spamd and see if it stops
leaking fds.  If so, this should probably get reassigned to perl-DBI.

# rpm -q perl perl-DBI db4 db4-devel

Hmm...  sa-learn --reassign didn't create a database that passed
db_verify like I hoped.  But db_dump|db_load did, so I'm going with
that.  I suppose the database corruption may have been caused by
faulty memory/kernel/firewire controller/whatever that has plagued my
desktop box.  I'll use my notebook for the next few days and verify
that the database remains consistent; I wouldn't blame the database
package for the corruption for now.

Comment 10 Alexandre Oliva 2004-08-15 20:53:50 UTC
Err...   I meant sa-learn --import.  '--reassign' was me thinking
about reassigning the bug report :-)

Comment 11 Alexandre Oliva 2004-08-15 20:57:45 UTC
Looks like this fixed it.  I'm guessing as to whether perl-DBI is the
component to blame.  Please reassign if you know any better.  Thanks
to those who helped track it down.

Comment 12 Justin Mason 2004-08-15 22:38:13 UTC
actually, it's just the main perl package -- the DB_File module is
part of that now.  reassigning

Comment 13 Warren Togami 2005-05-28 06:30:13 UTC
Is this still an issue in FC3, RHEL4, or FC4?

Comment 14 Warren Togami 2005-08-22 05:00:51 UTC
No response in 3 months, assuming fixed.  REOPEN if this is still an issue with
FC3+ or RHEL4.
