Bug 523605 - slapd continues to open /lib64/libnspr4.so until it runs out of available file descriptors and stops accepting connections.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: openldap
Version: 11
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Jan Zeleny
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-09-16 06:02 UTC by Bryan Burke
Modified: 2010-05-28 08:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-05-28 08:35:12 UTC
Type: ---
Embargoed:


Attachments
Configuration file for slapd, in case its relevant. (1.15 KB, text/plain)
2009-09-16 06:09 UTC, Bryan Burke
strace of one of the threads (2.15 MB, application/octet-stream)
2009-10-15 17:52 UTC, Bryan Burke
The other thread's strace (2.14 MB, application/octet-stream)
2009-10-15 17:56 UTC, Bryan Burke
slapd traces and messages (637.40 KB, application/x-bzip2)
2009-12-15 20:07 UTC, Bryan Burke

Description Bryan Burke 2009-09-16 06:02:43 UTC
User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10_4_11; en) AppleWebKit/531.9 (KHTML, like Gecko) Version/4.0.3 Safari/531.9

I upgraded my machine from Fedora 10 to 11 (clean install) and put the files back in place. Then one day I noticed that my LDAP server had stopped responding to clients' queries. After looking into it, I found that it had run out of available file descriptors. The reason seems to be that it keeps opening this library, /lib64/libnspr4.so, and never closes it.

I'm running this on a virtual machine with this kernel version:
Linux 2.6.24-22-xen

Reproducible: Always

Steps to Reproduce:
1. Start slapd (normal options: "-u ldap -h ldap:///")
2. Wait.
Actual Results:  
slapd will eventually run out of file descriptors, which then causes it to stop accepting connections from clients (presumably the accept(2) system call fails). When I restart the process (service ldap restart), it's reset and fine again for a little while.

Expected Results:  
It shouldn't stop responding to clients.

None really. I have a pretty standard configuration file. I'll include it as an attachment for reference as soon as I figure out how (this is the first bug report I've filed).
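
For readers unfamiliar with the failure mode described above: a process that keeps opening files without closing them eventually reaches its per-process descriptor limit, after which open(2), and likewise accept(2), fail with EMFILE. The following is a minimal Python sketch of that mechanism, not slapd's code. The function name exhaust_descriptors and the limit of 64 are assumptions chosen purely for illustration.

```python
import errno
import os
import resource


def exhaust_descriptors(soft_limit=64):
    """Open /dev/null until the kernel refuses, then report what happened.

    Returns (number_of_files_opened, errno_of_the_failure).
    The soft RLIMIT_NOFILE is lowered temporarily and restored afterwards.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit allows.
    if hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    leaked = []
    try:
        while True:
            # Deliberately not closed inside the loop: this is the "leak".
            leaked.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return len(leaked), exc.errno
    finally:
        # Clean up so the calling process is left as it was found.
        for fd in leaked:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


if __name__ == "__main__":
    opened, err = exhaust_descriptors()
    print(opened, errno.errorcode.get(err, err))
```

In a long-running daemon the same limit is reached slowly, which matches the "wait" step in the report: nothing fails until the last free descriptor is gone.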

Comment 1 Bryan Burke 2009-09-16 06:09:15 UTC
Created attachment 361192
Configuration file for slapd, in case its relevant.

I've only removed access controls, default user information, and comments. Everything else should be intact.

Comment 2 Jan Zeleny 2009-10-08 07:43:52 UTC
I tried to reproduce the issue with no success. Could you please provide me with straces of the slapd threads, using the following command?

strace -Fffxvto slapd.strace slapd -u ldap -h ldap:///

Just keep the strace running long enough to see where and how those files are opened (you can watch lsof -u ldap, and once several new file descriptors for libnspr4.so have been opened you can end the tracing). Hopefully those straces will give us an idea of where those descriptors are opened.
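
The lsof check suggested here can also be scripted. This is a minimal sketch, assuming a Linux /proc filesystem and enough privileges to read the slapd process's fd directory. count_matching_fds and its proc_root parameter are hypothetical names introduced for this example, not part of any OpenLDAP tooling.

```python
import os


def count_matching_fds(pid, needle, proc_root="/proc"):
    """Count open descriptors of `pid` whose target path contains `needle`.

    Each entry in <proc_root>/<pid>/fd is a symlink to the opened file,
    so reading the link tells us what the descriptor refers to.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if needle in target:
            count += 1
    return count
```

Calling count_matching_fds(pid, "libnspr4.so") periodically with slapd's PID should show the number growing over time if the leak is present.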

Comment 3 Bryan Burke 2009-10-09 17:31:31 UTC
I've attached strace to an already-running slapd process with several of these files already open. If you'd like me to start a fresh process for the strace, just say so, and I'll do that instead.

Comment 4 Jan Zeleny 2009-10-12 07:36:51 UTC
Attaching strace to a running process is fine; the important thing is that the strace contains at least one call opening the library. If your current strace contains one, you can attach the result to this bug.

Comment 5 Bryan Burke 2009-10-15 17:52:33 UTC
Created attachment 364972
strace of one of the threads

This is the strace of one of the threads for the process. Note that there were 4 threads, but only two of them opened the libnspr4.so library. I'll be attaching the other strace momentarily.

Comment 6 Bryan Burke 2009-10-15 17:56:12 UTC
Created attachment 364973
The other thread's strace

Both this strace and the one in the previous attachment were attached to the process for about 1.5-2 days. During that time the number of open file descriptors for that library went up by about 75. If you need any other help from me or would like me to look into anything else, just let me know. I looked at the straces but don't have much experience reading them.

Comment 7 Bryan Burke 2009-10-15 17:57:09 UTC
I'd also like to apologize for how long it took to get those to you... I ran into some problems, but I think I've gotten everything now.

Comment 8 Jan Zeleny 2009-10-20 09:11:04 UTC
Thank you for the valuable info. I'm working on it, but it may take some time because I have other work as well. I'll keep you posted.

Comment 9 Bryan Burke 2009-10-25 17:59:30 UTC
Ok, thanks for your help. If you need or would like me to help in any way, just let me know.

Comment 10 Jan Zeleny 2009-12-08 14:36:44 UTC
I need another set of straces. They should be the same as before, but this time could you please attach the relevant part of the syslog as well? It may not be syslog (I'm not sure which log file it is), but I will need the log that contains records like this:
<167>Oct 13 02:12:14 slapd[15457] .....
<167>Oct 13 02:12:15 slapd[15457] .....
....

Hopefully this will give me some more valuable information. I might need debug output as well, but for now this should be enough; I will contact you if more data is needed.

Comment 11 Bryan Burke 2009-12-15 20:07:32 UTC
Created attachment 378615
slapd traces and messages

These are the output files from the strace. I used the same flags you gave me a while back. I've included all the files, and I ran "tail -F /var/log/messages" from the time I started the strace until I stopped it. I took a look, though, and there aren't any messages from slapd. We may need to turn debugging on, as you suggested....

Comment 12 Jan Zeleny 2009-12-16 08:44:56 UTC
Yes. I figured those log messages in the strace were sent to the syslog daemon, but perhaps not ...

Try to run in debug mode; I think -d 1 should do the trick. I need something that gives me a link between the strace and function entry/exit points, so I can see where in the program the library is opened. Also, I found the -s option, which can be given to strace so that longer strings are printed. It might be even more useful than debug mode. Check it out.
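
One way to look for the leak in the per-thread trace files is to pair each successful open of the library with a later close of the same descriptor. The sketch below assumes the usual strace line shape (an optional timestamp, then `open("path", flags) = fd` or `close(fd) = 0`). leaked_opens is a hypothetical helper written for this example. Note that strace -ff writes one file per thread while descriptors are shared by the whole process, so a close recorded in another thread's file will be missed unless the files are analysed together.

```python
import re

# Successful open()/openat(): a quoted path and a non-negative return value.
OPEN_RE = re.compile(
    r'\bopen(?:at)?\(.*?"(?P<path>[^"]*)".*\)\s*=\s*(?P<fd>\d+)\s*$'
)
# Successful close(): return value 0.
CLOSE_RE = re.compile(r'\bclose\((?P<fd>\d+)\)\s*=\s*0\b')


def leaked_opens(lines, needle):
    """Return sorted (fd, path) pairs opened on a path containing `needle`
    for which no later successful close() appears in `lines`."""
    tracked = {}
    for line in lines:
        line = line.rstrip("\n")
        opened = OPEN_RE.search(line)
        if opened:
            fd = int(opened.group("fd"))
            if needle in opened.group("path"):
                tracked[fd] = opened.group("path")
            else:
                # The fd number was reused for another file, so the earlier
                # descriptor must have been closed somewhere we did not see.
                tracked.pop(fd, None)
            continue
        closed = CLOSE_RE.search(line)
        if closed:
            tracked.pop(int(closed.group("fd")), None)
    return sorted(tracked.items())
```

Each trace file can be passed in directly as an open file object, since the function only iterates over lines; any descriptors it returns are candidates for the leak and give a position in the trace to compare against the debug output.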

Comment 13 Tom Hughes 2010-01-06 14:37:09 UTC
I believe this is a bug in crypt() which I've just reported as bug #552917.

Comment 14 Bryan Burke 2010-01-06 16:15:14 UTC
I'll keep an eye on that one to see what progress is made. Thanks for your help!

Comment 15 Bug Zapper 2010-04-28 10:24:00 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 16 Jan Zeleny 2010-05-27 11:52:14 UTC
Can you confirm this bug on F12 or F13? If yes, please change the Fedora version. Otherwise I'm going to close it.

Comment 17 Franky Van Liedekerke 2010-05-27 12:02:02 UTC
I can confirm this to be solved in F12

Comment 18 Jan Zeleny 2010-05-28 08:35:12 UTC
Perfect, thanks. I'm closing this bug then.

