Bug 470870 - condor_schedd running out of file descriptors
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: grid
Hardware: All
OS: Linux
Priority: high
Severity: urgent
: 1.1
: ---
Assigned To: Ted Ross
Kim van der Riet
Reported: 2008-11-10 13:08 EST by Matthew Farrellee
Modified: 2009-02-04 11:05 EST
Doc Type: Bug Fix
Last Closed: 2009-02-04 11:05:00 EST

Description Matthew Farrellee 2008-11-10 13:08:05 EST
Description of problem:

The condor_schedd runs out of file descriptors and EXCEPTs in dprintf.

Version-Release number of selected component (if applicable):

7.0.4-0.4 (with qmf-plugins)

How reproducible:


Steps to Reproduce:
1. run the condor_schedd with plugins enabled
2. submit 512 jobs
3. remove 512 jobs

Actual results:

condor_schedd EXCEPTs with a message in the SchedLog: **** PANIC -- OUT OF FILE DESCRIPTORS at line 783 in dprintf.c

Expected results:

A working schedd...

Additional info:

This appears to only happen when the QMF plugins are loaded.

In that case, listing /proc/<schedd pid>/fd shows many open sockets. Running lsof | grep <schedd pid> lists the same sockets.
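
For anyone who wants a number rather than a directory listing, here is a minimal sketch of the same /proc check in Python. It assumes a Linux /proc filesystem and permission to read the target process's fd directory; the function name count_fds is hypothetical and is not part of Condor or qpid.

```python
import os


def count_fds(pid):
    """Return (total, sockets) for the open descriptors of a Linux process.

    Hypothetical helper: reads /proc/<pid>/fd, where each entry is a symlink
    whose target looks like "socket:[12345]" for a socket.
    """
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        total += 1
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets
```

Calling count_fds(<schedd pid>) before and after the submit/remove cycle should show whether the socket count is what is climbing.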
Comment 1 Matthew Farrellee 2008-11-10 13:15:22 EST
qpidc is r711740
Comment 2 Ted Ross 2008-11-10 13:25:21 EST
I can reproduce a similar symptom when running the example qmf-agent with no
available broker.

It appears that when the connection fails (connection-refused), the FD is not
reclaimed and is not reused.
Comment 3 Ted Ross 2008-11-10 13:41:46 EST
More specific information:

In the C++ client, when Connection.open() fails (i.e. throws an exception), it appears to leak a file descriptor.

Calling Connection.close() in the exception handler does not solve the problem.
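
As an illustration of the general pattern only (this is not the qpid client code, and try_connect is a hypothetical name), the sketch below shows why a failed connect needs its own cleanup: the descriptor is allocated when the socket is created, before the connect attempt, so the failure path has to release it. CPython would eventually close an unreferenced socket on its own; a raw descriptor in C or C++ gets no such help.

```python
import socket


def try_connect(host, port, timeout=1.0):
    """Attempt one TCP connection; return the socket, or None on failure.

    Hypothetical helper. The descriptor exists as soon as socket.socket()
    returns, so a failed connect() must close it explicitly.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect((host, port))
    except OSError:
        # Release the descriptor on the failure path (e.g. connection refused).
        sock.close()
        return None
    return sock
```

A retry loop built on a helper like this keeps a constant descriptor count no matter how long the broker stays down.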
Comment 4 Ted Ross 2008-11-11 15:18:36 EST
To verify, run the qmf-agent example with no running broker.  qmf-agent will continually attempt to connect to the broker.  Use "/usr/sbin/lsof | grep qmf" to see whether the number of file descriptors allocated to the qmf-agent process is increasing.  The number of FDs should remain constant.
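
The verification above can also be scripted. The following is a rough sketch, assuming Linux /proc and a known qmf-agent pid; sample_fd_counts and is_growing are hypothetical names, and the sample count and interval are arbitrary defaults.

```python
import os
import time


def sample_fd_counts(pid, samples=5, interval=1.0):
    """Return a list of open-descriptor counts for pid, taken interval seconds apart."""
    counts = []
    for i in range(samples):
        counts.append(len(os.listdir(f"/proc/{pid}/fd")))
        if i + 1 < samples:
            time.sleep(interval)
    return counts


def is_growing(counts):
    """True if the descriptor count rose between the first and last sample."""
    return counts[-1] > counts[0]
```

With the fix in place, is_growing(sample_fd_counts(<qmf-agent pid>)) should be False while the agent keeps retrying against a missing broker.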
Comment 7 errata-xmlrpc 2009-02-04 11:05:00 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

