Bug 1268816 - RHQ server connection leak
Summary: RHQ server connection leak
Keywords:
Status: VERIFIED
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: RHQ 4.14
Assignee: Thomas Heute
QA Contact: Filip Brychta
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-10-05 11:29 UTC by Filip Brychta
Modified: 2015-11-02 00:47 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:


Description Filip Brychta 2015-10-05 11:29:52 UTC
Description of problem:
There are many established connections from the RHQ server to Cassandra (port 9142), which results in a large number of open files. The user running the server eventually hits its ulimit, so that, for example, logging in to the machine over ssh as that user no longer works.
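
To confirm that the extra open files really are connections to the storage node, the lsof output can be filtered by remote port. The following is a minimal sketch, not part of the original report: the function name count_established is hypothetical, and it assumes lsof is run with -nP so that hosts and ports are printed numerically.

```shell
#!/bin/sh
# Hypothetical helper: count ESTABLISHED TCP connections to a remote port.
# Reads lsof output from stdin. Assumes numeric output (lsof -nP), where a
# TCP line ends like:
#   127.0.0.1:45678->127.0.0.1:9142 (ESTABLISHED)
count_established() {
    port="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, so mask
    # the exit status to keep the function usable under "set -e".
    grep -c -- "->[^ ]*:${port} (ESTABLISHED)" || true
}

# Example (replace <rhqServerPID> with the real PID):
#   lsof -nP -p <rhqServerPID> | count_established 9142
```

If this count grows at roughly the same rate as the total number of open files, the leak is on the Cassandra connections rather than on regular files.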

Version-Release number of selected component (if applicable):
4.14.0-SNAPSHOT

How reproducible:
Always

Steps to Reproduce:
1. Install and start RHQ (rhqctl install, rhqctl start).
2. Import resources and keep the server running.
3. Periodically check the number of open files for the RHQ server (lsof -p <rhqServerPID> | wc -l).
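
The periodic check in step 3 can be scripted. The sketch below is an assumption-laden illustration rather than the procedure used for this report: it counts entries in /proc/<pid>/fd (Linux only) instead of calling lsof, so the numbers will be lower than the lsof figures because lsof also lists memory-mapped files and similar entries. The function names fd_count and watch_fds are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of open file descriptors for a PID, read
# from /proc (Linux only; needs the same user as the process, or root).
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: print "<UTC timestamp> <count>" every <interval>
# seconds, <samples> times in total.
watch_fds() {
    pid="$1"; interval="$2"; samples="$3"
    i=0
    while [ "$i" -lt "$samples" ]; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(fd_count "$pid")"
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example: one sample per minute for an hour (replace <rhqServerPID>):
#   watch_fds <rhqServerPID> 60 60
```

A steadily rising second column, compared against the limit shown by ulimit -n for the server's user, indicates how long the server can run before it hits the limit.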


Actual results:
The number of open files for the RHQ server process keeps increasing until it hits the ulimit for the given user.

Expected results:
The number of open files does not increase over time.

Additional info:
After a clean installation there are 1202 open files (lsof -p <serverPID> | wc -l).
This number keeps increasing, so the ulimit for open files is hit within a few hours.
There are no exceptions in server.log.
This issue was introduced in the following RHQ build: http://hudson.qa.jboss.com/hudson/view/RHQ/job/rhq-master-gwt-locales/1570/

Comment 1 Libor Zoubek 2015-10-05 13:43:32 UTC
Fixed

branch:  master
link:    https://github.com/rhq-project/rhq/commit/7fb9222c8
time:    2015-10-05 15:41:16 +0200
commit:  7fb9222c80981fb876d8a7eea472304761f42555
author:  Libor Zoubek - lzoubek@redhat.com
message: Bug 1234912 - Do not authenticate against new storage node when
         replication_factor of system_auth keyspace is wrong

         Correctly close storage cluster session and fix scheduling
         interval of job
