Bug 822326 - Database keeps running out of connections
Summary: Database keeps running out of connections
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Spacewalk
Classification: Community
Component: Server
Version: 1.7
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Michael Mráka
QA Contact: Red Hat Satellite QA List
URL:
Whiteboard:
Depends On:
Blocks: space18
TreeView+ depends on / blocked
 
Reported: 2012-05-17 05:13 UTC by Dan Burkland
Modified: 2012-11-01 16:19 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-11-01 16:19:15 UTC
Embargoed:


Attachments (Terms of Use)

Description Dan Burkland 2012-05-17 05:13:11 UTC
Description of problem: Once I start up all of the Spacewalk services all of the available PostgreSQL database connections will be exhausted within an hour. I have expanded that limit to 400 however that hasn't shown to help at all. 


Version-Release number of selected component (if applicable): Spacewalk 1.7, PostgreSQL 8.4


How reproducible: Everytime I restart the services this issue reoccurs


Steps to Reproduce:
1. /etc/init.d/postgresql restart
2. spacewalk-service restart
3. Wait an hour and the available database connections will be exhausted
  
Actual results: Available database connections are exhausted


Expected results: The Spacewalk applications should close database connections properly


Additional info: It seems like other users are experiencing the same issue as seen in the "Spacewalk 1.7 Postgres out of connections" mailing list thread. Please let me know if you need me to provide any files or other pieces of information. Thanks!

Comment 1 Jan Pazdziora 2012-05-17 07:26:43 UTC
You say

   Available database connections are exhausted

What is the exact action / situation and what is the exact error you get?

Comment 2 Dan Burkland 2012-05-17 16:39:16 UTC
It seems after a clean restart Spacewalk will consume all available database connections no matter what is going on (I usually just restart the services, leave it, and then watch the database connection count sky rocketing via psql). I have tried increasing the max database connection setting in PostgreSQL but it will still use all of them too. The first sign of this failure is that osa-dispatcher will die an hour or so after the clean restart. At this point I cannot login to the PostgreSQL database via psql because it complains that there are no available connections. If you need screenshots let me know and I should be able to grab some for you when I get home tonight.

Regards,

Dan

Comment 3 Jan Pazdziora 2012-05-18 07:32:54 UTC
So, you say that the error is that osa-dispatcher dies. If you stop osa-dispatcher, will the other services (tomcat, httpd, taskomatic, search server) run undisrupted? In other words, is the osa-dispatcher the service causing the problem?

Comment 4 Tony Coffman 2012-05-18 19:08:34 UTC
I've got the same issue on my PRD Spacewalk instance

Spacewalk 1.7 on CentOS x86_64 Postgresql


It's happening less frequently for me.  Every 2-6 days it exhausts the available Postgres DB connections.

The issue for me started about 3 weeks after 1.7 was released.  We upgraded from 1.6 -> 1.7 with a couple of days of the announcement.

I've been restarting Spacewalk to clear the condition.  I haven't seen OSA dispatcher die but to be honest I haven't checked - I've just been restarting all of Spacewalk because it tends to happen after hours so haven't spent much time investigating yet.

Two other things going on with osad and osa-dispatcher that are likely related to this DB connection issue.

osad agents stopped connecting about the same time that this DB issue cropped up.  They were working immediately after the 1.7 upgrade but they hang for a while and then timeout during startup now.  I don't rely heavily on osad (because it's pretty flaky) so I haven't investigated this yet.

osa-dispatcher startup times seemed to deteriorate over time to the point where it was taking several minutes to startup.  I "solved" this by nuking the jabber DB and restarting spacewalk.  I did that last week and since then osa-dispatcher  startup times seem to be normal (i.e., nearly instant).

That strongly suggests that something is looping and creating new jabber entities.  Not sure how to check that.

I can try running without osa-dispatcher for a while to see if it affects the behavior but it would take a week or more to convince me that the issue is solved since it's fairly infrequent.

Comment 5 Dan Burkland 2012-05-19 07:49:44 UTC
I just did a clean restart of both the Spacewalk services & PostgreSQL and here is the DB connection count so far:

Sat May 19 01:47:38 CDT 2012 - 48 DB connections (clean restart happened just a few minutes before this was recorded)
Sat May 19 02:42:47 CDT 2012 - 195 DB connections

My max connections is set to 200 (due to the amount of memory I have on the box) so I will use up the connections within a few minutes.

Comment 6 Dan Burkland 2012-05-19 07:50:53 UTC
Here is the reoccurring error message that now appears in the postgresql-<day>.log file:

FATAL:  sorry, too many clients already

Comment 7 Dan Burkland 2012-05-20 23:12:24 UTC
After restarting all of the Spacewalk services and then stopping the "osa-dispatcher" service, I am only at 90 consumed connections a day later. 

Regards,

Dan

Comment 8 Dan Burkland 2012-05-23 03:28:10 UTC
After a few days I still eventually ran out of DB connections so I will now disable the Jabber services and see if they are truly the culprits.

Comment 9 Michael Mráka 2012-06-04 09:29:13 UTC
Well, on https://fedorahosted.org/spacewalk/wiki/OracleXeSetup says 
 alter system set processes = 400 ...
so as for PostgreSQL we very likely need to adjust number of connections as well.
Could you try to set
 max_connections = 600
in /var/lib/pgsql/data/postgresql.conf + service postgresql restart? (In PostgreSQL we use db_link for autonomous transaction so it should allow even higher number of connections than Oracle.)

Comment 10 Michael Mráka 2012-06-29 07:45:01 UTC
https://fedorahosted.org/spacewalk/wiki/PostgreSQLServerSetup has been modified and recommends increasing number of connections (max_connections = 600).

Comment 11 Jan Pazdziora 2012-10-30 19:24:01 UTC
Moving ON_QA. Packages that address this bugzilla should now be available in yum repos at http://yum.spacewalkproject.org/nightly/

Comment 12 Jan Pazdziora 2012-11-01 16:19:15 UTC
Spacewalk 1.8 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes18


Note You need to log in before you can comment on or make changes to this bug.