Description of problem: Once I start up all of the Spacewalk services all of the available PostgreSQL database connections will be exhausted within an hour. I have expanded that limit to 400 however that hasn't shown to help at all. Version-Release number of selected component (if applicable): Spacewalk 1.7, PostgreSQL 8.4 How reproducible: Everytime I restart the services this issue reoccurs Steps to Reproduce: 1. /etc/init.d/postgresql restart 2. spacewalk-service restart 3. Wait an hour and the available database connections will be exhausted Actual results: Available database connections are exhausted Expected results: The Spacewalk applications should close database connections properly Additional info: It seems like other users are experiencing the same issue as seen in the "Spacewalk 1.7 Postgres out of connections" mailing list thread. Please let me know if you need me to provide any files or other pieces of information. Thanks!
You say Available database connections are exhausted What is the exact action / situation and what is the exact error you get?
It seems after a clean restart Spacewalk will consume all available database connections no matter what is going on (I usually just restart the services, leave it, and then watch the database connection count sky rocketing via psql). I have tried increasing the max database connection setting in PostgreSQL but it will still use all of them too. The first sign of this failure is that osa-dispatcher will die an hour or so after the clean restart. At this point I cannot login to the PostgreSQL database via psql because it complains that there are no available connections. If you need screenshots let me know and I should be able to grab some for you when I get home tonight. Regards, Dan
So, you say that the error is that osa-dispatcher dies. If you stop osa-dispatcher, will the other services (tomcat, httpd, taskomatic, search server) run undisrupted? In other words, is the osa-dispatcher the service causing the problem?
I've got the same issue on my PRD Spacewalk instance Spacewalk 1.7 on CentOS x86_64 Postgresql It's happening less frequently for me. Every 2-6 days it exhausts the available Postgres DB connections. The issue for me started about 3 weeks after 1.7 was released. We upgraded from 1.6 -> 1.7 with a couple of days of the announcement. I've been restarting Spacewalk to clear the condition. I haven't seen OSA dispatcher die but to be honest I haven't checked - I've just been restarting all of Spacewalk because it tends to happen after hours so haven't spent much time investigating yet. Two other things going on with osad and osa-dispatcher that are likely related to this DB connection issue. osad agents stopped connecting about the same time that this DB issue cropped up. They were working immediately after the 1.7 upgrade but they hang for a while and then timeout during startup now. I don't rely heavily on osad (because it's pretty flaky) so I haven't investigated this yet. osa-dispatcher startup times seemed to deteriorate over time to the point where it was taking several minutes to startup. I "solved" this by nuking the jabber DB and restarting spacewalk. I did that last week and since then osa-dispatcher startup times seem to be normal (i.e., nearly instant). That strongly suggests that something is looping and creating new jabber entities. Not sure how to check that. I can try running without osa-dispatcher for a while to see if it affects the behavior but it would take a week or more to convince me that the issue is solved since it's fairly infrequent.
I just did a clean restart of both the Spacewalk services & PostgreSQL and here is the DB connection count so far: Sat May 19 01:47:38 CDT 2012 - 48 DB connections (clean restart happened just a few minutes before this was recorded) Sat May 19 02:42:47 CDT 2012 - 195 DB connections My max connections is set to 200 (due to the amount of memory I have on the box) so I will use up the connections within a few minutes.
Here is the reoccurring error message that now appears in the postgresql-<day>.log file: FATAL: sorry, too many clients already
After restarting all of the Spacewalk services and then stopping the "osa-dispatcher" service, I am only at 90 consumed connections a day later. Regards, Dan
After a few days I still eventually ran out of DB connections so I will now disable the Jabber services and see if they are truly the culprits.
Well, on https://fedorahosted.org/spacewalk/wiki/OracleXeSetup says alter system set processes = 400 ... so as for PostgreSQL we very likely need to adjust number of connections as well. Could you try to set max_connections = 600 in /var/lib/pgsql/data/postgresql.conf + service postgresql restart? (In PostgreSQL we use db_link for autonomous transaction so it should allow even higher number of connections than Oracle.)
https://fedorahosted.org/spacewalk/wiki/PostgreSQLServerSetup has been modified and recommends increasing number of connections (max_connections = 600).
Moving ON_QA. Packages that address this bugzilla should now be available in yum repos at http://yum.spacewalkproject.org/nightly/
Spacewalk 1.8 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes18