753728 – ISE when postgresql is restarted and not spacewalk

Summary: ISE when postgresql is restarted and not spacewalk

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Spacewalk
Classification:	Community
Component:	Server
Sub Component:
Version:	1.5
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Assignee:	Michael Mráka
QA Contact:	Red Hat Satellite QA List
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	space16

Reported:	2011-11-14 10:10 UTC by pierre.casenove
Modified:	2011-12-22 16:49 UTC
CC List:	1 user
Fixed In Version:	spacewalk-java-1.6.98-1
Clone Of:
Environment:
Last Closed:	2011-12-22 16:49:39 UTC
Embargoed:

Attachments
TaskOMatic log (134.89 KB, application/octet-stream) 2011-11-14 10:11 UTC, pierre.casenove	no flags
Tomcat log (196.43 KB, application/octet-stream) 2011-11-14 10:12 UTC, pierre.casenove	no flags

Description pierre.casenove 2011-11-14 10:10:58 UTC

Description of problem:
If postgresql is restarted while tomcat is up and running, the connection to the database is never recovered.
Attached files: catalina.out and rhn_taskomatic_daemon.log


Version-Release number of selected component (if applicable): 1.5


How reproducible: always


Steps to Reproduce:
1. go to https://myserver/rhn and log on
2. service postgresql restart
3. refresh the page --> Error 500
  
Actual results:
error 500


Expected results:
Spacewalk should automatically recover the connection to the DB.


Additional info:
It seems that osa-dispatcher reconnects correctly (I am not sure from reading the logs).

Comment 1 pierre.casenove 2011-11-14 10:11:32 UTC

Created attachment 533483
TaskOMatic log

Comment 2 pierre.casenove 2011-11-14 10:12:03 UTC

Created attachment 533484
Tomcat log

Comment 3 pierre.casenove 2011-11-14 13:01:27 UTC

Additional tests:
In file /etc/rhn/default/rhn_hibernate.conf, the following parameter is set:
# test period value in seconds
hibernate.c3p0.idle_test_period=300

I just ran a test: 5 minutes (300 seconds) after the postgresql restart, the web UI is functional again. But TaskOMatic is still failing with the same error and needs a restart. It seems that TaskOMatic never flushes its connection. Does it use the c3p0 connection pool?

From my point of view, the idle_test_period value should be decreased. 5 minutes is quite a long period.
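
The recovery delay observed in this comment can be illustrated with a toy model. This is only a sketch, not c3p0 or Spacewalk code: `FakeServer`, `FakeConnection`, `ToyPool` and `first_working_second` are hypothetical names, and the model assumes that the pool validates its connection once every `idle_test_period` seconds and hands it out unchecked in between. It says nothing about why TaskOMatic fails to recover.

```python
class FakeServer:
    """Stand-in for the database server (hypothetical)."""

    def __init__(self):
        self.generation = 0  # bumped on every restart

    def restart(self):
        self.generation += 1


class FakeConnection:
    """Stand-in for a database connection (hypothetical, not a real driver)."""

    def __init__(self, server):
        self.server = server
        self.generation = server.generation  # server instance we connected to

    def is_alive(self):
        # A connection dies when the server it was opened against restarts.
        return self.generation == self.server.generation

    def query(self):
        if not self.is_alive():
            raise ConnectionError("connection closed by server restart")
        return "ok"


class ToyPool:
    """Pool that only validates its connection every idle_test_period seconds."""

    def __init__(self, server, idle_test_period):
        self.server = server
        self.idle_test_period = idle_test_period
        self.conn = FakeConnection(server)
        self.last_test = 0

    def tick(self, now):
        # Periodic idle test: replace the connection if it went stale.
        if now - self.last_test >= self.idle_test_period:
            self.last_test = now
            if not self.conn.is_alive():
                self.conn = FakeConnection(self.server)

    def checkout(self):
        # No validation here: a stale connection is handed out as-is.
        return self.conn


def first_working_second(idle_test_period, restart_at, horizon=1000):
    """Return the first second at or after the restart at which a query succeeds."""
    server = FakeServer()
    pool = ToyPool(server, idle_test_period)
    for now in range(1, horizon + 1):
        if now == restart_at:
            server.restart()
        pool.tick(now)
        if now >= restart_at:
            try:
                pool.checkout().query()
                return now
            except ConnectionError:
                pass  # the web UI would answer with an error 500 here
    return None
```

In this model `first_working_second(300, 1)` returns 300, which matches the roughly five-minute recovery seen in the test, while `first_working_second(30, 10)` returns 30. Lowering the period shortens the error window but does not remove it.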

Comment 4 Michael Mráka 2011-12-20 10:00:59 UTC

This issue should be fixed by 
commit 2d41929a4ae4fc62f6b4c46a77d8b00f6972269a
    753728 - test database connection prior running query

Comment 5 pierre.casenove 2011-12-20 10:08:51 UTC

Should the testConnectionOnCheckout parameter really be set? The Hibernate documentation doesn't recommend using it:
Must be set in c3p0.properties, C3P0 default: false.
Don't use it, this feature is very expensive. If set to true, an operation will be performed at every connection checkout to verify that the connection is valid. A better choice is to verify connections periodically using c3p0.idleConnectionTestPeriod.

Comment 6 Michael Mráka 2011-12-20 13:13:43 UTC

It's 5 ms per request, which is not that expensive. A periodic connection test either won't eliminate the ISE (with a long timeout) or will overload the server (with a short timeout).
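
The idea of testing a connection at checkout can be sketched with a toy model. This is not the code from the commit and not c3p0 itself: `FakeServer`, `FakeConnection`, `CheckoutTestingPool` and `overhead_ms` are hypothetical names, and the sketch assumes that a dead connection is simply replaced when the test fails. The 5 ms default is the per-request figure quoted in this comment.

```python
class FakeServer:
    """Stand-in for the database server (hypothetical)."""

    def __init__(self):
        self.generation = 0  # bumped on every restart

    def restart(self):
        self.generation += 1


class FakeConnection:
    """Stand-in for a database connection (hypothetical, not a real driver)."""

    def __init__(self, server):
        self.server = server
        self.generation = server.generation

    def is_alive(self):
        # A connection dies when the server it was opened against restarts.
        return self.generation == self.server.generation

    def query(self):
        if not self.is_alive():
            raise ConnectionError("connection closed by server restart")
        return "ok"


class CheckoutTestingPool:
    """Pool that validates its connection every time it is checked out."""

    def __init__(self, server):
        self.server = server
        self.conn = FakeConnection(server)
        self.tests_run = 0

    def checkout(self):
        # One validation per checkout; a stale connection never reaches the caller.
        self.tests_run += 1
        if not self.conn.is_alive():
            self.conn = FakeConnection(self.server)
        return self.conn


def overhead_ms(requests, cost_per_test_ms=5):
    """Total validation cost, assuming one test per request."""
    return requests * cost_per_test_ms
```

With this pool, a query issued right after `server.restart()` already succeeds, because the checkout replaces the dead connection first. The price is one test per request, so `overhead_ms(1000)` gives 5000 ms spread over 1000 requests.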

Comment 7 Milan Zázrivec 2011-12-22 16:49:39 UTC

Spacewalk 1.6 has been released.
