Bug 753728 - ISE when postgresql is restarted and not spacewalk
Summary: ISE when postgresql is restarted and not spacewalk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Spacewalk
Classification: Community
Component: Server
Version: 1.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Michael Mráka
QA Contact: Red Hat Satellite QA List
URL:
Whiteboard:
Depends On:
Blocks: space16
Reported: 2011-11-14 10:10 UTC by pierre.casenove
Modified: 2011-12-22 16:49 UTC

Fixed In Version: spacewalk-java-1.6.98-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-22 16:49:39 UTC
Embargoed:


Attachments
TaskOMatic log (134.89 KB, application/octet-stream)
2011-11-14 10:11 UTC, pierre.casenove
Tomcat log (196.43 KB, application/octet-stream)
2011-11-14 10:12 UTC, pierre.casenove

Description pierre.casenove 2011-11-14 10:10:58 UTC
Description of problem:
If postgresql is restarted while tomcat is up and running, the connection to the database is never recovered.
Attached files: catalina.out and rhn_taskomatic_daemon.log


Version-Release number of selected component (if applicable): 1.5


How reproducible: always


Steps to Reproduce:
1. go to https://myserver/rhn and log on
2. service postgresql restart
3. refresh the page --> Error 500
  
Actual results:
error 500


Expected results:
Spacewalk should automatically recover the connection to the DB.


Additional info:
It SEEMS that osa-dispatcher reconnects correctly (I am not sure from reading the logs).

Comment 1 pierre.casenove 2011-11-14 10:11:32 UTC
Created attachment 533483
TaskOMatic log

Comment 2 pierre.casenove 2011-11-14 10:12:03 UTC
Created attachment 533484
Tomcat log

Comment 3 pierre.casenove 2011-11-14 13:01:27 UTC
Additional tests:
In file /etc/rhn/default/rhn_hibernate.conf, the following parameter is set:
# test period value in seconds
hibernate.c3p0.idle_test_period=300

I just ran a test: 5 minutes (300 seconds) after the postgresql restart, the webui is functional again. But TaskOMatic is still failing with the same error and needs a restart. It seems that TaskOMatic never flushes its connections. Does it use the c3p0 connection pool?

From my point of view, the idle_test_period value should be decreased. 5 minutes is quite a long period.
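
The 5-minute recovery window described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not c3p0's actual implementation: `FakeServer`, `FakeConnection`, and `TimedPool` are hypothetical names, time is passed in by hand, and the pool is assumed to validate idle connections only when the test period has elapsed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (not c3p0 code) of a pool that validates idle connections only on a timer. */
class IdleTestSketch {
    /** Hypothetical stand-in for the database; a restart invalidates all open connections. */
    static class FakeServer {
        int generation = 0;
        void restart() { generation++; }
    }

    /** Hypothetical stand-in for a JDBC connection tied to one server generation. */
    static class FakeConnection {
        final FakeServer server;
        final int generation;
        FakeConnection(FakeServer server) {
            this.server = server;
            this.generation = server.generation;
        }
        boolean isAlive() { return generation == server.generation; }
    }

    static class TimedPool {
        final FakeServer server;
        final long idleTestPeriodSeconds;
        final Deque<FakeConnection> idle = new ArrayDeque<>();
        long lastTestAt = 0;

        TimedPool(FakeServer server, long idleTestPeriodSeconds, int size) {
            this.server = server;
            this.idleTestPeriodSeconds = idleTestPeriodSeconds;
            for (int i = 0; i < size; i++) idle.addLast(new FakeConnection(server));
        }

        /** Runs the idle test only once the configured period has elapsed. */
        void tick(long nowSeconds) {
            if (nowSeconds - lastTestAt < idleTestPeriodSeconds) return;
            lastTestAt = nowSeconds;
            int n = idle.size();
            for (int i = 0; i < n; i++) {
                FakeConnection c = idle.removeFirst();
                // Dead connections are replaced; live ones go back into the pool.
                idle.addLast(c.isAlive() ? c : new FakeConnection(server));
            }
        }

        /** Hands out a connection without checking it (simplified: it stays in the pool). */
        FakeConnection checkout() { return idle.peekFirst(); }
    }

    /** Returns true if a request at the given time gets a usable connection. */
    static boolean requestSucceeds(TimedPool pool, long nowSeconds) {
        pool.tick(nowSeconds);
        return pool.checkout().isAlive();
    }

    public static void main(String[] args) {
        FakeServer server = new FakeServer();
        TimedPool pool = new TimedPool(server, 300, 2);
        server.restart();                                   // database restarted at t=0
        System.out.println(requestSucceeds(pool, 10));      // false: stale connection handed out
        System.out.println(requestSucceeds(pool, 300));     // true: idle test has replaced it
    }
}
```

In this model every request between the restart and the next idle test fails, which matches the observation that the webui recovers only after 300 seconds.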

Comment 4 Michael Mráka 2011-12-20 10:00:59 UTC
This issue should be fixed by 
commit 2d41929a4ae4fc62f6b4c46a77d8b00f6972269a
    753728 - test database connection prior running query

Comment 5 pierre.casenove 2011-12-20 10:08:51 UTC
Should the testConnectionOnCheckout parameter really be set? Hibernate doesn't recommend using it:
Must be set in c3p0.properties, C3P0 default: false
Don't use it, this feature is very expensive. If set to true, an operation will be performed at every connection checkout to verify that the connection is valid. A better choice is to verify connections periodically using c3p0.idleConnectionTestPeriod.

Comment 6 Michael Mráka 2011-12-20 13:13:43 UTC
It's 5 ms per request, which is not that expensive. A periodic connection test either won't eliminate the ISE (in case of a long timeout) or will overload the server (in case of a short timeout).
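
The trade-off argued here can be sketched with a toy pool that validates each connection when it is checked out. This is an illustration only, not the code from the commit and not c3p0's implementation: `FakeServer`, `FakeConnection`, and `ValidatingPool` are hypothetical names, and the validation is a simple flag check rather than a real test query.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (not c3p0 code) of a pool that tests every connection at checkout. */
class CheckoutTestSketch {
    /** Hypothetical stand-in for the database; a restart invalidates all open connections. */
    static class FakeServer {
        int generation = 0;
        void restart() { generation++; }
    }

    /** Hypothetical stand-in for a JDBC connection tied to one server generation. */
    static class FakeConnection {
        final FakeServer server;
        final int generation;
        FakeConnection(FakeServer server) {
            this.server = server;
            this.generation = server.generation;
        }
        boolean isAlive() { return generation == server.generation; }
    }

    static class ValidatingPool {
        final FakeServer server;
        final Deque<FakeConnection> idle = new ArrayDeque<>();
        int validations = 0; // one validation per pooled connection handed out
        int replaced = 0;    // dead connections discarded at checkout

        ValidatingPool(FakeServer server, int size) {
            this.server = server;
            for (int i = 0; i < size; i++) idle.addLast(new FakeConnection(server));
        }

        /** Tests the pooled connection first; a dead one is dropped and replaced. */
        FakeConnection checkout() {
            FakeConnection c = idle.pollFirst();
            if (c != null) {
                validations++;
                if (c.isAlive()) return c;
                replaced++;
            }
            return new FakeConnection(server);
        }

        void checkin(FakeConnection c) { idle.addLast(c); }
    }

    public static void main(String[] args) {
        FakeServer server = new FakeServer();
        ValidatingPool pool = new ValidatingPool(server, 2);
        pool.checkin(pool.checkout());            // normal request before the restart
        server.restart();                         // database restarted
        FakeConnection c = pool.checkout();       // first request after the restart
        System.out.println(c.isAlive());          // true: the dead connection was replaced
        System.out.println(pool.validations);     // 2: every checkout paid for one test
    }
}
```

The cost shows up as one validation per checkout, while the benefit is that the first request after a restart already gets a live connection instead of waiting for the next periodic test.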

Comment 7 Milan Zázrivec 2011-12-22 16:49:39 UTC
Spacewalk 1.6 has been released.

