Bug 857132 - Scale-Up operation fails with JBoss and Postgres
Product: OpenShift Origin
Classification: Red Hat
Component: Containers
Hardware: Unspecified Linux
Priority: medium, Severity: high
Assigned To: Bill DeCoste
QA Contact: libra bugs
Reported: 2012-09-13 12:14 EDT by Skye Book
Modified: 2012-09-13 19:48 EDT (History)

Doc Type: Bug Fix
Last Closed: 2012-09-13 19:48:22 EDT
Type: Bug

Attachments
scale_events.log (1.54 MB, text/plain), 2012-09-13 13:42 EDT, Skye Book
Redeploying New Code Now Fails (2.46 KB, application/octet-stream), 2012-09-13 15:36 EDT, Skye Book
Local Gear's server.log (145.30 KB, application/octet-stream), 2012-09-13 15:46 EDT, Skye Book
Gear2 log (40.33 KB, text/plain), 2012-09-13 15:54 EDT, Skye Book

Description Skye Book 2012-09-13 12:14:19 EDT
Description of problem:
On a scaled JBoss application using Postgres, attempts to scale up by adding a gear fail consistently.  According to the logs, JBoss is unable to connect to the Postgres data store, and the new gear shows L7STS/404 under the LstChk column on the haproxy_status page.  My local/home gear continues to function normally, but the new gear never starts fully.

On recommendation from mmcgrath in IRC, I SSH'd into the failed slave gear and executed 'ctl_all restart'.  On completion, the gear was responding normally in haproxy_status and began to serve requests (and was connected successfully to Postgres).

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Run scaled application with Postgres database
2. Execute 'haproxy_ctld -u'
3. Monitor APP_URL/haproxy_status.  Before the haproxy_ctld command completes, the new gear will show as DOWN
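
For anyone who wants to watch step 3 without refreshing the page by hand, here is a minimal sketch that picks out the failing gears from HAProxy's CSV statistics export (the text HAProxy serves when ";csv" is appended to the stats URL). The function name, the sample proxy and gear names, and the trimmed column set are assumptions made for illustration; real output has many more columns, and fetching the CSV from APP_URL is left out.

```python
import csv
import io


def failing_servers(stats_csv):
    """Return (proxy, server, status, last_check) for each server row that is not UP."""
    # HAProxy prefixes the CSV header row with "# "; strip it so DictReader
    # sees plain column names.
    text = stats_csv.lstrip()
    if text.startswith("# "):
        text = text[2:]

    failing = []
    for row in csv.DictReader(io.StringIO(text)):
        # FRONTEND and BACKEND rows are aggregates, not individual gears.
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue
        status = row["status"]
        if status.startswith("UP") or status == "no check":
            continue
        # Combine check_status and check_code the way the stats page shows
        # them in its last-check column, e.g. "L7STS/404".
        last_check = row["check_status"]
        if row["check_code"]:
            last_check = f"{last_check}/{row['check_code']}"
        failing.append((row["pxname"], row["svname"], status, last_check))
    return failing


# Hypothetical, heavily trimmed sample; the names are made up for this sketch.
SAMPLE = (
    "# pxname,svname,status,check_status,check_code\n"
    "express,FRONTEND,OPEN,,\n"
    "express,local-gear,UP,L7OK,200\n"
    "express,gear-2,DOWN,L7STS,404\n"
    "express,BACKEND,UP,,\n"
)

for proxy, server, status, last_check in failing_servers(SAMPLE):
    print(f"{proxy}/{server} is {status} (last check: {last_check})")
```

Run against the sample, this reports only the gear that failed its layer 7 health check, which matches what the haproxy_status page shows for the new gear.
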
Actual results:

Expected results:

Additional info:
Comment 1 Bill DeCoste 2012-09-13 13:39:12 EDT
Hi Skye,

Could you please attach the scale_events.log file from your haproxy gear?

Thanks -Bill
Comment 2 Skye Book 2012-09-13 13:42:53 EDT
Created attachment 612529
scale_events.log

Attached is the scale_events log from my shared haproxy/app gear.
Comment 3 Skye Book 2012-09-13 15:36:52 EDT
Created attachment 612590
Redeploying New Code Now Fails

Now when I push to the repository, the app fails to deploy.  It hangs where the Maven build should be starting.

rhc app restart fails.

rhc app force-stop succeeds.

rhc app start fails with cartridge error 121.

More in the attached log.
Comment 4 Skye Book 2012-09-13 15:39:03 EDT
Correction to my description: the initial "rhc app restart" says it has succeeded but does not actually work :)
Comment 5 Skye Book 2012-09-13 15:46:51 EDT
Created attachment 612610
Local Gear's server.log

Here is the server.log from the last hour or so on the local gear.
Comment 6 Skye Book 2012-09-13 15:54:03 EDT
Created attachment 612611
Gear2 log

Log from gear 2
Comment 7 Bill DeCoste 2012-09-13 17:07:41 EDT
The error and related stack traces below are a known issue: database connections between the application and database gears are being dropped or invalidated. JBoss should recover and create a new connection. The warnings and stack traces are just (annoying) noise.

2012/09/13 11:43:26,714 WARN  [org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory] (http- Destroying connection that is not valid, due to the following exception: org.postgresql.jdbc4.Jdbc4Connection@23bd2a5d: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.
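
To illustrate the recovery behavior described above, here is a minimal sketch of a pool that validates a connection when it is handed out, destroys it if the check fails, and opens a fresh one instead. This is not JBoss's implementation; the class name and the `connect` and `is_valid` callables are hypothetical stand-ins chosen for this example.

```python
class ValidatingPool:
    """Toy connection pool that discards invalid connections on checkout."""

    def __init__(self, connect, is_valid):
        self._connect = connect    # hypothetical factory that opens a new connection
        self._is_valid = is_valid  # hypothetical check, e.g. one that runs "SELECT 1"
        self._idle = []

    def acquire(self):
        # Try idle connections first; destroy any that fail validation and
        # log a warning, which is the "noise" seen in the server log.
        while self._idle:
            conn = self._idle.pop()
            if self._is_valid(conn):
                return conn
            print(f"WARN destroying connection that is not valid: {conn!r}")
        # No usable idle connection is left, so open a new one.
        return self._connect()

    def release(self, conn):
        # Return the connection to the idle list for later reuse.
        self._idle.append(conn)


class FakeConnection:
    """Stand-in for a database connection, used only for this demonstration."""

    def __init__(self):
        self.alive = True


pool = ValidatingPool(FakeConnection, lambda conn: conn.alive)
first = pool.acquire()
pool.release(first)
first.alive = False        # simulate the link to the database gear dropping
second = pool.acquire()    # logs a warning and transparently reconnects
```

In this sketch the caller only ever receives a working connection; the dropped one shows up as a warning and nothing else, which is the outcome the comment expects from JBoss.
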
Comment 8 Bill DeCoste 2012-09-13 18:37:40 EDT

Skye,
I'd like to close this bug and open another one for your second issue (rhc app restart failing) if that's alright?

Thanks -Bill
Comment 9 Skye Book 2012-09-13 19:44:50 EDT
(In reply to comment #8)
> Skye,
> I'd like to close this bug and open another one for your second issue (rhc
> app restart failing) if that's alright?
> Thanks -Bill

By all means, please do.  If you could just CC me on the new bug or something that would be great.
