Created attachment 786176
nodejs error log

Description of problem:
After creating a nodejs application (scalable or non-scalable) with a postgresql cartridge (8.4 or 9.2) attached, the postgresql cartridge does not restart properly after a 'stop'. That is, after the cartridge is restarted, accessing the app URL returns a 503 error, but the rhc status of the app is 'running'. I have attached the error logs from the app.

Version-Release number of selected component (if applicable):
devenv_3641 (ami-fee7a197)

How reproducible:
Always

Steps to Reproduce:
1. Create a nodejs app and attach a postgresql cartridge
2. Stop the postgresql cartridge
3. Restart the postgresql cartridge

Actual results:
The cartridge shows "running" but the URL returns a 503 error. In addition, the error "`sh "-c" "node server.js"` failed with 1" was found in the logs.

Expected results:
The cartridge is restarted without issue.

Additional info:
The bare application doesn't seem to have this problem. Does your application try to make a connection to the PostgreSQL server? I can't tell from the logs what is refusing the connection to cause the failure.

$ bx bin/rhc app create foo nodejs-0.6 postgresql-9.2
Application Options
-------------------
Namespace:  fooooooooooo
Cartridges: nodejs-0.6, postgresql-9.2
Gear Size:  default
Scaling:    no

Creating application 'foo' ... done
⋮
⋮
Your application 'foo' is now available.

URL:        http://foo-fooooooooooo.dev.rhcloud.com/
SSH to:     621273939073518031339520.rhcloud.com
Git remote: ssh://621273939073518031339520.rhcloud.com/~/git/foo.git/
Cloned to:  /Users/asari/Development/src/rht/rhc/foo

Run 'rhc show-app foo' for more details about your app.

$ bx bin/rhc cartridge stop postgresql-9.2 -a foo
Stopping postgresql-9.2 ... done

$ bx bin/rhc app restart foo
RESULT:
foo restarted

$ curl -IL http://foo-fooooooooooo.dev.rhcloud.com/
HTTP/1.0 200 OK
Date: Tue, 13 Aug 2013 15:12:31 GMT
X-Powered-By: Express
Content-Type: text/html; charset=UTF-8
Content-Length: 5235
Vary: Accept-Encoding,User-Agent
ProxyTime: D=4707
X-Cache: MISS from file01.intranet.prod.int.rdu2.redhat.com
X-Cache-Lookup: MISS from file01.intranet.prod.int.rdu2.redhat.com:8080
Via: 1.0 file01.intranet.prod.int.rdu2.redhat.com (squid/3.1.10)
Connection: keep-alive
Yes, the application makes a connection to the postgresql server. Please see the code at https://github.com/cjryan/nodejs-bughunting, particularly the postgresql_factory.js, server.js, and package.json files.
To reproduce this error using the application, do:

rhc app create foo nodejs-0.6 postgresql-9.2 --from-code https://github.com/cjryan/nodejs-bughunting.git
rhc cartridge stop postgresql-9.2 -a foo
rhc cartridge restart nodejs-0.6 -a foo
curl -IL http://foo-*/

After this, I see that individual cartridges report 'down' status, but 'rhc app show --state' reports 'up'.

$ bx bin/rhc cartridge status nodejs-0.6 -a foo
RESULT:
Application is not running

$ bx bin/rhc cartridge status postgresql-9.2 -a foo
RESULT:
Postgres is stopped

$ bx bin/rhc app show foo --state
Cartridge nodejs-0.6, postgresql-9.2 is started
After stopping the postgresql cartridge, you need to access '/postgresql' to trigger the error and the 503 status. Thus the more accurate procedure is:

rhc app create foo nodejs-0.6 postgresql-9.2 --from-code \
    https://github.com/cjryan/nodejs-bughunting.git
rhc cartridge stop postgresql-9.2 -a foo
rhc cartridge restart nodejs-0.6 -a foo
curl -IL http://foo-*/postgresql
curl -IL http://foo-*/

Then:

$ bx bin/rhc cartridge status nodejs-0.6 -a foo
RESULT:
Application is not running

$ bx bin/rhc cartridge status postgresql-9.2 -a foo
RESULT:
Postgres is not running

$ bx bin/rhc app show foo --state
Cartridge nodejs-0.6, postgresql-9.2 is started
The CLI ticket is Bug 996713.
'app show --state' uses the broker's response from the REST URL, e.g.,

https://ec2-23-22-236-121.compute-1.amazonaws.com/broker/rest/domains/fooooooooooo/applications/foo/gear_groups

If the primary web framework cartridge dies, as it does here, the broker fails to pick up the correct status of the gear group. (By contrast, if the primary web framework cartridge is stopped correctly, say via 'cartridge stop', the broker updates the status correctly.)

On the other hand, it may be more desirable for 'rhc' to report each cartridge's status for 'app show --state'. (This is Bug 996713.)

I'm sending this to the broker team for further review.
State is what is intended for the application, while status is per cartridge. It is expected that a failing application can have a state of 'started' while the web framework cartridge has a status of 'not running'.

Hiro, on restart is the node.js code pooling or waiting on the database connection?

As a test, stop the application, start the postgres cartridge, then start the node.js cartridge. Is that successful?
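To make the state/status distinction concrete, here is a rough sketch in JavaScript. It is only an illustration: describeApp, the 'started' / 'running' / 'not running' strings, and the object layout are assumptions made for this example and do not reflect the broker's actual data model or REST response.

```javascript
// Hypothetical model, not the broker's real schema.
// intendedState: what the user last asked for ('started' or 'stopped').
// cartridgeStatuses: what each cartridge currently reports, keyed by name.
function describeApp(intendedState, cartridgeStatuses) {
  // Collect the cartridges that are not actually running.
  var notRunning = Object.keys(cartridgeStatuses).filter(function (name) {
    return cartridgeStatuses[name] !== 'running';
  });
  return {
    state: intendedState,
    notRunning: notRunning,
    // A mismatch is the situation in this report: the application is
    // intended to be started, yet at least one cartridge is down.
    mismatch: intendedState === 'started' && notRunning.length > 0
  };
}

var report = describeApp('started', {
  'nodejs-0.6': 'not running',
  'postgresql-9.2': 'not running'
});
console.log(report.state, report.notRunning.join(', '), report.mismatch);
```

Under this reading, a 'started' state alongside 'not running' cartridges is a mismatch between intent and reality rather than a contradiction in the data.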
I'm going to confine the discussion to the example application that Chris created.

When the application is restarted, the node.js process is running; 'cartridge state' returns running. I guess 'waiting' is the closer of the two alternatives given, though nothing really waits. The connection attempt is made only when '/postgresql' is accessed, and then the process dies. Yay for callbacks.

The stop-the-app, start-the-db, start-the-nodejs flow results in:

$ bx bin/rhc app stop foo
RESULT:
foo stopped

$ bx bin/rhc app show foo --state
Cartridge nodejs-0.6, postgresql-9.2 is stopped

$ bx bin/rhc cartridge start postgresql-9.2 -a foo
Starting postgresql-9.2 ... done

$ bx bin/rhc app show foo --state
Cartridge nodejs-0.6, postgresql-9.2 is stopped

$ bx bin/rhc cartridge start nodejs-0.6 -a foo
Starting nodejs-0.6 ... done

$ bx bin/rhc app show foo --state
Cartridge nodejs-0.6, postgresql-9.2 is started

So, at some point, the broker consults the cartridges to figure out that the application's state should be 'started'. It is just not happening when the process dies unexpectedly.
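For illustration, a minimal sketch of how a route handler could survive a refused database connection instead of taking the node process down. This is not the code from the nodejs-bughunting repository: makePostgresqlHandler, refusedConnect, and the respond(status, body) callback are hypothetical names, and the connect function is injected so the sketch runs without a real PostgreSQL driver.

```javascript
// Hypothetical sketch; none of these names come from the example app.

// Build a handler for '/postgresql'. `connect` is any function that takes a
// Node-style callback (err, client).
function makePostgresqlHandler(connect) {
  return function handle(respond) {
    connect(function (err, client) {
      if (err) {
        // Report the failure to the caller. Throwing here instead would
        // leave the exception uncaught when a driver invokes this callback
        // asynchronously, and an uncaught exception terminates the process.
        respond(503, 'database unavailable: ' + err.message);
        return;
      }
      respond(200, 'connected');
    });
  };
}

// Stand-in for a driver talking to a stopped server (an assumption; a real
// driver would call back asynchronously).
function refusedConnect(callback) {
  var err = new Error('connect ECONNREFUSED');
  err.code = 'ECONNREFUSED';
  callback(err);
}

makePostgresqlHandler(refusedConnect)(function (status, body) {
  console.log(status, body);
});
```

With the error handled in the callback, only the request that touches the database gets a 503 while the database is down; the process stays up and other URLs keep working.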
Not sure if this will influence much, but a slight addendum to the procedure. This is how we were originally testing to reproduce this bug:

rhc app create foo nodejs-0.6 postgresql-9.2 --from-code https://github.com/cjryan/nodejs-bughunting.git
rhc cartridge stop postgresql-9.2 -a foo
rhc cartridge restart postgresql-9.2 -a foo
curl -IL http://foo-*/

That is, only the database cartridge is stopped/restarted, not the app.

Thanks!
As this discrepancy is by design and documented, I'm closing this as NOTABUG.