Description of problem: after stopping the postgresql service, the api ping call responds with a pong while the webui correctly raises an error Version-Release number of selected component (if applicable): 5.9.0 How reproducible: all the time Steps to Reproduce: 1. stop postgres on the db / break database connectivity 2. connect to the appliance's ping page (https://appliance/ping) 3. connect to the appliance's api ping (https://appliance/api/ping) Actual results: the two pages do not behave the same way Expected results: the api ping will behave the same way as the older webui ping page Additional info:
The /ping endpoint hits a cache for the server timezone and when that cache expires after 5 minutes or if it was never cached, it tries to fetch the miq_servers row, which fails because the database is down. See [1], [2], [3] I believe both /api/ping and /ping should return PONG if they're accessible regardless of the database connectivity. Therefore, before_actions must work for the ping endpoint even if the db is down. /ping is a "I'm here" probe and [1] https://github.com/ManageIQ/manageiq-ui-classic/blob/0b11ca455aca9806cb772d464bba3cb6e99289db/app/controllers/application_controller.rb#L80 [2] https://github.com/ManageIQ/manageiq-ui-classic/blob/0b11ca455aca9806cb772d464bba3cb6e99289db/app/controllers/application_controller.rb#L889 [3] https://github.com/ManageIQ/manageiq-ui-classic/blob/0b11ca455aca9806cb772d464bba3cb6e99289db/app/controllers/application_controller/timezone.rb#L32 [----] I, [2018-03-08T17:00:19.014556 #2494:3f8740da1e14] INFO -- : Processing by PingController#index as */* [----] F, [2018-03-08T17:00:19.017215 #2494:3f8740da1e14] FATAL -- : Error caught: [PG::Error] invalid encoding name: utf8 /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/postgresql_adapter.rb:685:in `set_client_encoding' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/postgresql_adapter.rb:685:in `configure_connection' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/postgresql_adapter.rb:255:in `reconnect!' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/abstract_adapter.rb:423:in `verify!' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/abstract/connection_pool.rb:778:in `block in checkout_and_verify' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:126:in `call' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:506:in `block (2 levels) in compile' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:455:in `call' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:101:in `__run_callbacks__' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:750:in `_run_checkout_callbacks' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/abstract/connection_pool.rb:777:in `checkout_and_verify' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/abstract/connection_pool.rb:490:in `checkout' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/abstract/connection_pool.rb:364:in `connection' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_adapters/abstract/connection_pool.rb:883:in `retrieve_connection' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_handling.rb:128:in `retrieve_connection' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/connection_handling.rb:91:in `connection' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/core.rb:283:in `cached_find_by_statement' /usr/local/lib/ruby/gems/2.3.0/gems/activerecord-5.0.6/lib/active_record/core.rb:196:in `find_by' /var/www/miq/vmdb/app/models/miq_server.rb:547:in `block in <class:MiqServer>' /usr/local/lib/ruby/gems/2.3.0/gems/more_core_extensions-3.5.0/lib/more_core_extensions/core_ext/module/cache_with_timeout.rb:61:in `block (2 levels) in cache_with_timeout' /usr/local/lib/ruby/2.3.0/sync.rb:234:in `block in sync_synchronize' /usr/local/lib/ruby/2.3.0/sync.rb:231:in `handle_interrupt' /usr/local/lib/ruby/2.3.0/sync.rb:231:in `sync_synchronize' /usr/local/lib/ruby/gems/2.3.0/gems/more_core_extensions-3.5.0/lib/more_core_extensions/core_ext/module/cache_with_timeout.rb:51:in `block in cache_with_timeout' /usr/local/lib/ruby/gems/2.3.0/bundler/gems/manageiq-ui-classic-0928245eca8f/app/controllers/application_controller/timezone.rb:32:in `server_timezone' /usr/local/lib/ruby/gems/2.3.0/bundler/gems/manageiq-ui-classic-0928245eca8f/app/controllers/application_controller.rb:888:in `set_user_time_zone' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:382:in `block in make_lambda' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:169:in `block (2 levels) in halting' /usr/local/lib/ruby/gems/2.3.0/gems/actionpack-5.0.6/lib/abstract_controller/callbacks.rb:12:in `block (2 levels) in <module:Callbacks>' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:170:in `block in halting' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:454:in `block in call' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:454:in `each' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:454:in `call' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:101:in `__run_callbacks__' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:750:in `_run_process_action_callbacks' /usr/local/lib/ruby/gems/2.3.0/gems/activesupport-5.0.6/lib/active_support/callbacks.rb:90:in `run_callbacks' /usr/local/lib/ruby/gems/2.3.0/gems/actionpack-5.0.6/lib/abstract_controller/callbacks.rb:19:in `process_action' /usr/local/lib/ruby/gems/2.3.0/gems/actionpack-5.0.6/lib/action_controller/metal/rescue.rb:20:in `process_action' /usr/local/lib/ruby/gems/2.3.0/gems/actionpack-5.0.6/lib/action_controller/metal/instrumentation.rb:32:in `block in process_action'
Continued... /ping is a "I'm here" probe and shouldn't directly or indirectly testing database connectivity. We can expose an endpoint for that if desired.
Reassigned to UI group to allow /ping to work without database connectivity
So the loadbalancers use the /ping to check if the application is ready to server requests for the user, right? And the UI is not able to serve any requests then it does not have the database. Right? If the loadbalancer checks the /ping and it gets the response it starts directing requests to the app. Right? But if the database is offline we cannot serve any requests. Can we? My conclusion is that the current behavior is the usefull one and changing it will actually make in not usefull. I don't want to make things not useful to match behavior of the API. That does not make any sense to me. Is there a usecase where the behavior that JoeR suggest would actually be useful? Is there any docs on this? p.s. Think of a ticket office that has a sign "Open". A customer comes and wants to buy a ticket. The clerk answers: We are open but we do not provide any services today. We do not sell tickets, we do not return money. But right, as the sign reads: We are "Open".
I can see the usefulness of the current implementation and agree we can decide on a finer grain ping endpoint for subsystems when we need them. Even if /ping and /api/ping differ in their responses in this situation, the problem is that /ping checks the database "sometimes" and ignores it others. As it is now, within the first X minutes you can hit the /ping endpoint (with the db down) and it will succeed with pong until the cache_with_timeout on MiqServer.my_server expires and it tries to issue a query. This is why I assumed it wasn't intentionally testing the database. If need be, I can open a new BZ since the current behavior led me down the wrong path. We can leave this BZ as: /ping to test ui availability (also checks database connectivity) /api/ping to test api worker is accepting connections (does not check db) But anyway, here's how I tested it: 1) bin/rails s 2) curl -k http:/127.0.0.1:3000/ping => pong 3) stop postgresql 4) curl -k http:/127.0.0.1:3000/ping => pong 5) After 5 minutes or so: curl -k http:/127.0.0.1:3000/ping => throw "error"; $("#center_div").html("\u003cdiv id='content' style='overflow: scroll; height: 100%; padding-left: 50px;'\u003e\n\u003ch1 style='color: #c90813'\u003e\nUnexpected error encountered\n\u003c/h1\u003e\n\u003cbr\u003e\n\u003ch2\u003e\nErrors in Management Engine can be caused by:\n\u003c/h2\u003e\n\u003cbr\u003e\n\u003cfieldset\u003e\n\u003cfont style='font-size: 116%;color: black'\u003e\u0026#x2460;\u003c/font\u003e\nAccessing Management Engine from multiple tabs ...
David is taking this one right after he's back from Bath. I will help if needed. The plan is to get this resolved by the start of the next week.
https://github.com/ManageIQ/manageiq-ui-classic/pull/3680
New commit detected on ManageIQ/manageiq-ui-classic/master: https://github.com/ManageIQ/manageiq-ui-classic/commit/5a7fb6e883331e86b93625c2eb8ae01fd136f79e commit 5a7fb6e883331e86b93625c2eb8ae01fd136f79e Author: Dávid Halász <dhalasz> AuthorDate: Mon Mar 26 08:10:44 2018 -0400 Commit: Dávid Halász <dhalasz> CommitDate: Mon Mar 26 08:10:44 2018 -0400 Fail with an error on /ping immediately if DB is not accessible Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1553223 app/controllers/ping_controller.rb | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-)
Note, for clarification, we should document that the rest api and ui both have ping endpoints for different use cases: rest api: /api/ping ui: /ping The rest api ping, /api/ping, is only testing if the rest api will accept connections. The ui ping, /ping, will also test database connectivity in addition to a simple normal UI request. This endpoint should be used for testing if the UI is "up".
Hi Ievgen, Sorry for the lack of clarity. I have updated the title of this bugzilla to reflect the change fixed in this bugzilla: Due to caching, the UI /ping endpoint is inconsistent: first passes, then fails after 5 minutes... while the database is down. In comment #11, there is still a difference between the rest api and ui ping endpoint as they have different use cases and we are NOT making these two endpoints (ui and api's ping) the same. See that comment for details. For testing this fix made in this bugzilla: From comment #7: "As it is now, within the first X minutes you can hit the /ping endpoint (with the db down) and it will succeed with pong until the cache_with_timeout on MiqServer.my_server expires and it tries to issue a query" Here's how I tested it: 1) bin/rails s 2) curl -k http:/127.0.0.1:3000/ping => pong 3) stop postgresql 4) curl -k http:/127.0.0.1:3000/ping => pong 5) After 5 minutes or so: curl -k http:/127.0.0.1:3000/ping => throw "error"; $("#center_div").html("\u003cdiv id='content' style='overflow: scroll; height: 100%; padding-left: 50px;'\u003e\n\u003ch1 style='color: #c90813'\u003e\nUnexpected error encountered\n\u003c/h1\u003e\n\u003cbr\u003e\n\u003ch2\u003e\nErrors in Management Engine can be caused by:\n\u003c/h2\u003e\n\u003cbr\u003e\n\u003cfieldset\u003e\n\u003cfont style='font-size: 116%;color: black'\u003e\u0026#x2460;\u003c/font\u003e\nAccessing Management Engine from multiple tabs ... "
Hi Joe, Thanks for update. Verified in 5.10.0.32