Description of problem:
When the CloudForms appliance base version is 4.0 and it is upgraded to 4.1, then 4.2, and then 4.5, Embedded Ansible fails to start.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Deploy 4.0 appliance

Steps to Reproduce:
1. upgrade to 4.1
2. upgrade to 4.2
3. upgrade to 4.5, and enable the Embedded Ansible server role.

Actual results:
The worker is stuck in the "creating" status:

EmbeddedAnsibleWorker | creating | 1000000637582 | | | 1000000000001 | |

Or

[----] E, [2017-08-18T16:17:43.712452 #1621:933140] ERROR -- : MIQ(MiqServer#validate_worker) Worker [EmbeddedAnsibleWorker] with ID: [1000000000056], PID: [], GUID: [77453902-844f-11e7-98a3-52540052e4bf] has not responded in 1214.103728388 seconds, restarting worker

Expected results:

Additional info:
The issue here was the database connection pool in vmdb/config/database.yml.

A workaround for this issue should be to increase the production "pool:" value to 5 from 1 on all evmserver appliances.

 production:
   adapter: postgresql
   encoding: utf8
   username: root
-  pool: 1
+  pool: 5
   wait_timeout: 5
   min_messages: warning
   database: vmdb_production

This value was changed after 5.5 and no mention was made in upgrade documentation.

The change [1] was not made with Embedded Ansible in mind at the time, but it is necessary for it to function correctly as the setup runs in a thread in the main server process. If there is only one connection in the pool for each process, the thread running the embedded ansible setup cannot connect to the database and fails.

[1] https://github.com/ManageIQ/manageiq/pull/6786
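To illustrate why a pool of one starves the setup thread, here is a minimal Ruby sketch. It is a toy model only: ToyPool and setup_thread_gets_connection? are hypothetical names made up for this example and are not ActiveRecord's connection pool or any ManageIQ code. It assumes the main server thread is already holding a connection when the setup thread asks for one.

```ruby
require "thread"

# Hypothetical toy pool for illustration; NOT ActiveRecord's ConnectionPool.
class ToyPool
  def initialize(size)
    @free = Queue.new
    size.times { |i| @free << "conn-#{i}" }
  end

  # Non-blocking checkout: returns nil when every connection is in use.
  def checkout
    @free.pop(true)
  rescue ThreadError
    nil
  end

  def checkin(conn)
    @free << conn
  end
end

# Assumption: the server thread holds one connection while a second
# thread (standing in for the embedded ansible setup) asks for its own.
def setup_thread_gets_connection?(pool_size)
  pool = ToyPool.new(pool_size)
  server_conn = pool.checkout
  got_one = Thread.new { !pool.checkout.nil? }.value
  pool.checkin(server_conn)
  got_one
end
```

With a pool size of 1 the second thread comes back empty-handed; with 5 it gets a connection. The real pool blocks and times out rather than returning nil, but the starvation is the same.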
This should be added to the documentation for migrating from 5.5 (4.0) - https://access.redhat.com/articles/2297391. Changing the component to documentation.
Taking this back as we will need to track a fix (and test) using this bug. We should still update the docs though. Should I open a separate bug for that or just mention it in the doc text in this one?
https://github.com/ManageIQ/manageiq/pull/16477
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/a3f50f1541e710a093b21df4051a8c6d411e14e2

commit a3f50f1541e710a093b21df4051a8c6d411e14e2
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue Nov 14 18:05:11 2017 -0500
Commit:     Nick Carboni <ncarboni>
CommitDate: Tue Nov 14 18:10:22 2017 -0500

    Add a connection to the pool if there is only one in the EmbeddedAnsible worker

    Before https://github.com/ManageIQ/manageiq/pull/6786 we used to
    have one connection specified in the connection pool in database.yml

    Installations which have been upgraded from before this change still
    have that one connection pool in place.

    The EmbeddedAnsible worker uses a thread rather than a new process
    so it shares the connection pool with the server. When the pool is
    set to only contain one connection, the EmbeddedAnsible worker
    will not be able to start.

    This commit adds a connection to the pool in the same way that we
    do for workers which specify a specific connection pool size in
    their settings:
    https://github.com/ManageIQ/manageiq/blob/f6f7120749d16fd7825f83001dfd875cdecb903c/app/models/miq_worker/runner.rb#L71-L78

    https://bugzilla.redhat.com/show_bug.cgi?id=1484150

 app/models/embedded_ansible_worker.rb       | 11 +++++++++++
 spec/models/embedded_ansible_worker_spec.rb | 15 ++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)
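The idea in the commit message can be sketched roughly as follows. This is not the code from the commit: ensure_spare_connection is a hypothetical helper that works on a plain Hash shaped like the production section of database.yml, whereas the actual change adjusts the live connection pool inside the worker.

```ruby
# Hypothetical helper, not the code from the commit above.
# Assumption: db_config is a Hash shaped like the "production"
# section of database.yml, with a string "pool" key.
def ensure_spare_connection(db_config)
  # Only touch configs that still carry the old single-connection pool;
  # a missing key (nil.to_i == 0) or a larger pool is left as-is.
  return db_config unless db_config["pool"].to_i == 1

  # Add one connection so a thread sharing the pool can get its own.
  db_config.merge("pool" => 2)
end
```

Returning a new Hash rather than mutating the argument keeps the original settings available for comparison or logging.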
For QE: To test this, set the pool size in database.yml to 1, restart evmserverd for the change to take effect, and then ensure that the embedded ansible role works properly (the worker should not be stuck in the "creating" state).
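For the first QE step, a small Ruby sketch for rewriting the pool value is shown here. with_pool_size is a hypothetical helper written for this example. It assumes a flat database.yml like the one quoted earlier in this bug (no YAML anchors or aliases), and round-tripping through YAML drops any comments in the file, so editing the file by hand is just as valid.

```ruby
require "yaml"

# Hypothetical helper: returns the database.yml text with the pool size
# for one environment replaced. Assumes a flat file without YAML aliases.
def with_pool_size(yaml_text, size, env = "production")
  config = YAML.safe_load(yaml_text)
  # fetch raises KeyError if the environment section is missing.
  config.fetch(env)["pool"] = size
  config.to_yaml
end
```

The helper takes and returns text, so the caller decides which file to read and write; restarting evmserverd afterwards is still a separate manual step.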
Verified in 5.10.0.1