Bug 1484150 - Embedded ansible fails to start. Can't create credentials or add repositories.
Summary: Embedded ansible fails to start. Can't create credentials or add repositories.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: GA
Target Release: 5.10.0
Assignee: Nick Carboni
QA Contact: luke couzens
URL:
Whiteboard: ansible_embed:black:migration:upgrade
Depends On:
Blocks: 1513631 1514139
Reported: 2017-08-22 20:46 UTC by Saif Ali
Modified: 2021-03-11 15:38 UTC (History)

Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1513631 1514139 (view as bug list)
Environment:
Last Closed: 2019-02-11 13:54:29 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:



Description Saif Ali 2017-08-22 20:46:30 UTC
Description of problem:
When a CloudForms appliance originally deployed at version 4.0 is upgraded to 4.1, then to 4.2, and then to 4.5, Embedded Ansible fails to start.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
Deploy 4.0 appliance

Steps to Reproduce:
1. Upgrade to 4.1.
2. Upgrade to 4.2.
3. Upgrade to 4.5 and enable the Embedded Ansible server role.

Actual results:
The worker is stuck in the "creating" status:
EmbeddedAnsibleWorker                                             | creating | 1000000637582 |      |      | 1000000000001 |                         |                      

Or 

[----] E, [2017-08-18T16:17:43.712452 #1621:933140] ERROR -- : MIQ(MiqServer#validate_worker) Worker [EmbeddedAnsibleWorker] with ID: [1000000000056], PID: [], GUID: [77453902-844f-11e7-98a3-52540052e4bf] has not responded in 1214.103728388 seconds, restarting worker


Expected results:


Additional info:

Comment 6 Nick Carboni 2017-11-14 21:16:30 UTC
The issue here was the database connection pool in vmdb/config/database.yml

A workaround for this issue should be to increase the production "pool:" value from 1 to 5 on all evmserver appliances.

production:
  adapter: postgresql
  encoding: utf8
  username: root
- pool: 1
+ pool: 5
  wait_timeout: 5
  min_messages: warning
  database: vmdb_production

This value was changed after 5.5 and no mention was made in upgrade documentation.

The change [1] was not made with Embedded Ansible in mind at the time, but it is necessary for it to function correctly, because the setup runs in a thread in the main server process. If there is only one connection in the pool for each process, the thread running the embedded ansible setup cannot connect to the database and fails.

[1] https://github.com/ManageIQ/manageiq/pull/6786
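
To spot appliances that still carry the old setting, something like the following could be used. This is only a sketch: pool_size and pool_too_small? are hypothetical helpers, not part of ManageIQ, and the minimum of 2 is an assumption drawn from the explanation above (one connection for the server plus one for the setup thread). It parses an inline copy of the config shown above; a real database.yml that uses ERB or YAML aliases would need different loading.

```ruby
require "yaml"

# Hypothetical helper (not part of ManageIQ): returns the configured pool
# size for one environment of a database.yml document, or nil if unset.
def pool_size(yaml_text, env = "production")
  config = YAML.safe_load(yaml_text) || {}
  config.dig(env, "pool")
end

# Assumption: the setup thread needs a connection in addition to the one
# the server process uses, so any explicit value below 2 is flagged.
def pool_too_small?(yaml_text, env = "production", minimum = 2)
  size = pool_size(yaml_text, env)
  !size.nil? && size < minimum
end

# Inline copy of the pre-workaround config from this comment.
UPGRADED_CONFIG = <<~YAML
  production:
    adapter: postgresql
    encoding: utf8
    username: root
    pool: 1
    wait_timeout: 5
    min_messages: warning
    database: vmdb_production
YAML

puts pool_too_small?(UPGRADED_CONFIG)
```

An unset "pool:" key is deliberately not flagged here, since only an explicit value of 1 is known from this bug to cause the problem.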

Comment 7 Nick Carboni 2017-11-14 21:51:15 UTC
This should be added to the documentation for migrating from 5.5 (4.0) - https://access.redhat.com/articles/2297391.

Changing the component to documentation.

Comment 8 Nick Carboni 2017-11-14 23:12:49 UTC
Taking this back as we will need to track a fix (and test) using this bug.

We should still update the docs though. Should I open a separate bug for that or just mention it in the doc text in this one?

Comment 10 CFME Bot 2017-11-15 14:11:39 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/a3f50f1541e710a093b21df4051a8c6d411e14e2

commit a3f50f1541e710a093b21df4051a8c6d411e14e2
Author:     Nick Carboni <ncarboni>
AuthorDate: Tue Nov 14 18:05:11 2017 -0500
Commit:     Nick Carboni <ncarboni>
CommitDate: Tue Nov 14 18:10:22 2017 -0500

    Add a connection to the pool if there is only one in the EmbeddedAnsible worker
    
    Before https://github.com/ManageIQ/manageiq/pull/6786 we used to
    have one connection specified in the connection pool in database.yml
    
    Installations which have been upgraded from before this change
    still have that one connection pool in place.
    
    The EmbeddedAnsible worker uses a thread rather than a new process
    so it shares the connection pool with the server. When the pool
    is set to only contain one connection, the EmbeddedAnsible worker
    will not be able to start.
    
    This commit adds a connection to the pool in the same way that we do
    for workers which specify a specific connection pool size in their settings:
    https://github.com/ManageIQ/manageiq/blob/f6f7120749d16fd7825f83001dfd875cdecb903c/app/models/miq_worker/runner.rb#L71-L78
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1484150

 app/models/embedded_ansible_worker.rb       | 11 +++++++++++
 spec/models/embedded_ansible_worker_spec.rb | 15 ++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)
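
The failure mode described in the commit message can be illustrated with a toy model. This is a sketch only: TinyPool is a hypothetical stand-in, not ActiveRecord's connection pool and not the code in the commit, and it returns nil on exhaustion where a real pool would typically wait and then time out.

```ruby
# Toy fixed-size pool, for illustration only (not ActiveRecord's implementation).
class TinyPool
  def initialize(size)
    @free = Queue.new
    size.times { |i| @free << "conn-#{i}" }
  end

  # Non-blocking checkout: returns nil when every connection is in use.
  def checkout
    @free.pop(true)
  rescue ThreadError
    nil
  end

  def checkin(conn)
    @free << conn
  end
end

# Simulates a server process that holds one connection while a setup thread
# in the same process asks the shared pool for another.
def setup_thread_gets_connection?(pool_size)
  pool = TinyPool.new(pool_size)
  server_conn = pool.checkout
  thread_conn = Thread.new { pool.checkout }.value
  pool.checkin(thread_conn) if thread_conn
  pool.checkin(server_conn)
  !thread_conn.nil?
end

puts setup_thread_gets_connection?(1)
puts setup_thread_gets_connection?(2)
```

With a pool of one, the thread finds nothing left to check out. Adding a single extra connection is enough for it to proceed, which is the idea behind growing the pool by one for the EmbeddedAnsible worker.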

Comment 11 Nick Carboni 2017-11-15 14:25:55 UTC
For QE: To test this, set the pool size in database.yml to 1, restart evmserverd for the change to take effect, and then ensure that the embedded ansible role works properly (the worker should not be stuck in the "creating" state).

Comment 18 luke couzens 2018-06-20 19:26:34 UTC
Verified in 5.10.0.1

