Bug 1465590

Summary: Embedded ansible role fails to re-initialize after webui update
Product: Red Hat CloudForms Management Engine Reporter: luke couzens <lcouzens>
Component: Appliance Assignee: Nick Carboni <ncarboni>
Status: CLOSED CURRENTRELEASE QA Contact: luke couzens <lcouzens>
Severity: high Docs Contact:
Priority: unspecified    
Version: 5.8.0 CC: abellott, cpelland, jhardy, obarenbo
Target Milestone: GA Keywords: TestOnly, ZStream
Target Release: 5.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: ansible_embed:black
Fixed In Version: 5.9.0.1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1466855 (view as bug list) Environment:
Last Closed: 2018-03-06 14:48:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: CFME Core Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1466855    

Description luke couzens 2017-06-27 17:18:25 UTC
Description of problem: Embedded ansible role fails to re-initialize after webui update


Version-Release number of selected component (if applicable): 5.8.1.0


How reproducible: 100%


Steps to Reproduce:
1. Provision a 5.8.0.17 appliance.
2. Enable embedded ansible.
3. Add an ansible repo to confirm it's fully enabled.
4. Navigate to Configuration > Settings > Region > Red Hat Updates.
5. Edit registration.
6. Set up RHSM.
7. Validate and save settings.
8. Add the update repo file to /etc/yum.repos.d/.
9. Check for updates in the UI.
10. Apply the 5.8.1.0 update.
11. Check notifications that embedded ansible gets enabled.

Actual results: Notifications are flooded, and evm.log fills with:

[----] I, [2017-06-27T13:15:11.758650 #19283:39d12c]  INFO -- : MIQ(MiqQueue#delivered) Message id: [767], State: [ok], Delivered in [0.607767515] seconds
[----] I, [2017-06-27T13:15:13.036533 #16264:ca239e0]  INFO -- : MIQ(EmbeddedAnsible.start) Waiting for EmbeddedAnsible to respond
[----] I, [2017-06-27T13:15:14.207771 #16264:ca239e0]  INFO -- : MIQ(EmbeddedAnsible.start) Waiting for EmbeddedAnsible to respond
[----] I, [2017-06-27T13:15:15.282358 #16264:ca239e0]  INFO -- : MIQ(EmbeddedAnsible.start) Waiting for EmbeddedAnsible to respond
[----] I, [2017-06-27T13:15:16.352560 #16264:ca239e0]  INFO -- : MIQ(EmbeddedAnsible.start) Waiting for EmbeddedAnsible to respond
[----] E, [2017-06-27T13:15:17.353559 #16264:ca239e0] ERROR -- : [RuntimeError]: EmbeddedAnsible service is not responding after setup  Method:[rescue in do_before_work_loop]
[----] E, [2017-06-27T13:15:17.353724 #16264:ca239e0] ERROR -- : /var/www/miq/vmdb/lib/embedded_ansible.rb:61:in `start'
/var/www/miq/vmdb/app/models/embedded_ansible_worker/runner.rb:36:in `setup_ansible'
/var/www/miq/vmdb/app/models/embedded_ansible_worker/runner.rb:13:in `do_before_work_loop'
/var/www/miq/vmdb/app/models/embedded_ansible_worker/runner.rb:7:in `prepare'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:133:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:21:in `start_worker'
/var/www/miq/vmdb/app/models/embedded_ansible_worker.rb:10:in `block in start_runner'



Expected results: Embedded ansible re-enables correctly after a z-stream update.


Additional info:

I also found that I ran out of disk space when initially trying to run the update. I used the following commands to get around this.

yum -y install xfsdump
xfsdump -F -f /tmp/repo /repo
umount /repo
lvremove -f /dev/VG-CFME/lv_repo
lvcreate --yes -L 1GB -n lv_repo VG-CFME
mkfs.xfs /dev/VG-CFME/lv_repo
mount /dev/VG-CFME/lv_repo /repo
xfsrestore -f /tmp/repo /repo
lvextend --resizefs --size +9GB /dev/VG-CFME/lv_var

Comment 3 Nick Carboni 2017-06-29 19:44:37 UTC
This is happening because we need to run the ansible-tower setup playbook whenever the installed Tower version changes.

This is evidenced by the following error in the /var/log/supervisor/awx-daphne.log:

2017-06-29 15:08:38,089 ERROR    Missing or incorrect metadata for Tower version.  Ensure Tower was installed using the setup playbook.
Traceback (most recent call last):
  File "/var/lib/awx/venv/tower/bin/daphne", line 9, in <module>
    load_entry_point('daphne==0.15.0', 'console_scripts', 'daphne')()
  File "/var/lib/awx/venv/tower/lib/python2.7/site-packages/daphne/cli.py", line 105, in entrypoint
    cls().run(sys.argv[1:])
  File "/var/lib/awx/venv/tower/lib/python2.7/site-packages/daphne/cli.py", line 135, in run
    channel_layer = importlib.import_module(module_path)
  File "/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/lib/python2.7/site-packages/awx/asgi.py", line 31, in <module>
Exception: Missing or incorrect metadata for Tower version.  Ensure Tower was installed using the setup playbook.

We removed the setup playbook run from the worker startup in https://github.com/ManageIQ/manageiq/pull/15225 as a fix for bug 1455063.

The exception above is raised because the installed version does not match the contents of the /var/lib/awx/.tower_version file.

My proposed fix is to implement a similar check in our code, so that when the versions differ we run the setup playbook rather than starting the services directly.
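
A minimal sketch of the kind of check described in this comment, not the code that was actually committed. The method names (tower_upgrade_detected?, start_action) are hypothetical, and how the installed version is obtained is left to the caller because this report does not say. Treating a missing version file as "setup needed" is also an assumption. Only the /var/lib/awx/.tower_version path comes from this bug.

```ruby
# Path named in this bug report; everything else below is illustrative.
TOWER_VERSION_FILE = "/var/lib/awx/.tower_version".freeze

# Hypothetical helper: true when the version recorded by the last setup
# playbook run is missing or differs from the currently installed version.
def tower_upgrade_detected?(installed_version, version_file = TOWER_VERSION_FILE)
  # Assumption: no version file means setup has not completed yet.
  return true unless File.exist?(version_file)

  File.read(version_file).strip != installed_version.to_s.strip
end

# Hypothetical dispatcher: choose between re-running the setup playbook
# and starting the services directly.
def start_action(installed_version, version_file = TOWER_VERSION_FILE)
  if tower_upgrade_detected?(installed_version, version_file)
    :run_setup_playbook
  else
    :start_services
  end
end
```

The worker would call something like start_action with the installed version on startup and only start the services directly when it returns :start_services. Keeping the comparison in a small predicate makes it easy to cover in a spec with a temporary file in place of the real path.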

Comment 5 CFME Bot 2017-06-30 14:31:50 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/72320a3dfffb2196541b90d720d811f2c706f00f

commit 72320a3dfffb2196541b90d720d811f2c706f00f
Author:     Nick Carboni <ncarboni>
AuthorDate: Thu Jun 29 17:23:21 2017 -0400
Commit:     Nick Carboni <ncarboni>
CommitDate: Thu Jun 29 17:23:21 2017 -0400

    Run the setup playbook if we see that an upgrade has happened
    
    The playbook needs to run on upgrades and ansible tower will refuse
    to start if it isn't.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1465590

 lib/embedded_ansible.rb           | 17 ++++++++++++++++-
 spec/lib/embedded_ansible_spec.rb | 34 +++++++++++++++++++++++++++++++++-
 2 files changed, 49 insertions(+), 2 deletions(-)

Comment 7 luke couzens 2017-11-28 11:40:50 UTC
Verified in 5.9.0.11