Bug 1594704

Summary: The remote_ws_url value does not failover if the appliance is stopped, so "api_url" can be incorrect in an Ansible playbook
Product: Red Hat CloudForms Management Engine Reporter: Peter McGowan <pmcgowan>
Component: Appliance Assignee: Brandon Dunne <bdunne>
Status: CLOSED CURRENTRELEASE QA Contact: Satyajit Bulage <sbulage>
Severity: high Docs Contact:
Priority: high    
Version: 5.9.0 CC: abellott, cpelland, gtanzill, jprause, obarenbo, simaishi, smallamp
Target Milestone: GA Keywords: TestOnly, ZStream
Target Release: 5.10.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 5.10.0.4 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1607409 (view as bug list) Environment:
Last Closed: 2019-02-11 14:05:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: CFME Core Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1607409    

Description Peter McGowan 2018-06-25 09:01:45 UTC
Description of problem:
When an embedded Ansible playbook runs, it can access a variable "api_url" to connect back to an appliance running the web services role. This value seems to be populated from the expression:

MiqRegion.my_region.remote_ws_url

which is populated from the value of:

MiqServer.in_region(n).find_by(:has_active_webservices => true)

However, this value can point to an appliance that is shut down or unreachable, in which case any Ansible playbook referencing the variable will fail. It seems that when an appliance is shut down or becomes unreachable, it is not removed from the list of candidate servers with :has_active_webservices => true.
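
The suspected behaviour can be illustrated outside of ManageIQ with a minimal plain-Ruby sketch. The Server struct, the naive_ws_server helper, and the sample IP addresses are hypothetical stand-ins, not the actual MiqServer model or query; the sketch only assumes that the lookup takes the first server flagged with has_active_webservices without checking whether it is still alive.

```ruby
# Hypothetical stand-in for MiqServer rows (not the ManageIQ model).
Server = Struct.new(:ipaddress, :has_active_webservices, :last_heartbeat)

# Mimics a lookup like find_by(:has_active_webservices => true):
# the first flagged server wins, with no liveness check.
def naive_ws_server(servers)
  servers.find(&:has_active_webservices)
end

now = Time.now.utc
servers = [
  Server.new("10.0.0.1", true, now - 3600), # stopped an hour ago, flag never cleared
  Server.new("10.0.0.2", true, now - 5)     # still heartbeating
]

puts naive_ws_server(servers).ipaddress # prints 10.0.0.1, the stopped appliance
```

In this sketch the stopped appliance is still selected, which matches the symptom described above: api_url would keep pointing at a server that no longer answers.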

Version-Release number of selected component (if applicable):
5.9.2.4

How reproducible:
Every time

Steps to Reproduce:
1. In a multi-appliance region, in a Rails console on one appliance, run MiqRegion.my_region.remote_ws_url and note the IP address returned.
2. Shut down the server with the returned IP address (or stop evmserverd on it, or power it off).
3. Re-run MiqRegion.my_region.remote_ws_url from another appliance.

Actual results:
The IP address doesn't change even though the appliance is now unavailable.

Expected results:
The IP address should fail over to a currently accessible appliance running the web services role.

Additional info:

Comment 4 CFME Bot 2018-07-10 01:07:01 UTC
New commits detected on ManageIQ/manageiq/master:

https://github.com/ManageIQ/manageiq/commit/4988e7290a48f35d5821527009c388b7a2bb5b60
commit 4988e7290a48f35d5821527009c388b7a2bb5b60
Author:     Brandon Dunne <brandondunne>
AuthorDate: Thu Jul  5 11:21:54 2018 -0400
Commit:     Brandon Dunne <brandondunne>
CommitDate: Thu Jul  5 11:21:54 2018 -0400

    Add a scope for MiqServer.recently_active

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1594704

 app/models/miq_server.rb | 1 +
 1 file changed, 1 insertion(+)


https://github.com/ManageIQ/manageiq/commit/b0ab37bc3e909d8915181cfdf476a1c2d5183614
commit b0ab37bc3e909d8915181cfdf476a1c2d5183614
Author:     Brandon Dunne <brandondunne>
AuthorDate: Thu Jul  5 11:25:51 2018 -0400
Commit:     Brandon Dunne <brandondunne>
CommitDate: Thu Jul  5 11:25:51 2018 -0400

    Scope remote ui and api server searches to recently active servers

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1594704

 app/models/miq_region.rb | 4 +-
 spec/models/miq_region_spec.rb | 92 +-
 2 files changed, 71 insertions(+), 25 deletions(-)
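
The idea behind the two commits (a recently_active scope, then restricting the remote UI and API server lookup to it) can be sketched in plain Ruby as follows. This is not the committed code: the Server struct, the helper methods, and the 10-minute HEARTBEAT_WINDOW are assumptions made for illustration, and the threshold used by the real scope is not stated in this report.

```ruby
# Hypothetical stand-in for MiqServer rows (not the ManageIQ model).
Server = Struct.new(:ipaddress, :has_active_webservices, :last_heartbeat)

# Assumed freshness window in seconds; the real scope's value is not shown here.
HEARTBEAT_WINDOW = 10 * 60

# Keep only servers whose last heartbeat falls inside the window.
def recently_active(servers, now: Time.now.utc)
  servers.select do |s|
    s.last_heartbeat && (now - s.last_heartbeat) <= HEARTBEAT_WINDOW
  end
end

# Pick a web services server from the recently active ones only.
def remote_ws_server(servers, now: Time.now.utc)
  recently_active(servers, now: now).find(&:has_active_webservices)
end

now = Time.now.utc
servers = [
  Server.new("10.0.0.1", true, now - 3600), # stopped an hour ago
  Server.new("10.0.0.2", true, now - 5)     # still heartbeating
]

puts remote_ws_server(servers, now: now).ipaddress # prints 10.0.0.2
```

Under these assumptions the stopped appliance drops out once its heartbeat ages past the window, so the lookup only fails over after that delay has elapsed.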

Comment 6 Satyajit Bulage 2018-11-20 17:36:23 UTC
Verification Steps:

1. Created a multi-appliance setup with 3 appliances and enabled the failover service.

2. Executed the commands from the BZ description.
3. After executing "MiqServer.in_region(n).find_by(:has_active_webservices => true)" and "MiqRegion.my_region.remote_ws_url", the IP address returned is that of an available appliance, not the appliance on which evmserverd was stopped. (Results may vary depending on heartbeat timing.)

Verified Version: 5.10.0.23.20181106165157_92dd189