Bug 1573903

Summary: ansible provider token invalid when using ansible playbook and pulling data from another appliance than the ansible enabled one
Product: Red Hat CloudForms Management Engine Reporter: Felix Dewaleyne <fdewaley>
Component: Appliance Assignee: abellott
Status: CLOSED NOTABUG QA Contact: Dave Johnson <dajohnso>
Severity: high Docs Contact:
Priority: high    
Version: 5.9.0 CC: abellott, ahoness, cpelland, fdewaley, gblomqui, gekis, greartes, gtanzill, jfrey, jhardy, obarenbo
Target Milestone: GA   
Target Release: 5.9.3   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-05-04 13:17:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
additional debugging for api token authentication against a sql store none

Description Felix Dewaleyne 2018-05-02 13:31:29 UTC
Description of problem:
ansible provider token invalid when using ansible playbook and pulling data from another appliance than the ansible enabled one 

Version-Release number of selected component (if applicable):
5.9.1

How reproducible:
all the time

Steps to Reproduce:
1. set up playbook with https://github.com/mkanoor/playbook/blob/master/set_automate_retry.yml
2. execute the playbook 

Actual results:
If the workspace is pulled from an appliance other than the Ansible one, an invalid token error is raised.
If the workspace is pulled from the Ansible appliance, no error is raised.

Expected results:
The token is valid regardless of which appliance the workspace is pulled from.

Additional info:
The playbook is always executed on the appliance that has the Ansible Embedded role enabled, so the role works fine.
Both appliances have the Web Services and Web Socket roles enabled.

Comment 2 Greg Blomquist 2018-05-02 14:41:42 UTC
Felix, can you capture logs from the customer?

Comment 3 Greg Blomquist 2018-05-02 14:45:57 UTC
Felix, we're entertaining a number of possibilities that might be causing problems.

One possible issue is NTP configuration. Can you make sure that all of the customer's CF appliances are correctly configured with NTP and have the correct time, and that the appliances' server times match the Tower server time?

Comment 5 Gregg Tanzillo 2018-05-02 15:05:51 UTC
In order for a token that is created on one server to work on another server, the server/session_store setting needs to be "sql"; the default is "cache". Please verify this setting with the customer. They'll find it in advanced settings under

:server:
  :session_store:

Also, there is a 10 minute timeout on the auth token. If the playbook is long running and the token is not used until after 10 minutes have passed, it will also fail. This is configurable too.
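
As a rough illustration of the two conditions above (not the actual CloudForms implementation), the following Python sketch takes a hypothetical mapping of appliance names to their advanced settings and lists any appliance that is not on the "sql" session store. It also tests whether a token is still inside an assumed 10 minute window. The function names, the dict layout and the appliance names are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumed token lifetime, taken from the 10 minute timeout mentioned above.
TOKEN_TTL = timedelta(minutes=10)


def appliances_not_using_sql(settings_by_appliance):
    """Return the sorted names of appliances whose session_store is not "sql".

    settings_by_appliance is a hypothetical dict of name -> advanced settings
    as nested dicts, e.g. {"server": {"session_store": "sql"}}.
    A missing value is treated as the default, "cache".
    """
    return sorted(
        name
        for name, settings in settings_by_appliance.items()
        if settings.get("server", {}).get("session_store", "cache") != "sql"
    )


def token_expired(issued_at, now, ttl=TOKEN_TTL):
    """Return True if more than ttl has elapsed since the token was issued."""
    return now - issued_at > ttl


if __name__ == "__main__":
    # Hypothetical appliance names; only one of them has been switched to sql.
    settings = {
        "cfme-ui": {"server": {"session_store": "sql"}},
        "cfme-ansible": {"server": {"session_store": "cache"}},
    }
    print(appliances_not_using_sql(settings))  # prints ['cfme-ansible']

    issued = datetime(2018, 5, 2, 15, 0, tzinfo=timezone.utc)
    print(token_expired(issued, issued + timedelta(minutes=11)))  # prints True
```

An empty list from appliances_not_using_sql means every appliance in the mapping is on the sql store.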

Comment 6 Felix Dewaleyne 2018-05-02 18:58:46 UTC
(In reply to Gregg Tanzillo from comment #5)
> In order for a token that is created on one server to work on another
> server, the server/session_store needs to be "sql" the default is "cache".
> Please verify this setting with the customer. They'll find it in advanced
> settings under
> 
> :server:
>   :session_store:
> 
> Also, there is a 10 minute timeout on the auth token. If the playbook is
> long running and the token is not used until after 10 minutes has past, it
> will also fail. This is configurable too.

The customer changed the setting from cache to sql, to no avail.

The same time server is used across the whole environment, but some appliances were not set to the right timezone. They are correcting that and will then come back to us.

Comment 9 abellott 2018-05-03 19:20:03 UTC
We have tried to recreate the issue here with the ansible playbook but all worked well for us.

- We had two appliances configured to the same DB
- both having Server.session_store set to sql
- Started the playbook on either appliance
- Used the manageiq.api_token with an API request successfully on both appliances
- Tried with different time zones on the 2 appliances
  - EST/EST
  - Europe/Copenhagen / EST
  - Europe/Copenhagen / Europe/Copenhagen
  - Europe/Copenhagen / Asia
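
For anyone repeating the cross-appliance check above outside of a playbook, here is a minimal Python sketch that builds (but does not send) the same kind of token-authenticated request for each appliance. It assumes the token is passed in the X-Auth-Token header and that the API entry point is /api. The hostnames and the token value are placeholders, and the function names are made up for this example.

```python
import urllib.request


def build_token_check(api_base_url, api_token):
    """Build a GET request to the assumed /api entry point, authenticated by token.

    The request is only constructed here. Sending it (for example with
    urllib.request.urlopen) is left to the caller.
    """
    url = api_base_url.rstrip("/") + "/api"
    request = urllib.request.Request(url, method="GET")
    request.add_header("X-Auth-Token", api_token)
    request.add_header("Accept", "application/json")
    return request


def build_checks(appliance_urls, api_token):
    """Build one token check per appliance so the same token is tried on each."""
    return {url: build_token_check(url, api_token) for url in appliance_urls}


if __name__ == "__main__":
    # Placeholder hostnames and token, not taken from the customer environment.
    checks = build_checks(
        ["https://appliance-a.example.com", "https://appliance-b.example.com"],
        "token-from-manageiq.api_token",
    )
    for base_url, request in checks.items():
        print(base_url, "->", request.get_method(), request.full_url)
```

If the token is accepted by one appliance and rejected by the other, the session_store setting on the rejecting appliance is the first thing to check.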

We are providing here an initializer that would provide additional logging in the api.log regarding the cause of the failed authentication.

Please create /var/www/miq/vmdb/config/initializers/api_auth_token_debug.rb
on both appliances from the attached file, then restart both appliances.

Rerun the playbook, then provide all logs (especially the api.log) from both the automate and API servers.

Comment 10 abellott 2018-05-03 19:21:52 UTC
Created attachment 1430868 [details]
additional debugging for api token authentication against a sql store

Comment 11 Gregg Tanzillo 2018-05-04 13:17:45 UTC
Closing, per customer -

We have changed the session_store parameter from "cache" to "sql" on all appliances and have run some tests. It works. The problem we had yesterday was that we had only changed this parameter on the appliance we were logged in to.

I am going to close this case and the other one related with this problem too.
Thank you for your support!