Bug 1504935 - Cannot add Openshift provider 3.6 to CloudForms - Hawkular will not validate
Summary: Cannot add Openshift provider 3.6 to CloudForms - Hawkular will not validate
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.8.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: GA
Target Release: 5.8.3
Assignee: Beni Paskin-Cherniavsky
QA Contact: Dave Johnson
URL:
Whiteboard: container
Depends On:
Blocks: 1503797
Reported: 2017-10-20 19:16 UTC by Saif Ali
Modified: 2020-12-14 10:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-26 16:07:16 UTC
Category: ---
Cloudforms Team: Container Management
Target Upstream Version:
Embargoed:



Description Saif Ali 2017-10-20 19:16:41 UTC
Description of problem:
Cannot add Openshift provider 3.6 to CloudForms - Hawkular will not validate
~~~
OCP URL validates correctly but Hawkular URL will not validate.  Error message is "Credential validation was not successful: 743: unexpected token at ' "
Openshift version is 3.6.
~~~

Version-Release number of selected component (if applicable):
CloudForms 4.2 and 4.5


Comment 4 Beni Paskin-Cherniavsky 2017-10-23 13:11:38 UTC
Let's see.  These 2 messages repeat >130 times:

MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager#authentication_check_no_validation) type: [:bearer] for [1000000000001] [NOCP_3.4] Validation failed: error, HTTP status code 401, 401 Unauthorized

MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager#authentication_check_no_validation) type: [:hawkular] for [1000000000001] [NOCP_3.4] Validation failed: error, 743: unexpected token at '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>The URL you re...

(Some other failures appear just a few times; I assume they were transient or have been fixed.)

For the main endpoint, the 401 suggests the token is wrong.  Does the management-admin service account exist?
You said "OCP URL validates correctly", though, so I assume that has been resolved.

=> For hawkular: Unfortunately, we're not logging enough details, but the HTML page in the error suggests the response is not coming from hawkular at all.
- Are the hawkular pods alive?
- Is hawkular exposed correctly through a route?
- Did you use the same hostname as in the route?
- Did you use the port that the openshift router listens on (normally 443)?
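
The "743: unexpected token at '<!DOCTYPE html" error is what a JSON parser reports when it is handed an HTML page. A minimal Python sketch of that distinction follows; the function names `classify_body` and `probe` are hypothetical, and the `/hawkular/metrics/status` path mentioned in the comments is an assumption about the Hawkular endpoint, not something taken from this report.

```python
import json
import urllib.error
import urllib.request


def classify_body(body: str) -> str:
    """Return 'json', 'html', or 'other' for a response body."""
    text = body.lstrip()
    try:
        json.loads(text)
        return "json"
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        pass
    head = text[:200].lower()
    if head.startswith("<!doctype html") or head.startswith("<html"):
        return "html"
    return "other"


def probe(url: str, timeout: float = 10.0) -> str:
    """Fetch url and classify the body.

    Hypothetical helper: the URL is supplied by the caller, for example
    https://<route-host>/hawkular/metrics/status (assumed path).
    Certificate or connection failures raise urllib.error.URLError.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a body worth inspecting.
        body = err.read().decode("utf-8", errors="replace")
    return classify_body(body)
```

If `probe` returns "html" for the hostname and port entered on the hawkular tab, something other than hawkular is answering, which matches the symptom in the log.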

For further advice, I would need:
- output of `oc get route -n openshift-infra hawkular-metrics -o yaml`
- output of `oc status -n openshift-infra -v`
- screenshot of add provider screen, at least the hawkular tab

Comment 12 Beni Paskin-Cherniavsky 2017-10-24 06:45:37 UTC
    rc/hawkular-metrics created 4 days ago - 0/1 pods

    rc/heapster created 4 days ago - 0/1 pods

  * pod/hawkular-metrics-vq03p has restarted within the last 10 minutes
  * pod/heapster-ggmbh has restarted within the last 10 minutes

We should look at `oc get events -n openshift-infra` and `oc logs` for these pods to see why they are restarting or not running.
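
As a rough illustration of reading the status lines quoted above, here is a small Python sketch that picks out replication controllers reporting fewer pods than desired. It assumes the "rc/<name> ... - N/M pods" layout seen in this comment and reads N/M as current/desired; the function name `under_replicated` is hypothetical.

```python
import re

# Matches lines such as: "rc/hawkular-metrics created 4 days ago - 0/1 pods"
RC_LINE = re.compile(
    r"^\s*rc/(?P<name>\S+)\s.*?-\s(?P<current>\d+)/(?P<desired>\d+)\s+pods\b"
)


def under_replicated(status_output: str):
    """Return (name, current, desired) for each rc with current < desired."""
    found = []
    for line in status_output.splitlines():
        match = RC_LINE.match(line)
        if match is None:
            continue
        current = int(match.group("current"))
        desired = int(match.group("desired"))
        if current < desired:
            found.append((match.group("name"), current, desired))
    return found
```

Fed the two rc lines above, this would flag both hawkular-metrics and heapster as having no running pods.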

== Creating missing service accounts: ==

On 3.6, the part of the playbook that normally creates them is: https://github.com/openshift/openshift-ansible/blob/release-3.6/roles/openshift_manageiq/tasks/main.yaml

Is this a new openshift cluster?
There was a recent bug in openshift-ansible where the account is not created by default (https://github.com/openshift/openshift-ansible/pull/5809).

- The "official" fix is to rerun openshift-ansible with openshift_use_manageiq=true in the [vars] section of the inventory.
  I don't know how safe that is if the cluster is already in production use; you should consult someone from Openshift.

- https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.0/html/managing_providers/containers_providers#configuring_service_accounts
  is out of date on multiple points.

Comment 15 Beni Paskin-Cherniavsky 2017-10-26 16:07:16 UTC
NOTABUG: CFME->hawkular access was blocked by a firewall. Once that was understood, the customer was able to fix it and add the provider.
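
A blocked path like this can be spotted early with a plain TCP connection test run from the CFME appliance. The Python sketch below is a minimal version using only the standard library; `tcp_reachable` is a hypothetical helper name, and the default port 443 follows the router port mentioned earlier in this bug.

```python
import socket


def tcp_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A False result for the hawkular route hostname points at the network path (firewall, DNS, routing) rather than at credentials. A True result only shows that something accepts connections on that port, not that it is hawkular.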

Opened RFE bug 1506718 to make such situations easier to diagnose.

