Bug 1668534

Summary: Using operator to install ASB/TSB, it failed with error ' CERTIFICATE_VERIFY_FAILED'
Product: OpenShift Container Platform Reporter: Zihan Tang <zitang>
Component: apiserver-auth Assignee: Erica von Buelow <evb>
Status: CLOSED ERRATA QA Contact: Chuan Yu <chuyu>
Severity: high Docs Contact:
Priority: high    
Version: 4.1.0 CC: aos-bugs, chezhang, chuyu, dyan, eparis, jfan, jiazha, rmeggins, shurley, sponnaga
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-04 10:42:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1605136, 1662257, 1662274, 1667363, 1669368, 1678624    

Description Zihan Tang 2019-01-23 03:05:34 UTC
Description of problem:
When using the ASB operator to install ASB, the installation fails at the task `Verify service catalog is installed`.
Logs of the ASB operator:
Set broker namespace state=present] ******************\r\n\u001b[1;30mtask path: /opt/ansible/roles/automation-broker/tasks/main.yml:3\u001b[0m\n\u001b[0;36mskipping: [localhost] => {\"changed\": false, \"skip_reason\": \"Conditional result was False\"}\u001b[0m\n\r\nTASK [automation-broker : Verify service catalog is installed] *****************\r\n\u001b[1;30mtask path: /opt/ansible/roles/automation-broker/tasks/main.yml:9\u001b[0m\n2019-01-22 07:35:48,962 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\r\n\n2019-01-22 07:35:48,970 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\r\n\n2019-01-22 07:35:48,979 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\r\n\n2019-01-22 07:35:48,979 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\n\u001b[0;31mfatal: [localhost]: FAILED! => {\"msg\": \"The conditional check '\\\"servicecatalog.k8s.io\\\" in lookup(\\\"k8s\\\", cluster_info=\\\"api_groups\\\")' failed. The error was: An unhandled exception occurred while running the lookup plugin 'k8s'. Error was a <class 'urllib3.exceptions.MaxRetryError'>, original message: HTTPSConnectionPool(host='172.30.0.1', port=443): Max retries exceeded with url: /version (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),))\"}\u001b[0m\n\r
\nPLAY RECAP *********************************************************************\r\n\u001b[0;31mlocalhost\u001b[0m                  : \u001b[0;32mok=1   \u001b[0m changed=0    unreachable=0    \u001b[0;31mfailed=1   \u001b[0m\r\n\n","job":"3870356907665767028","name":"ansible-service-broker","namespace":"openshift-ansible-service-broker","error":"exit status 2","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/travis/gopath/src/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\t/home/travis/gopath/src/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:258"}


Version-Release number of selected component (if applicable):
OLM version: 0.8.1  git commit: a36ed09
Service-catalog: v4.0.0-v0.1.38+abebed4-4-dirty;Upstream:v0.1.38
Cluster version is 4.0.0-0.alpha-2019-01-22-015156

How reproducible:
always

Steps to Reproduce:
1. install service-catalog from web-console
  a. create 'kube-service-catalog' namespace and operator group using files in 
https://github.com/fusor/catbrokers4/tree/master/files/svcat
  b. Click "Catalog"-> "Operator Hub" -> "Show community operators"-> "svcat operator"->'install' -> target to the "service-catalog" OperatorGroup.

2. install asb using files in https://github.com/fusor/catbrokers4/tree/master/files/asb
├── 00-asb-namespace.yaml
├── 01-asb-catalogsource-configmap.yaml
├── 02-asb-catalogsource.yaml
├── 03-asb-operatorgroup.yaml
├── 04-asb-subscription.yaml
├── 05-asb-cr.yaml
└── 06-asb-clusterrolebinding.yaml

a. create namespace and operator group
b. create cm and catalogsource.
c. create the subscription and wait for the asb operator to be ready
$ oc get pod -n openshift-ansible-service-broker
NAME                                          READY     STATUS    RESTARTS   AGE
automation-broker-operator-66b644bfb5-qfjtf   1/1       Running   0          2

d. create asb cr.
oc create -f 05-asb-cr.yaml
oc create -f 06-asb-clusterrolebinding.yaml

Actual results:
ASB is not installed successfully.
$ oc get pod -n openshift-ansible-service-broker
NAME                                          READY     STATUS    RESTARTS   AGE
automation-broker-operator-66b644bfb5-qfjtf   1/1       Running   0          2h

$ oc logs -f automation-broker-operator-66b644bfb5-qfjtf
{"level":"error","ts":1548142549.0631177,"logger":"logging_event_handler","msg":"","name":"ansible-service-broker","namespace":"openshift-ansible-service-broker","gvk":"automationbroker.io/v1alpha1, Kind=AutomationBroker","event_type":"runner_on_failed","job":"3870356907665767028","EventData.Task":"Verify service catalog is installed","EventData.TaskArgs":"msg=Service Catalog must be installed, that=[u'\"servicecatalog.k8s.io\" in lookup(\"k8s\", cluster_info=\"api_groups\")']","EventData.FailedTaskPath":"/opt/ansible/roles/automation-broker/tasks/main.yml:9","error":"[playbook task failed]","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/travis/gopath/src/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/events.loggingEventHandler.Handle\n\t/home/travis/gopath/src/github.com/operator-framework/operator-sdk/pkg/ansible/events/log_events.go:84"}
{"level":"error","ts":1548142549.225746,"logger":"runner","msg":"\u001b[0;34mansible-playbook 2.7.5\u001b[0m\r\n\u001b[0;34m  config file = /etc/ansible/ansible.cfg\u001b[0m\r\n\u001b[0;34m  configured module search path = [u'/usr/share/ansible/openshift']\u001b[0m\r\n\u001b[0;34m  ansible python module location = /usr/lib/python2.7/site-packages/ansible\u001b[0m\r\n\u001b[0;34m  executable location = /usr/bin/ansible-playbook\u001b[0m\r\n\u001b[0;34m  python version = 2.7.5 (default, Oct 30 2018, 23:45:53) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]\u001b[0m\r\n\u001b[0;34mUsing /etc/ansible/ansible.cfg as config file\u001b[0m\r\n\n\u001b[0;34m/tmp/ansible-operator/runner/automationbroker.io/v1alpha1/AutomationBroker/openshift-ansible-service-broker/ansible-service-broker/inventory/hosts did not meet host_list requirements, check plugin documentation if this is unexpected\u001b[0m\r\n\u001b[0;34m/tmp/ansible-operator/runner/automationbroker.io/v1alpha1/AutomationBroker/openshift-ansible-service-broker/ansible-service-broker/inventory/hosts did not meet script requirements, check plugin documentation if this is unexpected\u001b[0m\r\n\n\u001b[0;34m/tmp/ansible-operator/runner/automationbroker.io/v1alpha1/AutomationBroker/openshift-ansible-service-broker/ansible-service-broker/inventory/hosts did not meet script requirements, check plugin documentation if this is unexpected\u001b[0m\n\r\nPLAYBOOK: deploy.yml ***********************************************************\n\u001b[0;34m1 plays in /opt/ansible/deploy.yml\u001b[0m\n\r\nPLAY [automation-broker-operator] **********************************************\n\u001b[0;34mMETA: ran handlers\u001b[0m\n\r\nTASK [Validation] **************************************************************\r\n\u001b[1;30mtask path: /opt/ansible/deploy.yml:14\u001b[0m\n\u001b[0;32mok: [localhost] => {\u001b[0m\r\n\u001b[0;32m    \"changed\": false, \u001b[0m\r\n\u001b[0;32m    \"msg\": \"All assertions passed\"\u001b[0m\r\n\u001b[0;32m}\u001b[0m\n\r\nTASK [Run automation-broker role] **********************************************\r\n\u001b[1;30mtask path: /opt/ansible/deploy.yml:21\u001b[0m\n\u001b[0;34mstatically imported: /opt/ansible/roles/automation-broker/tasks/build_config.yml\u001b[0m\n\r\nTASK [automation-broker : Set broker namespace state=present] ******************\r\n\u001b[1;30mtask path: /opt/ansible/roles/automation-broker/tasks/main.yml:3\u001b[0m\n\u001b[0;36mskipping: [localhost] => {\"changed\": false, \"skip_reason\": \"Conditional result was False\"}\u001b[0m\n\r\nTASK [automation-broker : Verify service catalog is installed] *****************\r\n\u001b[1;30mtask path: /opt/ansible/roles/automation-broker/tasks/main.yml:9\u001b[0m\n2019-01-22 07:35:48,962 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\r\n\n2019-01-22 07:35:48,970 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\r\n\n2019-01-22 07:35:48,979 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\r\n\n2019-01-22 07:35:48,979 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),)': /version\n\u001b[0;31mfatal: [localhost]: FAILED! => {\"msg\": \"The conditional check '\\\"servicecatalog.k8s.io\\\" in lookup(\\\"k8s\\\", cluster_info=\\\"api_groups\\\")' failed. The error was: An unhandled exception occurred while running the lookup plugin 'k8s'. Error was a <class 'urllib3.exceptions.MaxRetryError'>, original message: HTTPSConnectionPool(host='172.30.0.1', port=443): Max retries exceeded with url: /version (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)'),))\"}\u001b[0m\n\r\nPLAY RECAP *********************************************************************\r\n\u001b[0;31mlocalhost\u001b[0m                  : \u001b[0;32mok=1   \u001b[0m changed=0    unreachable=0    \u001b[0;31mfailed=1   \u001b[0m\r\n\n","job":"3870356907665767028","name":"ansible-service-broker","namespace":"openshift-ansible-service-broker","error":"exit status 2","stacktrace":"github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/home/travis/gopath/src/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible/runner.(*runner).Run.func1\n\t/home/travis/gopath/src/github.com/operator-framework/operator-sdk/pkg/ansible/runner/runner.go:258"}
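The operator writes one JSON object per log line, with the colored ansible-playbook output embedded in the "msg" field. A minimal sketch, assuming Python 3, of how such a line could be reduced to the fields that matter for triage. The helper name `summarize_log_line` and the shortened sample record are hypothetical; only the field names ("level", "msg", "error", "EventData.Task") are taken from the log lines above.

```python
import json
import re

# Matches ANSI color escape sequences such as "\x1b[0;31m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def summarize_log_line(line):
    """Return the triage-relevant fields of one JSON log line (hypothetical helper)."""
    record = json.loads(line)
    # Strip terminal colors so the message can be searched as plain text.
    message = ANSI_ESCAPE.sub("", record.get("msg", ""))
    return {
        "level": record.get("level"),
        "task": record.get("EventData.Task"),
        "error": record.get("error"),
        "ssl_failure": "CERTIFICATE_VERIFY_FAILED" in message,
    }


# Shortened, hypothetical sample modeled on the "runner" record above.
sample = json.dumps({
    "level": "error",
    "logger": "runner",
    "msg": "\u001b[0;31mfatal: [localhost]: FAILED! [SSL: CERTIFICATE_VERIFY_FAILED]\u001b[0m",
    "error": "exit status 2",
})

print(summarize_log_line(sample))
```

The same function could be applied to each line of `oc logs` output; lines that are not valid JSON would raise `json.JSONDecodeError` and need separate handling.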

Checking the service-catalog pods shows that they are running.

logs of service-catalog api-server pod:
$ oc logs -f apiserver-69c7c75b59-s5qg4 -c apiserver -n kube-service-catalog
...
I0122 08:22:21.016748       1 wrap.go:42] GET /healthz: (630.04µs) 200 [kube-probe/1.11+ 10.131.0.1:43930]
E0122 08:22:22.459006       1 authentication.go:62] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
I0122 08:22:22.459133       1 wrap.go:42] GET /: (207.853µs) 401 [Go-http-client/2.0 10.128.0.1:54860]
E0122 08:22:22.463939       1 authentication.go:62] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
I0122 08:22:22.464004       1 wrap.go:42] GET /: (124.01µs) 401 [Go-http-client/2.0 10.128.0.1:54860]
E0122 08:22:22.464085       1 authentication.go:62] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
I0122 08:22:22.464175       1 wrap.go:42] GET /: (134.01µs) 401 [Go-http-client/2.0 10.128.0.1:54860]
E0122 08:22:22.486403       1 authentication.go:62] Unable to authenticate the request due to an error: x509: certificate has expired or is not yet valid
...
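The x509 message above covers two distinct cases: a certificate whose validity window has already ended, and one whose window has not started yet. A minimal sketch, assuming Python 3, of how the two could be told apart once the notBefore and notAfter values of the certificate are known. The helper name `validity_problem` is hypothetical, and the dates in the usage note are illustrative, not taken from the affected cluster.

```python
import ssl


def validity_problem(not_before, not_after, now):
    """Classify `now` (seconds since the epoch) against a validity window.

    `not_before` and `not_after` use the textual form found in decoded
    certificates, for example "Jan 22 00:00:00 2019 GMT".
    Returns "not yet valid", "expired", or None when `now` is inside the window.
    """
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        return "not yet valid"
    if now > end:
        return "expired"
    return None
```

For example, with a hypothetical window of "Jan 22 00:00:00 2019 GMT" to "Jan 23 00:00:00 2019 GMT", a request evaluated at "Jan 22 08:22:22 2019 GMT" falls inside the window and the function returns None.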

Expected results:
ASB installs successfully.

Additional info:

Comment 1 Zhang Cheng 2019-01-25 08:12:11 UTC
Adding TestBlocker keyword since it is blocking automation broker testing

Comment 2 Zihan Tang 2019-01-28 03:16:11 UTC
Installing templateservicebroker also hits this issue.

$ oc get templateservicebroker -o yaml -n openshift-template-service-broker 
apiVersion: v1
items:
- apiVersion: osb.openshift.io/v1alpha1
  kind: TemplateServiceBroker
  metadata:
    creationTimestamp: 2019-01-28T03:02:54Z
    finalizers:
    - finalizer.osb.openshift.io
    generation: 1
    name: template-service-broker
    namespace: openshift-template-service-broker
    resourceVersion: "38123"
    selfLink: /apis/osb.openshift.io/v1alpha1/namespaces/openshift-template-service-broker/templateservicebrokers/template-service-broker
    uid: 34c248fd-22a9-11e9-9f04-060ee74b6582
  spec: {}
  status:
    conditions:
    - ansibleResult:
        changed: 1
        completion: 2019-01-28T03:03:02.349635
        failures: 1
        ok: 4
        skipped: 0
      lastTransitionTime: 2019-01-28T03:03:02Z
      message: 'An unhandled exception occurred while running the lookup plugin ''k8s''.
        Error was a <class ''urllib3.exceptions.MaxRetryError''>, original message:
        HTTPSConnectionPool(host=''172.30.0.1'', port=443): Max retries exceeded with
        url: /version (Caused by SSLError(SSLError(1, u''[SSL: CERTIFICATE_VERIFY_FAILED]
        certificate verify failed (_ssl.c:618)''),))'
      reason: Failed
      status: "True"
      type: Failure
    - lastTransitionTime: 2019-01-28T03:10:50Z
      message: Running reconciliation
      reason: Running
      status: "False"
      type: Running
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Comment 3 Shawn Hurley 2019-01-30 17:24:10 UTC
This is a problem with ca.crt that is delivered to the operator. This is being tracked here: https://jira.coreos.com/browse/AUTH-235
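A minimal sketch, assuming Python 3, of how one might check from inside a pod whether a given CA bundle can verify the API server endpoint seen in the logs (172.30.0.1:443, path /version). The bundle path is the standard Kubernetes service account mount and is an assumption here; this report does not state which file the operator actually reads. `probe` is a hypothetical helper name.

```python
import ssl
import urllib.error
import urllib.request

# Assumption: the standard in-pod service account CA bundle location.
ASSUMED_CA_BUNDLE = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def probe(url, cafile=ASSUMED_CA_BUNDLE, timeout=5):
    """Try an HTTPS request that trusts only `cafile`; return a short status string."""
    try:
        # Build a verifying context whose trust roots come only from `cafile`.
        context = ssl.create_default_context(cafile=cafile)
    except OSError as exc:  # missing or unreadable bundle (ssl.SSLError is an OSError)
        return "cannot load CA bundle: %s" % exc
    try:
        with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
            return "TLS verified, HTTP %d" % response.status
    except urllib.error.HTTPError as exc:
        # An HTTP error status still means the TLS handshake succeeded.
        return "TLS verified, HTTP %d" % exc.code
    except OSError as exc:  # includes URLError wrapping certificate failures
        return "request failed: %s" % exc
```

Calling `probe("https://172.30.0.1:443/version")` from the operator pod would be expected to return a "request failed" string mentioning certificate verification if the bundle cannot verify the server certificate.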

Comment 8 Zihan Tang 2019-02-18 07:43:51 UTC
In 4.0.0-0.nightly-2019-02-17-024922, I didn't hit this. Removing TestBlocker.

Comment 9 Zihan Tang 2019-02-19 09:11:58 UTC
When installing ASB, I haven't hit it in recent builds,
but the TSB install still fails with this error:
message: 'An unhandled exception occurred while running the lookup plugin ''template''.
        Error was a <class ''ansible.errors.AnsibleError''>, original message: An
        unhandled exception occurred while running the lookup plugin ''k8s''. Error
        was a <class ''urllib3.exceptions.MaxRetryError''>, original message: HTTPSConnectionPool(host=''172.30.0.1'',
        port=443): Max retries exceeded with url: /version (Caused by SSLError(SSLError(1,
        u''[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:618)''),))'

Comment 10 Shawn Hurley 2019-02-19 14:52:21 UTC
This is still an auth team bug. Please re-assign as discussed.

Comment 11 Zihan Tang 2019-02-20 03:35:44 UTC
As discussed, reassigning to AUTH. Related Jira task: https://jira.coreos.com/browse/AUTH-235

Comment 12 Zihan Tang 2019-02-20 05:30:29 UTC
The TSB install still hits this issue; moving to ASSIGNED.

Comment 13 Erica von Buelow 2019-02-20 17:08:49 UTC
Which version of the installer are you using? It's likely you're missing the patch.

Comment 15 Zihan Tang 2019-02-21 06:38:07 UTC
(In reply to Erica von Buelow from comment #13)
> Which version of the installer are you using? It's likely you're missing the
> patch.

In my last environment, the installer version is v4.0.0-0.174.0.0-dirty and the OCP version is 4.0.0-0.nightly-2019-02-17-024922.

Comment 16 Zihan Tang 2019-02-21 09:23:51 UTC
With installer: v4.0.0-0.177.0.1-dirty
Cluster version is 4.0.0-0.nightly-2019-02-20-194410
this issue is fixed.
Thanks

Comment 18 Standa Laznicka 2019-03-12 17:11:14 UTC
*** Bug 1670282 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2019-06-04 10:42:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

Comment 21 Red Hat Bugzilla 2023-09-14 04:45:28 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days