Bug 1808422 - If OLM catalog operator cannot reach the API server, it does not seem to retry
Summary: If OLM catalog operator cannot reach the API server, it does not seem to retry
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.2.z
Assignee: Ben Luddy
QA Contact: Bruno Andrade
URL:
Whiteboard:
Depends On: 1808419
Blocks:
Reported: 2020-02-28 13:49 UTC by Ben Luddy
Modified: 2020-04-14 11:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-14 11:58:21 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Github operator-framework operator-lifecycle-manager pull 1422 0 None closed Bug 1808422: Don't block on ctx.Done() if startup fails 2020-04-07 04:39:31 UTC
Red Hat Product Errata RHBA-2020:1398 0 None None None 2020-04-14 11:58:41 UTC

Description Ben Luddy 2020-02-28 13:49:53 UTC
This bug was initially created as a copy of Bug #1808419, which was a copy of Bug #1808418, which was in turn a copy of Bug #1807128.


I have an install stuck at: level=debug msg="Still waiting for the cluster to initialize: Cluster operator operator-lifecycle-manager-catalog has not yet reported success"

oc get clusteroperators does not show operator-lifecycle-manager-catalog... and the logs show:

$ oc logs $POD -n openshift-operator-lifecycle-manager
time="2020-02-25T14:58:32Z" level=info msg="log level info"
time="2020-02-25T14:58:32Z" level=info msg="TLS keys set, using https for metrics"
W0225 14:58:32.552916       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2020-02-25T14:58:32Z" level=info msg="Using in-cluster kube client config"
time="2020-02-25T14:58:32Z" level=info msg="Using in-cluster kube client config"
W0225 14:58:32.557542       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2020-02-25T14:58:32Z" level=info msg="Using in-cluster kube client config"
time="2020-02-25T14:58:32Z" level=info msg="operator not ready: communicating with server failed: Get https://172.30.0.1:443/version?timeout=32s: dial tcp 172.30.0.1:443: connect: connection refused"
time="2020-02-25T14:58:32Z" level=info msg="ClusterOperator api not present, skipping update (Get https://172.30.0.1:443/api?timeout=32s: dial tcp 172.30.0.1:443: connect: connection refused)"



However, the API is now reachable from inside the pod:

$ oc rsh -n openshift-operator-lifecycle-manager $POD                        
sh-4.2$ curl -k https://172.30.0.1:443/api?timeout=32s:
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/api\"",
  "reason": "Forbidden",
  "details": {
  },
  "code": 403
}sh-4.2$ 



But it appears the operator is not retrying.

Comment 3 Bruno Andrade 2020-04-07 21:48:21 UTC
Installed a cluster and left it running for approximately one day; the OLM cluster operators are running as expected. Marking as VERIFIED.


OCP Cluster Version: 4.2.0-0.nightly-2020-04-07-084038

oc get clusteroperators | grep "operator-lifecycle-manager*"    
operator-lifecycle-manager                 4.2.0-0.nightly-2020-04-07-084038   True        False         False      23h30m
operator-lifecycle-manager-catalog         4.2.0-0.nightly-2020-04-07-084038   True        False         False      23h30m
operator-lifecycle-manager-packageserver   4.2.0-0.nightly-2020-04-07-084038   True        False         False      23h29m
                                 
oc get pods -n openshift-operator-lifecycle-manager                                                                    

NAME                                READY   STATUS    RESTARTS   AGE
catalog-operator-588d48887d-gtt58   1/1     Running   0          23h30m
olm-operator-55b67ccb89-vxgvg       1/1     Running   0          23h30m
packageserver-787b96f55d-dg2wm      1/1     Running   0          23h30m
packageserver-787b96f55d-xpq59      1/1     Running   0          23h29m

Comment 5 errata-xmlrpc 2020-04-14 11:58:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1398