Bug 1779043 - The jaeger operator couldn't be deployed because the CatalogSource pods fail to respond
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.1.z
Assignee: Bowen Song
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On: 1781261
Blocks:
Reported: 2019-12-03 06:45 UTC by Anping Li
Modified: 2020-02-18 13:46 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1781261 (view as bug list)
Environment:
Last Closed: 2020-02-18 13:46:51 UTC
Target Upstream Version:
Embargoed:


Attachments
csc pod manifest and catalog operators logs (276.12 KB, application/gzip)
2019-12-03 06:46 UTC, Anping Li

Description Anping Li 2019-12-03 06:45:01 UTC
Description of problem:
The stable jaeger operator couldn't be deployed in 4.1. The CatalogSource pod logs the following error:


time="2019-12-03T05:40:51Z" level=info msg="loading bundle file" dir=downloaded file=jaeger-v1.crd.yaml load=bundle
time="2019-12-03T05:40:51Z" level=info msg="loading bundle file" dir=downloaded file=jaeger.v1.13.1.clusterserviceversion.yaml load=bundle
time="2019-12-03T05:40:51Z" level=info msg="could not decode contents of file downloaded/jaeger-product-aiiua3zr/jaeger.package.yaml into csv: error unmarshaling JSON: Object 'Kind' is missing in '{\"channels\":[{\"currentCSV\":\"jaeger-operator.v1.13.1\",\"name\":\"stable\"}],\"packageName\":\"jaeger-product\"}'" dir=downloaded file=jaeger.package.yaml load=bundles
time="2019-12-03T05:40:51Z" level=info msg="Loading packages and entries" dir=downloaded


Version-Release number of selected component (if applicable):

OLM version:
io.openshift.build.commit.url=https://github.com/operator-framework/operator-lifecycle-manager/commit/1bd1fe142d7ae74fd53829132b66e0455716e035

Cluster version:  
  4.1.0-0.nightly-2019-11-28-220857

Jaeger:
  "currentCSV":"jaeger-operator.v1.13.1"


How reproducible:
This time

Steps to Reproduce:
1. Deploy elasticsearch operators
2. Deploy cluster-logging operators
3. Deploy stable jaeger 

Actual results:
1. The CatalogSource pods are created and running
[ocp41_73053]$ oc get pods
NAME                                                              READY   STATUS    RESTARTS   AGE
certified-operators-5c755d7664-wz79w                              1/1     Running   0          138m
cluster-logging-operator-75597bc855-6x2qb                         1/1     Running   0          17m
community-operators-556477f67f-6wtbb                              1/1     Running   0          138m
elasticsearch-74f7959575-4dcrz                                    1/1     Running   0          17m
installed-custom-openshift-ansible-service-broker-59664cdft6dtr   1/1     Running   0          17m
installed-custom-openshift-template-service-broker-59df8c8pttsp   1/1     Running   0          17m
installed-redhat-openshift-operators-565cdc5448-cvb9n             1/1     Running   0          41m
marketplace-operator-5b8cf489b6-zhlf4                             1/1     Running   0          138m

2. The CatalogSource pod reports: could not decode contents of file downloaded/jaeger-product-aiiua3zr/jaeger.package.yaml into csv: error unmarshaling JSON: Object 'Kind' is missing in '{\"channels\":[{\"currentCSV\":\"jaeger-operator.v1.13.1\",\"name\":\"stable\"}],\"packageName\":\"jaeger-product\"}'" dir=downloaded file=jaeger.package.yaml load=bundles

3. oc describe sub jaeger-product -n openshift-operators
Name:         jaeger-product
Namespace:    openshift-operators
Labels:       csc-owner-name=installed-redhat-openshift-operators
              csc-owner-namespace=openshift-marketplace
Annotations:  <none>
API Version:  operators.coreos.com/v1alpha1
Kind:         Subscription
Metadata:
  Creation Timestamp:  2019-12-03T05:24:50Z
  Generation:          1
  Resource Version:    41994
  Self Link:           /apis/operators.coreos.com/v1alpha1/namespaces/openshift-operators/subscriptions/jaeger-product
  UID:                 3ab3fff8-158d-11ea-89b4-0a7160cfddb4
Spec:
  Channel:                stable
  Install Plan Approval:  Automatic
  Name:                   jaeger-product
  Source:                 installed-redhat-openshift-operators
  Source Namespace:       openshift-operators
  Starting CSV:           jaeger-operator.v1.13.1
Events:                   <none>

4. oc describe csc installed-redhat-openshift-operators
Name:         installed-redhat-openshift-operators
Namespace:    openshift-marketplace
Labels:       <none>
Annotations:  <none>
API Version:  operators.coreos.com/v1
Kind:         CatalogSourceConfig
Metadata:
  Creation Timestamp:  2019-12-03T05:24:50Z
  Finalizers:
    finalizer.catalogsourceconfigs.operators.coreos.com
  Generation:        3
  Resource Version:  41983
  Self Link:         /apis/operators.coreos.com/v1/namespaces/openshift-marketplace/catalogsourceconfigs/installed-redhat-openshift-operators
  UID:               3a815016-158d-11ea-89b4-0a7160cfddb4
Spec:
  Cs Display Name:   Red Hat Operators
  Cs Publisher:      Red Hat
  Packages:          jaeger-product
  Target Namespace:  openshift-operators
Status:
  Current Phase:
    Last Transition Time:  2019-12-03T05:24:50Z
    Last Update Time:      2019-12-03T05:24:50Z
    Phase:
      Message:  The object has been successfully reconciled
      Name:     Succeeded
  Package Repositiory Versions:
    Jaeger - Product:  3.0.0
Events:                <none>


Expected results:
The jaeger operator can be deployed in 4.1

Additional info:

Comment 1 Anping Li 2019-12-03 06:46:11 UTC
Created attachment 1641560
csc pod manifest and catalog operators logs

Comment 2 Anping Li 2019-12-03 07:17:23 UTC
Similar errors appear in the logs of the csc pods cluster-logging-operator-75597bc855-6x2qb, installed-custom-openshift-ansible-service-broker-59664cdft6dtr, community-operators-556477f67f-6wtbb and redhat-operators-78d75587b5-fsjhm. The openshift-ansible-service-broker operator couldn't be deployed either.

[anli@preserve-docker-slave ocp41_73053]$ oc logs cluster-logging-operator-75597bc855-6x2qb
time="2019-12-03T06:04:55Z" level=info msg="Using in-cluster kube client config" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="operator source(s) specified are - [https://quay.io/cnr|redhat-operators]" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="package(s) specified are - cluster-logging" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="input has been sanitized" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="sources: [https://quay.io/cnr/redhat-operators]" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="packages: [cluster-logging]" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="resolved the following packages: [redhat-operators/cluster-logging:18.0.0]" port=50051 type=appregistry
time="2019-12-03T06:04:55Z" level=info msg="downloading repository: redhat-operators/cluster-logging:18.0.0 from https://quay.io/cnr" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="download complete - 1 repositories have been downloaded" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="decoding the downloaded operator manifest(s)" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="manifest format is - nested" port=50051 repository="redhat-operators/cluster-logging:18.0.0" type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt - type=directory" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/4.1 - type=directory" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/4.1/cluster-logging.v4.1.0.clusterserviceversion.yaml - type=file" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/4.1/cluster-loggings.crd.yaml - type=file" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/4.2 - type=directory" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/4.2/cluster-logging.v4.2.0.clusterserviceversion.yaml - type=file" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/4.2/cluster-loggings.crd.yaml - type=file" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="downloaded/cluster-logging-w7o263kt/cluster-logging.package.yaml - type=file" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="decoded successfully" port=50051 repository="redhat-operators/cluster-logging:18.0.0" type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="decoded 0 flattened and 1 nested operator manifest(s)" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="loading nested operator bundle(s) from downloaded into sqlite" port=50051 type=appregistry
time="2019-12-03T06:04:56Z" level=info msg="Loading bundles" dir=downloaded
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=downloaded load=bundles
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=cluster-logging-w7o263kt load=bundles
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=4.1 load=bundles
time="2019-12-03T06:04:56Z" level=info msg="found csv, loading bundle" dir=downloaded file=cluster-logging.v4.1.0.clusterserviceversion.yaml load=bundles
time="2019-12-03T06:04:56Z" level=info msg="loading bundle file" dir=downloaded file=cluster-logging.v4.1.0.clusterserviceversion.yaml load=bundle
time="2019-12-03T06:04:56Z" level=info msg="loading bundle file" dir=downloaded file=cluster-loggings.crd.yaml load=bundle
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=4.2 load=bundles
time="2019-12-03T06:04:56Z" level=info msg="found csv, loading bundle" dir=downloaded file=cluster-logging.v4.2.0.clusterserviceversion.yaml load=bundles
time="2019-12-03T06:04:56Z" level=info msg="loading bundle file" dir=downloaded file=cluster-logging.v4.2.0.clusterserviceversion.yaml load=bundle
time="2019-12-03T06:04:56Z" level=info msg="loading bundle file" dir=downloaded file=cluster-loggings.crd.yaml load=bundle
time="2019-12-03T06:04:56Z" level=info msg="could not decode contents of file downloaded/cluster-logging-w7o263kt/cluster-logging.package.yaml into csv: error unmarshaling JSON: Object 'Kind' is missing in '{\"channels\":[{\"currentCSV\":\"clusterlogging.4.1.25-201911190028\",\"name\":\"preview\"},{\"currentCSV\":\"clusterlogging.4.2.8-201911190952\",\"name\":\"4.2\"}],\"defaultChannel\":\"4.2\",\"packageName\":\"cluster-logging\"}'" dir=downloaded file=cluster-logging.package.yaml load=bundles
time="2019-12-03T06:04:56Z" level=info msg="Loading packages and entries" dir=downloaded
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=downloaded load=package
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=cluster-logging-w7o263kt load=package
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=4.1 load=package
time="2019-12-03T06:04:56Z" level=info msg=directory dir=downloaded file=4.2 load=package
time="2019-12-03T06:04:56Z" level=info msg="Extracting provided API information" dir=downloaded
time="2019-12-03T06:04:56Z" level=info msg="serving registry" port=50051 type=appregistry
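The decode failure in logs like the one above is emitted at level=info, so filtering by level alone would miss it. A minimal sketch of scanning for such lines follows, assuming the pod writes logrus-style key=value text as shown in this report. The helper names and the list of suspicious phrases are hypothetical and not part of OLM or the registry.

```python
import shlex

# Line copied from the cluster-logging-operator pod log in this report.
SAMPLE = r'''time="2019-12-03T06:04:56Z" level=info msg="could not decode contents of file downloaded/cluster-logging-w7o263kt/cluster-logging.package.yaml into csv: error unmarshaling JSON: Object 'Kind' is missing in '{\"channels\":[{\"currentCSV\":\"clusterlogging.4.1.25-201911190028\",\"name\":\"preview\"},{\"currentCSV\":\"clusterlogging.4.2.8-201911190952\",\"name\":\"4.2\"}],\"defaultChannel\":\"4.2\",\"packageName\":\"cluster-logging\"}'" dir=downloaded file=cluster-logging.package.yaml load=bundles'''


def parse_logfmt(line: str) -> dict:
    """Split one key=value log line into a dictionary of fields."""
    fields = {}
    # shlex handles the double-quoted values and the \" escapes inside them.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# Hypothetical heuristic: phrases that hint at a failure even at level=info.
SUSPICIOUS = ("could not", "error", "failed")


def looks_like_hidden_error(fields: dict) -> bool:
    """True when an info-level message reads like an error."""
    msg = fields.get("msg", "").lower()
    return fields.get("level") == "info" and any(p in msg for p in SUSPICIOUS)


entry = parse_logfmt(SAMPLE)
print(entry["file"], looks_like_hidden_error(entry))
```

Feeding each line of `oc logs <pod>` through `parse_logfmt` and keeping the entries for which `looks_like_hidden_error` is true would surface the decode message without relying on the log level.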

Comment 3 Jian Zhang 2019-12-03 07:38:56 UTC
The cluster network works well, but the pod installed-redhat-openshift-operators-55f75f7dc8-7vctk does not respond.

mac:~ jianzhang$ oc get pods -n openshift-marketplace -o wide
NAME                                                       READY   STATUS    RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
certified-operators-5c755d7664-wz79w                       1/1     Running   0          3h9m    10.129.2.6    ip-10-0-143-220.us-east-2.compute.internal   <none>           <none>
cluster-logging-operator-5cc95f8dd9-m6qcn                  1/1     Running   0          6m25s   10.128.2.27   ip-10-0-175-24.us-east-2.compute.internal    <none>           <none>
community-operators-556477f67f-6wtbb                       1/1     Running   0          3h9m    10.129.2.5    ip-10-0-143-220.us-east-2.compute.internal   <none>           <none>
elasticsearch-775cdd5dcb-ms4nq                             1/1     Running   0          6m25s   10.128.2.28   ip-10-0-175-24.us-east-2.compute.internal    <none>           <none>
installed-community-openshift-operators-64bd5849c6-8chkd   1/1     Running   0          72s     10.131.0.31   ip-10-0-149-237.us-east-2.compute.internal   <none>           <none>
installed-redhat-openshift-operators-55f75f7dc8-7vctk      1/1     Running   0          6m24s   10.129.2.23   ip-10-0-143-220.us-east-2.compute.internal   <none>           <none>
marketplace-operator-5b8cf489b6-zhlf4                      1/1     Running   0          3h9m    10.129.0.26   ip-10-0-162-42.us-east-2.compute.internal    <none>           <none>
redhat-operators-6cffbfd58d-6mtl2                          1/1     Running   0          6m24s   10.131.0.28   ip-10-0-149-237.us-east-2.compute.internal   <none>           <none>

As we can see, on the same node, the pod certified-operators-5c755d7664-wz79w (10.129.2.6) responds, but installed-redhat-openshift-operators-55f75f7dc8-7vctk (10.129.0.23) fails to respond.
mac:~ jianzhang$ oc rsh -n openshift-sdn sdn-zqvfk
sh-4.2# ping 10.129.0.23
PING 10.129.0.23 (10.129.0.23) 56(84) bytes of data.
From 10.130.0.1 icmp_seq=1 Destination Host Unreachable
From 10.130.0.1 icmp_seq=2 Destination Host Unreachable
From 10.130.0.1 icmp_seq=3 Destination Host Unreachable
From 10.130.0.1 icmp_seq=4 Destination Host Unreachable
From 10.130.0.1 icmp_seq=5 Destination Host Unreachable
From 10.130.0.1 icmp_seq=6 Destination Host Unreachable
From 10.130.0.1 icmp_seq=7 Destination Host Unreachable
From 10.130.0.1 icmp_seq=8 Destination Host Unreachable
From 10.130.0.1 icmp_seq=9 Destination Host Unreachable
^c

sh-4.2# ping 10.129.2.6 
PING 10.129.2.6 (10.129.2.6) 56(84) bytes of data.
64 bytes from 10.129.2.6: icmp_seq=1 ttl=64 time=1.48 ms
64 bytes from 10.129.2.6: icmp_seq=2 ttl=64 time=0.748 ms
64 bytes from 10.129.2.6: icmp_seq=3 ttl=64 time=0.753 ms
64 bytes from 10.129.2.6: icmp_seq=4 ttl=64 time=0.744 ms
...
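
The address passed to ping can be cross-checked against the `oc get pods -o wide` listing in this comment, which shows 10.129.2.23 for installed-redhat-openshift-operators-55f75f7dc8-7vctk while the ping above targets 10.129.0.23. A minimal sketch of that check follows. It works on rows copied from the listing, and the helper names are hypothetical.

```python
# Two rows copied from the `oc get pods -o wide` output in this comment
# (trailing NOMINATED NODE / READINESS GATES columns dropped).
LISTING = """\
certified-operators-5c755d7664-wz79w                       1/1     Running   0          3h9m    10.129.2.6    ip-10-0-143-220.us-east-2.compute.internal
installed-redhat-openshift-operators-55f75f7dc8-7vctk      1/1     Running   0          6m24s   10.129.2.23   ip-10-0-143-220.us-east-2.compute.internal
"""


def pod_ips(listing: str) -> dict:
    """Map pod name to the IP column of a wide pod listing."""
    ips = {}
    for line in listing.splitlines():
        cols = line.split()
        # Columns: NAME READY STATUS RESTARTS AGE IP NODE ...
        if len(cols) >= 7 and cols[0] != "NAME":
            ips[cols[0]] = cols[5]
    return ips


def check_target(listing: str, pod: str, pinged: str):
    """Return (addresses match, address shown in the listing)."""
    listed = pod_ips(listing).get(pod)
    return listed == pinged, listed


print(check_target(
    LISTING,
    "installed-redhat-openshift-operators-55f75f7dc8-7vctk",
    "10.129.0.23",
))
```

If the two addresses differ, repeating the ping against the listed address would show whether the pod itself is unreachable.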


However, that pod's logs contain no errors or warnings, only these info messages:
time="2019-12-03T05:40:51Z" level=info msg="could not decode contents of file downloaded/jaeger-product-aiiua3zr/jaeger.package.yaml into csv: error unmarshaling JSON: Object 'Kind' is missing in '{\"channels\":[{\"currentCSV\":\"jaeger-operator.v1.13.1\",\"name\":\"stable\"}],\"packageName\":\"jaeger-product\"}'" dir=downloaded file=jaeger.package.yaml load=bundles
time="2019-12-03T05:40:51Z" level=info msg="Loading packages and entries" dir=downloaded

So I think there are two problems to fix here:
1. The info message above: "Object 'Kind' is missing". As far as I can tell, the package file has no `Kind` field at all, so this message is very confusing.
2. If the pod fails to respond, it should print an error message.
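
One possible reading of the first message, not confirmed in this report: the loader tries to decode every downloaded file as a Kubernetes object (a ClusterServiceVersion, per "into csv"), and the package file carries only `packageName` and `channels`, so the decode step finds no `kind`. The sketch below uses the JSON payload from the log to illustrate the distinction. `classify_manifest` is a hypothetical helper, not a function of OLM or the registry.

```python
import json

# JSON payload copied from the log message in this report,
# with the log's backslash escaping removed.
PACKAGE_JSON = (
    '{"channels":[{"currentCSV":"jaeger-operator.v1.13.1","name":"stable"}],'
    '"packageName":"jaeger-product"}'
)


def classify_manifest(raw: str) -> str:
    """Tell a Kubernetes-style object apart from a package file."""
    doc = json.loads(raw)
    if "kind" in doc:
        # CSVs and CRDs carry a kind, so a typed decoder accepts them.
        return "kubernetes object: " + doc["kind"]
    if "packageName" in doc and "channels" in doc:
        # A package file only names the package and its channels.
        return "package manifest: " + doc["packageName"]
    return "unknown"


print(classify_manifest(PACKAGE_JSON))
```

Under that reading the message is expected for every `*.package.yaml` file, which would explain why the log continues with "Loading packages and entries" right after it.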

Comment 5 Jian Zhang 2019-12-03 10:06:36 UTC
One more question: the database is empty. Is that expected?
mac:~ jianzhang$ oc rsh installed-redhat-openshift-operators-565cdc5448-cvb9n 
...
sh-4.2$ ls
bundles.db  downloaded
sh-4.2$ 
sh-4.2$ 
sh-4.2$ sqlite3 bundles.db 
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .table
sqlite>
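
The same emptiness check can be scripted, for example when comparing several catalog pods. This is a minimal sketch that assumes `bundles.db` has been copied out of the pod to a local path. `list_tables` is a hypothetical helper, and an in-memory database stands in for the empty file here.

```python
import sqlite3


def list_tables(db_path: str) -> list:
    """Return the table names in a SQLite file, roughly what `.table` prints."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# An empty in-memory database stands in for an unpopulated bundles.db.
print(list_tables(":memory:"))
```

Note that `sqlite3.connect` creates a new empty file when the path does not exist, so an empty result can also mean the wrong path was opened.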

Comment 10 Anping Li 2019-12-04 02:15:43 UTC
By the way, the community jaeger operator can be deployed in this cluster.

