Bug 1771638 - [OLM] cannot install the dependency operator
Summary: [OLM] cannot install the dependency operator
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.1.z
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.z
Assignee: Anik
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On: 1775327
Blocks: 1763750
Reported: 2019-11-12 17:51 UTC by Alexander Greene
Modified: 2020-02-18 13:38 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1775327
Environment:
Last Closed: 2020-02-18 13:38:15 UTC
Target Upstream Version:
Embargoed:



Description Alexander Greene 2019-11-12 17:51:04 UTC
Description of problem:

Operators that install successfully on 4.3 fail to install on 4.1.z.

Version-Release number of selected component (if applicable):
4.1.22


How reproducible:
Can reproduce on specific operators.

Steps to Reproduce:
1. Install an OpenShift 4.1.22 cluster
2. Attempt to install the service mesh operator provided by the Redhat-Operators OperatorSource.

Actual results:
The subscription is created, but no CSV is created for the operator. As a result, no operator is deployed.

Expected results:
The service mesh operator is successfully installed.

Additional info:
Originally created when investigating https://bugzilla.redhat.com/show_bug.cgi?id=1763750#c2

Comment 6 Anik 2019-11-27 16:53:03 UTC
The servicemesh operator depends on a number of other operators, and the catalog containing those operators needs to be present in the namespace that servicemesh is being installed in.

To install the servicemesh operator:
1) Click install in the UI
2) Copy over the catalogsource to the openshift-operators namespace
3) Check for running servicemesh operator pods in the openshift-operators namespace 


As such, this is working as intended in 4.1.z clusters.
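
The "copy over the catalogsource" step can be sketched roughly as below. This is only an illustration under stated assumptions, not an official OLM procedure: the helper name `retarget_catalogsource` is hypothetical, and it assumes the CatalogSource was first exported as JSON (for example with `oc get catalogsource <name> -n openshift-marketplace -o json`). It drops server-populated metadata and the status so the object can be created fresh in the target namespace.

```python
import copy

# Server-populated metadata that should not be sent when creating a new object.
# ownerReferences are dropped too, since owners cannot live in another namespace.
SERVER_FIELDS = (
    "resourceVersion",
    "uid",
    "selfLink",
    "creationTimestamp",
    "generation",
    "ownerReferences",
)


def retarget_catalogsource(manifest: dict, namespace: str) -> dict:
    """Return a copy of a CatalogSource manifest re-homed to `namespace`.

    Hypothetical helper: the input dict is left untouched, and the spec
    (including any grpc address) is carried over unchanged.
    """
    out = copy.deepcopy(manifest)
    meta = out.setdefault("metadata", {})
    for field in SERVER_FIELDS:
        meta.pop(field, None)
    meta["namespace"] = namespace
    # Status is owned by the controller and is rebuilt after creation.
    out.pop("status", None)
    return out
```

The returned dict can be serialized with `json.dumps` and piped to `oc apply -f -`. Note that the spec is copied as-is, so a grpc CatalogSource would still point at whatever address the original used.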

Comment 9 Jian Zhang 2019-12-05 09:01:02 UTC
Hi, Nick

Thanks for your explanation! 

> When using the console in 4.1, any Subscription to an operator that has any dependencies will not be installed, this is expected behavior...

So, that means we don't support the ServiceMesh operator in 4.1, right?

I have a fresh 4.1.z cluster, and the ServiceMesh operator doesn't work on it.
Cluster version is 4.1.0-0.nightly-2019-12-04-071458

mac:~ jianzhang$ oc get sub -n openshift-operators
NAME                  PACKAGE               SOURCE                                 CHANNEL
servicemeshoperator   servicemeshoperator   installed-redhat-openshift-operators   1.0
mac:~ jianzhang$ oc get csv -n openshift-operators
No resources found in openshift-operators namespace.
mac:~ jianzhang$ oc get csc -n openshift-marketplace
NAME                                   STATUS      MESSAGE                                       AGE
certified-operators                    Succeeded   The object has been successfully reconciled   56m
community-operators                    Succeeded   The object has been successfully reconciled   56m
installed-redhat-openshift-operators   Succeeded   The object has been successfully reconciled   15m
redhat-operators                       Succeeded   The object has been successfully reconciled   56m

mac:~ jianzhang$ oc get catalogsource -n openshift-operators
NAME                                   NAME                TYPE   PUBLISHER   AGE
installed-redhat-openshift-operators   Red Hat Operators   grpc   Red Hat     15m


Catalog operator logs:
mac:~ jianzhang$ oc logs catalog-operator-5b79fb7c46-2gr9r 
...
time="2019-12-05T08:29:52Z" level=info msg="retrying openshift-operators"
E1205 08:29:52.952693       1 queueinformer_operator.go:186] Sync "openshift-operators" failed: {servicemeshoperator 1.0 servicemeshoperator.v1.0.2 {installed-redhat-openshift-operators openshift-operators}} not found: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 172.30.175.186:50051: connect: no route to host"

@Nick, in the above logs, why does the sync refer to `openshift-operators`? I think it should be `openshift-marketplace`, right?

mac:~ jianzhang$ oc get svc -n openshift-marketplace
NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
certified-operators                    ClusterIP   172.30.230.118   <none>        50051/TCP   48m
community-operators                    ClusterIP   172.30.189.98    <none>        50051/TCP   48m
installed-redhat-openshift-operators   ClusterIP   172.30.175.186   <none>        50051/TCP   7m24s
redhat-operators                       ClusterIP   172.30.130.232   <none>        50051/TCP   48m
mac:~ jianzhang$ oc get pods -n openshift-marketplace -o wide
NAME                                                   READY   STATUS    RESTARTS   AGE     IP            NODE                           NOMINATED NODE   READINESS GATES
certified-operators-597f9c4c74-bvgdl                   1/1     Running   0          44m     10.128.2.6    ip-10-0-142-202.ec2.internal   <none>           <none>
community-operators-67849c6d56-47r5x                   1/1     Running   0          44m     10.128.2.7    ip-10-0-142-202.ec2.internal   <none>           <none>
installed-redhat-openshift-operators-768b56448-9bm8r   1/1     Running   0          3m25s   10.129.2.8    ip-10-0-136-31.ec2.internal    <none>           <none>
marketplace-operator-849b56d765-hrjg8                  1/1     Running   0          44m     10.128.0.15   ip-10-0-147-113.ec2.internal   <none>           <none>
redhat-operators-785b67d7c4-gc7vg                      1/1     Running   0          44m     10.128.2.3    ip-10-0-142-202.ec2.internal   <none>           <none>

mac:~ jianzhang$ oc -n openshift-sdn ovs-87v55
Error: unknown command "ovs-87v55" for "oc"
Run 'oc --help' for usage.
mac:~ jianzhang$ oc -n openshift-sdn rsh ovs-87v55
sh-4.2# ping 10.129.2.8
PING 10.129.2.8 (10.129.2.8) 56(84) bytes of data.
64 bytes from 10.129.2.8: icmp_seq=1 ttl=64 time=0.905 ms
64 bytes from 10.129.2.8: icmp_seq=2 ttl=64 time=0.208 ms
64 bytes from 10.129.2.8: icmp_seq=3 ttl=64 time=0.214 ms
64 bytes from 10.129.2.8: icmp_seq=4 ttl=64 time=0.222 ms
64 bytes from 10.129.2.8: icmp_seq=5 ttl=64 time=0.211 ms

mac:~ jianzhang$ oc rsh catalog-operator-5b79fb7c46-2gr9r 
sh-4.2$ nslookup 172.30.175.186
186.175.30.172.in-addr.arpa	name = installed-redhat-openshift-operators.openshift-marketplace.svc.cluster.local.

The `installed-redhat-openshift-operators-768b56448-9bm8r` pod is running and DNS resolves correctly, so why do we get `Error while dialing dial tcp 172.30.175.186:50051: connect: no route to host`?
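
One thing the checks above do not cover: the ping exercises ICMP to the pod IP, while the failing dial is a TCP connection to the service ClusterIP on port 50051. A minimal sketch of a TCP-level probe, meant to be run from a pod in the cluster, might look like the following. The function names are hypothetical, and the assumption that the registry container itself listens on 50051 (rather than the service mapping to a different target port) is not confirmed by anything in this report.

```python
import socket
from typing import Dict, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, str]:
    """Attempt a TCP connect and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts and "no route to host".
        return False, str(exc)


# Addresses taken from the transcript above.
TARGETS = {
    "service ClusterIP": ("172.30.175.186", 50051),
    # Assumption: the pod listens on the same port as the service.
    "pod IP": ("10.129.2.8", 50051),
}


def probe_all(
    targets: Dict[str, Tuple[str, int]], timeout: float = 3.0
) -> Dict[str, Tuple[bool, str]]:
    """Probe every (host, port) pair and collect the results by label."""
    return {
        label: tcp_reachable(host, port, timeout)
        for label, (host, port) in targets.items()
    }
```

Calling `probe_all(TARGETS)` and comparing the two results would help narrow things down: if the pod IP accepts the connection but the ClusterIP does not, the problem is more likely in service routing than in the registry pod itself.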

