Bug 1706232
| Summary: | Recreating a CatalogSource and subscription for something from that catalog source results in 'stuck' subscription | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Paul Morie <pmorie> |
| Component: | OLM | Assignee: | Evan Cordell <ecordell> |
| OLM sub component: | OLM | QA Contact: | Cuiping HUO <chuo> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | chuo, eparis, jiazha |
| Version: | 4.1.0 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.1.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-06-04 10:48:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Paul Morie
2019-05-03 20:50:01 UTC
Paul,
Thanks for reporting this! I could NOT reproduce this issue with the version below:
OLM version: io.openshift.build.commit.id=b2d1cd21368bc8cc10e4ca11a231f09077630c33
Cluster version is 4.1.0-0.nightly-2019-05-06-011159
1, Create a new project called "debug" and install the "AMQ" operator in it.
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
amq-streams-cluster-operator-779f9ffbd4-dfz69 1/1 Running 0 8m28s
2, Delete the subscription and catalog source.
3, Recreate the subscription and catalog source.
4, Check the status of the subscription. It works well, as below:
mac:~ jianzhang$ oc get sub
NAME PACKAGE SOURCE CHANNEL
amq-streams amq-streams installed-redhat-debug stable
mac:~ jianzhang$ oc get sub amq-streams -o go-template='{{ .status.state }}'
AtLatestKnownmac:~ jianzhang$
Could you share the detailed steps to reproduce this issue? Thanks!
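For anyone re-running the check in step 4, here is a minimal sketch of a polling helper. It assumes `oc` is already logged in to the cluster; the function name `wait_for_sub_state` and its default attempt count and delay are illustrative and do not come from OLM or from this report. It also appends a newline to the go-template so the prompt does not run into the output as it does above.

```shell
# Sketch: poll a Subscription until .status.state reaches the wanted value.
# Assumption: `oc` is on PATH and logged in; name and defaults are illustrative.
wait_for_sub_state() {
  local name="$1" namespace="$2" want="${3:-AtLatestKnown}"
  local attempts="${4:-30}" delay="${5:-5}"
  local state="" try=0
  while [ "$try" -lt "$attempts" ]; do
    # The trailing {{ "\n" }} keeps the shell prompt from fusing with the output.
    state="$(oc get sub "$name" -n "$namespace" \
      -o go-template='{{ .status.state }}{{ "\n" }}' 2>/dev/null || true)"
    if [ "$state" = "$want" ]; then
      echo "$state"
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  echo "subscription $name did not reach '$want' (last state: '$state')" >&2
  return 1
}
```

With the names from the steps above, the call would be `wait_for_sub_state amq-streams debug`. A non-zero exit status would point to a subscription that never left an intermediate state.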
I was finally able to reproduce this by running the same test multiple times in a cluster - thanks for the report! This is fixed in this commit: https://github.com/operator-framework/operator-lifecycle-manager/pull/846/commits/8d9664a6e3ecbf5615a1e74911a6a87efb11e998 (may go in a different PR depending on how other PRs merge)
After this change, I can no longer reproduce the stuck subscription bug.

Proposed fix doesn't address the issue. This PR contains the fix for the issue: https://github.com/operator-framework/operator-lifecycle-manager/pull/847

*** Bug 1704940 has been marked as a duplicate of this bug. ***

Verification failed with the version below:
OLM version: io.openshift.build.commit.id=19e7914e33f723c6f77f7aaa0892c7684ce94ed4
Cluster version: 4.1.0-0.nightly-2019-05-09-182710
1, Install the "etcd" operator in project "default".
[chuo@dhcp-140-165 .kube]$ oc get sub
NAME PACKAGE SOURCE CHANNEL
couchbase-enterprise-certified couchbase-enterprise-certified installed-certified-default preview
etcd etcd installed-community-default singlenamespace-alpha
[chuo@dhcp-140-165 .kube]$ oc get catsrc
NAME NAME TYPE PUBLISHER AGE
installed-certified-default Certified Operators grpc Certified 38m
installed-community-default Community Operators grpc Community 46s
[chuo@dhcp-140-165 .kube]$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-cgsm4 couchbase-operator.v1.1.0 Automatic true
install-g7wmf etcdoperator.v0.9.4 Manual false
2, Manually approve the install plan (ip).
[chuo@dhcp-140-165 .kube]$ oc edit ip install-g7wmf
installplan.operators.coreos.com/install-g7wmf edited
[chuo@dhcp-140-165 .kube]$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-g7wmf etcdoperator.v0.9.4 Manual true
[chuo@dhcp-140-165 .kube]$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
etcdoperator.v0.9.4 etcd 0.9.4 etcdoperator.v0.9.2 Succeeded
3, Delete the subscription and catalog source.
[chuo@dhcp-140-165 .kube]$ oc delete catsrc installed-community-default
catalogsource.operators.coreos.com "installed-community-default" deleted
[chuo@dhcp-140-165 .kube]$ oc delete sub etcd
subscription.operators.coreos.com "etcd" deleted
[chuo@dhcp-140-165 .kube]$ oc get catsrc
NAME NAME TYPE PUBLISHER AGE
installed-certified-default Certified Operators grpc Certified 46m
[chuo@dhcp-140-165 .kube]$ oc get sub
NAME PACKAGE SOURCE CHANNEL
couchbase-enterprise-certified couchbase-enterprise-certified installed-certified-default preview
[chuo@dhcp-140-165 .kube]$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-cgsm4 couchbase-operator.v1.1.0 Automatic true
install-g7wmf etcdoperator.v0.9.4 Manual true
The subscription can be deleted from the back end, but the ip still exists. Meanwhile, in the web console under Installed Operators (for project "default"), the etcd operator still exists with status "InstallSucceeded".

Verification succeeded with the version below:
OLM version: io.openshift.build.commit.id=19e7914e33f723c6f77f7aaa0892c7684ce94ed4
Cluster version: 4.1.0-0.nightly-2019-05-09-182710
1, Install the "etcd" operator in project "test".
[chuo@dhcp-140-165 .kube]$ oc get sub
NAME PACKAGE SOURCE CHANNEL
etcd etcd installed-community-test singlenamespace-alpha
[chuo@dhcp-140-165 .kube]$ oc get catsrc
NAME NAME TYPE PUBLISHER AGE
installed-community-test Community Operators grpc Community 2m16s
[chuo@dhcp-140-165 .kube]$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-7wnvx etcdoperator.v0.9.4 Automatic true
2, Delete the subscription and catalog source.
3, Re-create the subscription and catalog source.
[chuo@dhcp-140-165 .kube]$ oc get sub
NAME PACKAGE SOURCE CHANNEL
etcd etcd installed-community-test singlenamespace-alpha
[chuo@dhcp-140-165 .kube]$ oc get catsrc
NAME NAME TYPE PUBLISHER AGE
installed-community-test Community Operators grpc Community 2m16s
[chuo@dhcp-140-165 .kube]$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-7wnvx etcdoperator.v0.9.4 Automatic true
4, Repeat steps 2 and 3 ten times; the subscription succeeded all 10 times.
5, Delete the catalog-operator pod and wait until the new pod is running.
[chuo@dhcp-140-165 .kube]$ oc delete po catalog-operator-569b689878-g8zzh -n openshift-operator-lifecycle-manager
pod "catalog-operator-569b689878-g8zzh" deleted
[chuo@dhcp-140-165 .kube]$ oc get po -n openshift-operator-lifecycle-manager
NAME READY STATUS RESTARTS AGE
catalog-operator-569b689878-p7c2f 0/1 Running 0 14s
6, Re-create the subscription and catalog source.
[chuo@dhcp-140-165 .kube]$ oc get sub
NAME PACKAGE SOURCE CHANNEL
etcd etcd installed-community-test singlenamespace-alpha
[chuo@dhcp-140-165 .kube]$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-znv6w etcdoperator.v0.9.4 Automatic true
[chuo@dhcp-140-165 .kube]$ oc get catsrc
NAME NAME TYPE PUBLISHER AGE
installed-community-test Community Operators grpc Community 2m33s
[chuo@dhcp-140-165 .kube]$ oc get sub etcd -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  creationTimestamp: "2019-05-13T08:34:10Z"
  generation: 1
  labels:
    csc-owner-name: installed-community-test
    csc-owner-namespace: openshift-marketplace
  name: etcd
  namespace: test
  resourceVersion: "149673"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/test/subscriptions/etcd
  uid: e1912e81-7559-11e9-8544-0aa12d6c2fce
spec:
  channel: singlenamespace-alpha
  installPlanApproval: Automatic
  name: etcd
  source: installed-community-test
  sourceNamespace: test
  startingCSV: etcdoperator.v0.9.4
status:
  currentCSV: etcdoperator.v0.9.4
  installPlanRef:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-znv6w
    namespace: test
    resourceVersion: "149643"
    uid: e2095587-7559-11e9-8bba-02c4299c1f3a
  installedCSV: etcdoperator.v0.9.4
  installplan:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-znv6w
    uuid: e2095587-7559-11e9-8bba-02c4299c1f3a
  lastUpdated: "2019-05-13T08:34:14Z"
  state: AtLatestKnown
[chuo@dhcp-140-165 .kube]$ oc get catsrc installed-community-test -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  creationTimestamp: "2019-05-13T08:34:02Z"
  generation: 1
  labels:
    csc-owner-name: installed-community-test
    csc-owner-namespace: openshift-marketplace
  name: installed-community-test
  namespace: test
  resourceVersion: "152087"
  selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/test/catalogsources/installed-community-test
  uid: dcdf4ca8-7559-11e9-b532-06d8365d7bd0
spec:
  address: 172.30.208.199:50051
  displayName: Community Operators
  icon:
    base64data: ""
    mediatype: ""
  publisher: Community
  sourceType: grpc
status:
  lastSync: "2019-05-13T08:39:09Z"
  registryService:
    createdAt: "2019-05-13T08:39:07Z"
    protocol: grpc
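
The delete and re-create loop from the verification steps can be scripted. The following is a minimal sketch only. It assumes a manifest file (hypothetical, supplied by the tester) that defines both the CatalogSource and the Subscription, and an `oc` session that is already logged in. The function names `sub_state` and `repeat_recreate` and their defaults are illustrative and are not part of OLM.

```shell
# Sketch: delete and re-create a Subscription and its CatalogSource several
# times, checking after each round that the Subscription reaches AtLatestKnown.
# Assumption: the manifest path points at a file the tester provides that
# contains both objects; it is not taken from this bug report.

# Print the current .status.state of a Subscription (empty on error).
sub_state() {
  oc get sub "$1" -n "$2" \
    -o go-template='{{ .status.state }}{{ "\n" }}' 2>/dev/null || true
}

repeat_recreate() {
  local namespace="$1" sub="$2" catsrc="$3" manifest="$4" rounds="${5:-10}"
  local attempts="${6:-30}" delay="${7:-5}"
  local round=1 try state
  while [ "$round" -le "$rounds" ]; do
    # Step 2: delete the subscription and catalog source.
    oc delete sub "$sub" -n "$namespace" --ignore-not-found
    oc delete catsrc "$catsrc" -n "$namespace" --ignore-not-found
    # Step 3: re-create both objects from the manifest.
    oc apply -f "$manifest" -n "$namespace"
    # Poll until the subscription settles or the attempts run out.
    try=0
    state=""
    while [ "$try" -lt "$attempts" ]; do
      state="$(sub_state "$sub" "$namespace")"
      if [ "$state" = "AtLatestKnown" ]; then
        break
      fi
      try=$((try + 1))
      sleep "$delay"
    done
    if [ "$state" != "AtLatestKnown" ]; then
      echo "round $round: subscription stuck in state '$state'" >&2
      return 1
    fi
    echo "round $round: ok"
    round=$((round + 1))
  done
}
```

With the objects shown above, a call might look like `repeat_recreate test etcd installed-community-test ./etcd-sub.yaml 10`, where `./etcd-sub.yaml` is a placeholder file name. Running `oc get ip -n test` afterwards would show whether any install plans were left behind, as happened in the failed verification.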
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0758