Bug 1714140
| Summary: | [4.2] The generated InstallPlan object didn't refer to its own Subscription object | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jian Zhang <jiazha> |
| Component: | OLM | Assignee: | Evan Cordell <ecordell> |
| OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | Priority: | medium |
| Version: | 4.1.z | CC: | bandrade, chezhang, chuo, dyan, jfan, jpeeler, scolange |
| Target Milestone: | --- | Target Release: | 4.2.0 |
| Hardware: | Unspecified | OS: | Unspecified |
| Whiteboard: | | Fixed In Version: | |
| Doc Type: | If docs needed, set a value | Doc Text: | |
| Story Points: | --- | Clone Of: | |
| : | 1740174 (view as bug list) | Environment: | |
| Last Closed: | 2019-10-16 06:29:21 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | Bug Depends On: | |
| Bug Blocks: | 1740174 | Attachments: | The whole catalog-operator logs (attachment 1573849) |
Description
Jian Zhang, 2019-05-27 08:40:43 UTC

Created attachment 1573849: The whole catalog-operator logs
Although it does not occur often, this blocks users from installing operators. Increasing the severity.

I believe I have seen this before, and I wrote down what I thought was a description of the bug:

- When you uninstall an operator and remove the Subscription, the InstallPlan for that Subscription has an ownerReference pointing back to the Subscription.
- If you create a new Subscription for the same operator before kube GCs the InstallPlan, we detect the install as farther along than it is (because the InstallPlan exists) and just sit, waiting for the operator to start up (but it never does, because we skipped that part).

The mitigation is to wait for the InstallPlans to be garbage collected before re-installing the operator (a CLI sketch of this check follows these comments). Because there is a workaround, I'm moving this down to medium severity.

Moved to 4.1.z so that we can work on a fix for this.

Evan, Jeff: I think the fix PR (https://github.com/operator-framework/operator-lifecycle-manager/pull/965) was only merged into the master branch, not release-4.1. Could you help cherry-pick it to the release-4.1 branch? Thanks!

Verification failed since there is no fix PR for the 4.1.z version.

The 4.1 version of this is in progress.

Targeting this to 4.2; will clone for 4.1.

Evan,
> Targeting this to 4.2, will clone for 4.1
OK, created bug 1740491 for the 4.1.z version.

LGTM. Verified; detailed steps are below:
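Before the verification details, here is a minimal sketch of the workaround check described in the comments above. It assumes the global operator namespace; the NS and OLD_SUB variables are illustrative, not taken from the bug.

```bash
# After deleting a Subscription, wait until no InstallPlan in the namespace still
# lists it in its ownerReferences before re-creating the Subscription.
NS=openshift-operators
OLD_SUB=anchore-engine   # example: name of the Subscription that was just deleted

# Print each InstallPlan together with the Subscription(s) that own it.
oc -n "$NS" get installplans \
  -o jsonpath='{range .items[*]}{.metadata.name}{" owned by: "}{.metadata.ownerReferences[*].name}{"\n"}{end}' \
  | grep -F "$OLD_SUB" \
  || echo "no stale InstallPlans for $OLD_SUB; safe to re-subscribe"
```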
Cluster version is 4.2.0-0.nightly-2019-08-13-183722
OLM version:
mac:~ jianzhang$ oc exec catalog-operator-5cc66dd5c4-7sw95 -- olm --version
OLM version: 0.11.0
git commit: 586e941bd1f42ea1f331453ed431fb43699fef70
1. Install the Couchbase, anchore-engine, and TiDB operators, selecting the "All Namespaces ..." option (a CLI alternative is sketched after the pod listing below).
mac:~ jianzhang$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-jlwm7 couchbase-operator.v1.1.0 Automatic true
install-s546g anchore-engine-operator.v0.0.2 Automatic true
install-zjd7j tidb-operator.1.0.0-beta1 Automatic true
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
anchore-engine-operator-5b55589d6f-829md 1/1 Running 0 10m
couchbase-operator-79b995b87d-t8qvh 1/1 Running 0 10m
tidb-controller-manager-7546d898df-6zzbk 1/1 Running 0 4m55s
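For reference, step 1 can also be done from the CLI by creating a Subscription in the global operator namespace instead of using the console. This is only a sketch: the channel and CatalogSource below are assumptions and should be confirmed against the catalog first (for example with `oc get packagemanifest anchore-engine -n openshift-marketplace -o yaml`).

```bash
# Sketch of a CLI install in "All Namespaces" mode; channel and source are assumed values.
cat <<'EOF' | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: anchore-engine
  namespace: openshift-operators        # global namespace = "All Namespaces" install mode
spec:
  channel: alpha                        # assumed channel; check the PackageManifest
  name: anchore-engine                  # package name
  source: certified-operators           # assumed CatalogSource
  sourceNamespace: openshift-marketplace
EOF
```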
2. Remove the Couchbase and anchore-engine operators.
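Step 2 above was done from the console; a rough CLI equivalent (operator names taken from this run) is to delete the Subscription and the CSV it installed, then, per the workaround noted in the comments, wait for the old InstallPlan to be garbage collected before re-subscribing.

```bash
# CLI sketch of removing the anchore-engine operator; the console does the equivalent.
NS=openshift-operators
oc -n "$NS" delete subscription anchore-engine
oc -n "$NS" delete clusterserviceversion anchore-engine-operator.v0.0.2

# Wait until the old InstallPlan disappears before re-creating the Subscription.
oc -n "$NS" get installplans
```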
3. Install the anchore-engine operator again, selecting the "All Namespaces ..." option.
mac:~ jianzhang$ oc get ip
NAME CSV SOURCE APPROVAL APPROVED
install-6dkxn anchore-engine-operator.v0.0.2 Automatic true
install-zjd7j tidb-operator.1.0.0-beta1 Automatic true
mac:~ jianzhang$ oc get pods
NAME READY STATUS RESTARTS AGE
anchore-engine-operator-5b55589d6f-cw5cr 1/1 Running 0 9m6s
tidb-controller-manager-7546d898df-6zzbk 1/1 Running 0 20m
Now the Anchore operator works well; the TiDB InstallPlan refers only to its own Subscription, and the other InstallPlans have been deleted.
mac:~ jianzhang$ oc get ip install-zjd7j -o yaml|grep "ownerReferences" -A 10
ownerReferences:
- apiVersion: operators.coreos.com/v1alpha1
blockOwnerDeletion: false
controller: false
kind: Subscription
name: tidb-operator-certified
uid: 9a80fb2d-be76-11e9-874b-022328418248
resourceVersion: "126830"
selfLink: /apis/operators.coreos.com/v1alpha1/namespaces/openshift-operators/installplans/install-zjd7j
uid: 9a8d7a9d-be76-11e9-874b-022328418248
spec:
mac:~ jianzhang$ oc get ip install-6dkxn -o yaml
apiVersion: operators.coreos.com/v1alpha1
kind: InstallPlan
metadata:
creationTimestamp: "2019-08-14T09:45:16Z"
generateName: install-
generation: 1
name: install-6dkxn
namespace: openshift-operators
ownerReferences:
- apiVersion: operators.coreos.com/v1alpha1
blockOwnerDeletion: false
controller: false
kind: Subscription
name: anchore-engine
uid: 388aabb4-be78-11e9-874b-022328418248
- apiVersion: operators.coreos.com/v1alpha1
blockOwnerDeletion: false
controller: false
kind: Subscription
name: tidb-operator-certified
uid: 9a80fb2d-be76-11e9-874b-022328418248
resourceVersion: "127334"
...
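As an alternative to the grep above, the same ownerReferences check can be expressed with jsonpath so that only the owning Subscription name(s) of each InstallPlan are printed (the InstallPlan names are the ones from this verification run).

```bash
# Print the Subscription(s) that own each InstallPlan.
oc -n openshift-operators get installplan install-zjd7j \
  -o jsonpath='{.metadata.ownerReferences[*].name}{"\n"}'
oc -n openshift-operators get installplan install-6dkxn \
  -o jsonpath='{.metadata.ownerReferences[*].name}{"\n"}'
```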
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922