Bug 1925113

Summary: Camel K Operator CSV stays stuck in Pending status
Product: OpenShift Container Platform Reporter: Lukas Lowinger <llowinge>
Component: OLM Assignee: Evan Cordell <ecordell>
OLM sub component: OLM QA Contact: Jian Zhang <jiazha>
Status: CLOSED DUPLICATE Docs Contact:
Severity: medium    
Priority: medium CC: astefanu, krizza, llowinge, maschmid
Version: 4.6.z Keywords: Triaged
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-25 04:53:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                        Flags
All pod logs from olm namespace    none
CSV file                           none
Install plan                       none
Subscription                       none
crd-all-v1.txt                     none
crd-all-v2.txt                     none

Description Lukas Lowinger 2021-02-04 12:32:15 UTC
Created attachment 1755051 [details]
All pod logs from olm namespace

Description of problem:

We hit an intermittent issue while installing the Camel K Operator via OLM. I've attached all the necessary logs, and I can provide more if needed. I can reproduce it easily on our cluster.

Version-Release number of selected component (if applicable):
OCP 4.6.12

How reproducible:

In a loop, we trigger an OLM installation of the Camel K Operator; after a few (usually 1-2) successful attempts, the installation stops and stays stuck. Note: I tried this on a fresh CRC and it worked there.

Steps to Reproduce:

1. for i in $(seq 20 1); do namespace="camel-k-installation-test$i"; oc new-project $namespace; sleep $i; /Users/llowinge/Redhat/camel-k/kamel install -w -n $namespace --olm-source=camel-k-upstream-source --olm-channel=alpha; if ! timeout 5m bash -c -- "until oc get -n $namespace pods | grep -qi running; do sleep 5; done"; then echo "Timeout waiting for camel-k-operator to be in Running phase"; exit 1;fi; oc delete project $namespace; done
2. After a while, check that the CSV is stuck in the Pending phase

Actual results:

oc get csv
NAME                          DISPLAY                        VERSION   REPLACES                      PHASE
camel-k.v1.4.0                Camel K Operator               1.4.0                                   Pending

Expected results:

The CSV does not stay in the Pending phase.
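
For anyone scripting the check instead of running `oc get csv` by hand, a rough sketch of polling the CSV phase is below. It assumes the phase is read from `.status.phase` of the ClusterServiceVersion and that `Succeeded` is the phase to wait for. The function names `csv_phase` and `wait_for_csv_succeeded` are made up for this example and are not part of `oc` or `kamel`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and argument order are hypothetical.

# Print the current phase of a CSV (for example Pending or Succeeded).
# Usage: csv_phase <namespace> <csv-name>
csv_phase() {
  oc get csv "$2" -n "$1" -o jsonpath='{.status.phase}'
}

# Poll until the CSV reports Succeeded or the timeout expires.
# Usage: wait_for_csv_succeeded <namespace> <csv-name> <timeout_s> [interval_s]
# interval_s must be greater than 0 whenever timeout_s is greater than 0.
wait_for_csv_succeeded() {
  local namespace="$1" csv="$2" timeout_s="$3" interval_s="${4:-5}"
  local waited=0 phase=""
  while :; do
    # A failing oc call (CSV not created yet) is treated as "no phase yet".
    phase="$(csv_phase "$namespace" "$csv" 2>/dev/null || true)"
    if [ "$phase" = "Succeeded" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout_s" ]; then
      break
    fi
    sleep "$interval_s"
    waited=$(( waited + interval_s ))
  done
  echo "CSV $csv in $namespace is still in phase '${phase:-unknown}' after ${timeout_s}s" >&2
  return 1
}
```

Something like `wait_for_csv_succeeded "$namespace" camel-k.v1.4.0 300` could then sit in the loop next to the existing pod check, so a CSV stuck in Pending shows up as a non-zero exit code.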

Additional info:

Comment 1 Lukas Lowinger 2021-02-04 12:35:05 UTC
Created attachment 1755055 [details]
CSV file

Comment 2 Lukas Lowinger 2021-02-04 12:35:36 UTC
Created attachment 1755056 [details]
Install plan

Comment 3 Lukas Lowinger 2021-02-04 12:36:12 UTC
Created attachment 1755058 [details]
Subscription

Comment 4 Evan Cordell 2021-02-04 19:27:14 UTC
It looks like the immediate issue is that the CRD failed to install. In this case the error looks transient, so attempting to install again would work. Right now, the InstallPlan does not retry except in very specific circumstances. I think the fix for this is to notice that this is a transient issue and automatically retry the InstallPlan.
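
To illustrate the retry idea in general terms, here is a minimal bash sketch of a bounded retry with a fixed delay. This is not how OLM handles InstallPlans; it only shows the pattern of re-running a step that failed for a transient reason. The function name `retry` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: a generic bounded retry, not OLM's InstallPlan logic.

# Run a command until it succeeds or the attempt limit is reached.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, retrying in ${delay}s: $*" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `retry 5 10 some_command` runs `some_command` up to five times, 10 seconds apart, and returns non-zero only if every attempt fails. A real fix would also need to distinguish transient errors from permanent ones, which this sketch does not attempt.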

Comment 6 Kevin Rizza 2021-02-04 20:07:23 UTC
Moving this bz to medium: it is intermittent, does not appear to have an immediate impact on production users, and has a fairly straightforward workaround. Agreed with Evan; it appears that retry logic should be able to resolve this bug.

Comment 7 Lukas Lowinger 2021-02-08 14:26:58 UTC
After I deleted all Camel K CRDs (and restarted all OLM pods) and ran my test script, it worked. I then removed all CRDs again, installed the older CRDs first, and tried the installation: OK. After that, I ran the script with the new CRDs installed on top of the old ones, and the problem occurred on the 4th attempt. Attaching both sets of CRD YAMLs.

Comment 8 Lukas Lowinger 2021-02-08 14:29:03 UTC
Created attachment 1755731 [details]
crd-all-v1.txt

Comment 9 Lukas Lowinger 2021-02-08 14:29:27 UTC
Created attachment 1755732 [details]
crd-all-v2.txt

Comment 10 Ben Luddy 2021-02-25 04:53:53 UTC
Based on the attached InstallPlan and CSV statuses, this appears to be an instance of the issue tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1923111. Since this issue is newer, I'm marking it as a duplicate of the older issue so that both reports can be addressed together.

*** This bug has been marked as a duplicate of bug 1923111 ***