Created attachment 1755051 [details] All pod logs from olm namespace Created attachment 1755051 [details] All pod logs from olm namespace Description of problem: We sometimes hit very intermittent issue while installing Camel K Operator via OLM. I've attached all necessary logs, but i can provide more if needed. I can reproduce on our cluster easily. Version-Release number of selected component (if applicable): OCP 4.6.12 How reproducible: In the loop we trigger olm installation of Camel K Operator and after few (usually 1-2) success attempts it stops and stays stuck. Note i've tried in fresh CRC and it was working. Steps to Reproduce: 1. for i in $(seq 20 1); do namespace="camel-k-installation-test$i"; oc new-project $namespace; sleep $i; /Users/llowinge/Redhat/camel-k/kamel install -w -n $namespace --olm-source=camel-k-upstream-source --olm-channel=alpha; if ! timeout 5m bash -c -- "until oc get -n $namespace pods | grep -qi running; do sleep 5; done"; then echo "Timeout waiting for camel-k-operator to be in Running phase"; exit 1;fi; oc delete project $namespace; done 2. Check csv in Pending state after a while 3. Actual results: oc get csv NAME DISPLAY VERSION REPLACES PHASE camel-k.v1.4.0 Camel K Operator 1.4.0 Pending Expected results: CSV not in Pending status. Additional info:
Created attachment 1755055 [details] CSV file
Created attachment 1755056 [details] Install plan
Created attachment 1755058 [details] Subscription
It looks like the immediate issue is that the CRD failed to install. In this case the error looks transient, so attempting to install again would work. Right now, the installplan does not retry except in very specific circumstances - I think the fix for this is to notice that this is a transient issue and the installplan should be automatically retried.
Moving this bz to medium as it is intermittent and does not look like it has an immediate impact to production users along with a fairly straightforward workaround. Agreed with Evan, this appears that retry logic should be able to resolve this bug.
When i've deleted all Camel K CRDs (+ restarting all olm pods) and run my test script, it was working. I've then tried to remove all CRDs again, first install older CRDs and tried installation - OK. After that i've run script with installation of new CRDs on top of old ones and the problem occurred in 4th attempt. Adding both CRDs yamls.
Created attachment 1755731 [details] crd-all-v1.txt
Created attachment 1755732 [details] crd-all-v2.txt
Based on the attached InstallPlan and CSV statuses, this appears to be an instance of the issue tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1923111. Since this issue is newer, I'm marking it as a duplicate of the older issue so that both reports can be addressed together. *** This bug has been marked as a duplicate of bug 1923111 ***