Bug 1991637 - OpenShift Pipelines upgrade from 1.4 to 1.5 fails
Summary: OpenShift Pipelines upgrade from 1.4 to 1.5 fails
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Pipelines
Classification: Red Hat
Component: Operator
Version: 1.4
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Nikhil Thomas
QA Contact: Nobody
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-09 14:59 UTC by Novonil Choudhuri
Modified: 2024-10-01 19:10 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-25 09:26:57 UTC
Target Upstream Version:
Embargoed:



Description Novonil Choudhuri 2021-08-09 14:59:35 UTC
Description of problem: OpenShift Pipelines upgrade from 1.4 to 1.5 fails 


Version-Release number of selected component (if applicable): OpenShift Pipelines 1.4


How reproducible: Customer environment 


Steps to Reproduce:
1. Upgrade OpenShift 4.7 to 4.8
2. Upgrade OpenShift Pipelines 1.4 to 1.5

Actual results: 

OCP 4.8.3, Pipelines 1.5 upgraded from 1.4


The operator deployment goes into CrashLoopBackOff (CLBO):
 state:
      waiting:
        message: back-off 5m0s restarting failed container=openshift-pipelines-operator
          pod=openshift-pipelines-operator-7dbc64877c-xppbv_openshift-operators(7ffaab97-0a36-4bb2-bf54-3f5d1616a17d)
        reason: CrashLoopBackOff


NAMESPACE            NAME                                               READY  STATUS   RESTARTS  AGE  IP           NODE
openshift-operators  openshift-pipelines-operator-7dbc64877c-xppbv      0/1    Running  257       23h  10.128.6.21  worker-3.sb19.caasdev.ford.com


Noted errors in logs:
The "Fetch GitHub commit ID" failures are only info-level messages, though.

openshift-operators/pods/knative-openshift-ingress-68999845bf-bk7f2/knative-openshift-ingress/knative-openshift-ingress/logs/current.log
2021-08-07T18:59:19.567151834Z {"level":"info","ts":"2021-08-07T18:59:19.567Z","caller":"logging/config.go:79","msg":"Fetch GitHub commit ID from kodata failed","error":"\"KO_DATA_PATH\" does not exist or is empty"}


openshift-operators/pods/openshift-pipelines-operator-7dbc64877c-xppbv/openshift-pipelines-operator/openshift-pipelines-operator/logs/current.log
~~~
2021-08-08T18:51:09.323608708Z {"level":"info","caller":"logging/config.go:116","msg":"Successfully created the logger."}
2021-08-08T18:51:09.323608708Z {"level":"info","caller":"logging/config.go:117","msg":"Logging level set to: debug"}
2021-08-08T18:51:09.323664919Z {"level":"info","caller":"logging/config.go:79","msg":"Fetch GitHub commit ID from kodata failed","error":"open /kodata/HEAD: no such file or directory"}   <--- GitHub commit ID fetch failed
2021-08-08T18:51:09.323697312Z {"level":"info","logger":"tekton-operator","caller":"profiling/server.go:64","msg":"Profiling enabled: false","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:09.327798972Z {"level":"info","logger":"tekton-operator","caller":"leaderelection/context.go:46","msg":"Running with Standard leader election","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:10.378036614Z I0808 18:51:10.377988       1 request.go:645] Throttling request took 1.045580083s, request: GET:https://172.30.0.1:443/apis/storage.k8s.io/v1?timeout=32s
2021-08-08T18:51:13.080424610Z {"level":"info","logger":"tekton-operator.manifestival","caller":"manifestival/manifestival.go:72","msg":"Parsing manifest","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:16.833261445Z {"level":"info","logger":"tekton-operator.manifestival","caller":"manifestival/manifestival.go:72","msg":"Parsing manifest","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:16.833333748Z {"level":"debug","logger":"tekton-operator","caller":"tektonpipeline/controller.go:132","msg":"Creating event broadcaster","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:16.833380259Z {"level":"info","logger":"tekton-operator","caller":"tektonpipeline/controller.go:73","msg":"Setting up event handlers","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:20.382039207Z I0808 18:51:20.381988       1 request.go:645] Throttling request took 3.542814064s, request: GET:https://172.30.0.1:443/apis/snapshot.storage.k8s.io/v1beta1?timeout=32s
2021-08-08T18:51:20.596979479Z {"level":"info","logger":"tekton-operator.manifestival","caller":"manifestival/manifestival.go:72","msg":"Parsing manifest","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:20.597037236Z {"level":"debug","logger":"tekton-operator","caller":"tektontrigger/controller.go:132","msg":"Creating event broadcaster","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:20.597079632Z {"level":"info","logger":"tekton-operator","caller":"tektontrigger/controller.go:76","msg":"Setting up event handlers","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:24.351280999Z {"level":"info","logger":"tekton-operator.manifestival","caller":"manifestival/manifestival.go:72","msg":"Parsing manifest","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:24.351342859Z {"level":"debug","logger":"tekton-operator","caller":"tektonaddon/controller.go:132","msg":"Creating event broadcaster","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:24.351412413Z {"level":"info","logger":"tekton-operator","caller":"tektonaddon/controller.go:79","msg":"Setting up event handlers","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:28.105510391Z {"level":"info","logger":"tekton-operator.manifestival","caller":"manifestival/manifestival.go:72","msg":"Parsing manifest","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:30.404027720Z I0808 18:51:30.403980       1 request.go:645] Throttling request took 2.294941969s, request: GET:https://172.30.0.1:443/apis/snapshot.storage.k8s.io/v1?timeout=32s
2021-08-08T18:51:31.856509371Z {"level":"info","logger":"tekton-operator.manifestival","caller":"manifestival/manifestival.go:72","msg":"Parsing manifest","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:31.856560483Z {"level":"debug","logger":"tekton-operator","caller":"tektonconfig/controller.go:132","msg":"Creating event broadcaster","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:31.856593505Z {"level":"info","logger":"tekton-operator","caller":"tektonconfig/controller.go:74","msg":"Setting up event handlers","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:31.856602184Z {"level":"debug","logger":"tekton-operator","caller":"tektonconfig/instance.go:71","msg":"ensuring tektonconfig instance","knative.dev/pod":"openshift-pipelines-operator-7dbc64877c-xppbv"}
2021-08-08T18:51:31.873530540Z panic: The environment variable "METRICS_DOMAIN" is not set                                                                                                                                           <--- Panic 
2021-08-08T18:51:31.873530540Z 
2021-08-08T18:51:31.873530540Z If this is a process running on Kubernetes, then it should be specifying
2021-08-08T18:51:31.873530540Z this via:
2021-08-08T18:51:31.873530540Z 
2021-08-08T18:51:31.873530540Z   env:
2021-08-08T18:51:31.873530540Z   - name: METRICS_DOMAIN
2021-08-08T18:51:31.873530540Z     value: knative.dev/some-repository
2021-08-08T18:51:31.873530540Z 
2021-08-08T18:51:31.873530540Z If this is a Go unit test consuming metric.Domain() then it should add the
2021-08-08T18:51:31.873530540Z following import:
2021-08-08T18:51:31.873530540Z 
2021-08-08T18:51:31.873530540Z import (
2021-08-08T18:51:31.873530540Z  _ "knative.dev/pkg/metrics/testing"
2021-08-08T18:51:31.873530540Z )
2021-08-08T18:51:31.873530540Z 
2021-08-08T18:51:31.873530540Z goroutine 1 [running]:
2021-08-08T18:51:31.873530540Z knative.dev/pkg/metrics.Domain(0xc000d31848, 0x18b80cd)
2021-08-08T18:51:31.873530540Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/vendor/knative.dev/pkg/metrics/config.go:291 +0xfa
2021-08-08T18:51:31.873530540Z knative.dev/pkg/metrics.ConfigMapWatcher(0x2158ec0, 0xc000869b00, 0x1d1ec2d, 0xf, 0xc000f2df00, 0xc00000e9b8, 0x0)
2021-08-08T18:51:31.873568802Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/vendor/knative.dev/pkg/metrics/exporter.go:103 +0x26
2021-08-08T18:51:31.873568802Z knative.dev/pkg/injection/sharedmain.WatchObservabilityConfigOrDie(0x2158ec0, 0xc000869b00, 0xc0001816e0, 0xc000319520, 0xc00000e9b8, 0x1d1ec2d, 0xf)
2021-08-08T18:51:31.873568802Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/vendor/knative.dev/pkg/injection/sharedmain/main.go:328 +0x499
2021-08-08T18:51:31.873576797Z knative.dev/pkg/injection/sharedmain.MainWithConfig(0x2159a40, 0xc0002f00b0, 0x1d1ec2d, 0xf, 0xc000176240, 0xc000867f58, 0x4, 0x4)
2021-08-08T18:51:31.873589085Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/vendor/knative.dev/pkg/injection/sharedmain/main.go:201 +0x6e8
2021-08-08T18:51:31.873595214Z knative.dev/pkg/injection/sharedmain.MainWithContext(0x2159a40, 0xc0002f00b0, 0x1d1ec2d, 0xf, 0xc00083ff58, 0x4, 0x4)
2021-08-08T18:51:31.873616278Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/vendor/knative.dev/pkg/injection/sharedmain/main.go:142 +0xd5
2021-08-08T18:51:31.873616278Z knative.dev/pkg/injection/sharedmain.Main(0x1d1ec2d, 0xf, 0xc00083ff58, 0x4, 0x4)
2021-08-08T18:51:31.873629071Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/vendor/knative.dev/pkg/injection/sharedmain/main.go:116 +0x9c
2021-08-08T18:51:31.873635037Z main.main()
2021-08-08T18:51:31.873635037Z  /opt/app-root/src/go/src/github.com/tektoncd/operator/cmd/openshift/operator/main.go:28 +0x93
~~~
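
For readers unfamiliar with the failure mode in the trace above: the process looks up METRICS_DOMAIN at startup and aborts when the value is empty. The following is a simplified sketch of that pattern, not the actual knative.dev/pkg source. The function name `metricsDomain` and the injected `getenv` parameter are assumptions made for illustration, and the sketch returns an error where the operator in the log panics.

```go
package main

import (
	"fmt"
	"os"
)

// metricsDomainEnv is the variable named in the panic message in the log.
const metricsDomainEnv = "METRICS_DOMAIN"

// metricsDomain is a hypothetical stand-in for the lookup that failed.
// The getenv function is passed in so the check can be exercised without
// changing the real process environment.
func metricsDomain(getenv func(string) string) (string, error) {
	if d := getenv(metricsDomainEnv); d != "" {
		return d, nil
	}
	return "", fmt.Errorf("the environment variable %q is not set", metricsDomainEnv)
}

func main() {
	d, err := metricsDomain(os.Getenv)
	if err != nil {
		// The operator in the log panics here; this sketch only reports and exits.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("metrics domain:", d)
}
```

Because the check runs on every container start, a missing variable on the operator container would produce the restart loop reported above.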

Found bug https://bugzilla.redhat.com/show_bug.cgi?id=1989677, but I am not sure whether it is related.

The deployment does have the following set:
 env:
   - name: METRICS_DOMAIN
     value: tekton.dev/triggers


Some deployments contain values for NO_PROXY and some do not; I do not know whether that matters.


Expected results: Pipelines should be upgraded successfully. 


Additional info:

Comment 3 Vincent Demeester 2021-08-09 15:46:23 UTC
I think the workaround here (before we do a bugfix release) is to add the following to the `openshift-pipelines-operator` deployment in the `openshift-operators` namespace:

```
env:
  - name: METRICS_DOMAIN
    value: tekton.dev/operator

```
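
One possible way to apply the snippet above without editing the deployment by hand is to generate a strategic merge patch body. This is a minimal sketch: the helper name `deploymentEnvPatch` is hypothetical, and the container name is assumed to be `openshift-pipelines-operator`, taken from the `container=` field in the CrashLoopBackOff message in the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envVar and container mirror only the fields needed for this patch.
type envVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type container struct {
	Name string   `json:"name"`
	Env  []envVar `json:"env"`
}

// deploymentEnvPatch (hypothetical helper) builds a strategic merge patch
// body that sets a single environment variable on the named container.
func deploymentEnvPatch(containerName, key, value string) ([]byte, error) {
	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"template": map[string]interface{}{
				"spec": map[string]interface{}{
					"containers": []container{
						{Name: containerName, Env: []envVar{{Name: key, Value: value}}},
					},
				},
			},
		},
	}
	return json.Marshal(patch)
}

func main() {
	body, err := deploymentEnvPatch("openshift-pipelines-operator", "METRICS_DOMAIN", "tekton.dev/operator")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The printed body could then be passed to `oc patch deployment openshift-pipelines-operator -n openshift-operators --type strategic -p '<body>'`.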

Comment 4 Vincent Demeester 2021-08-09 15:50:31 UTC
Based on https://github.com/operator-framework/operator-lifecycle-manager/blob/master/doc/design/subscription-config.md#configuring-operators-deployed-by-olm I think the "cleanest" workaround here would be to edit the OpenShift Pipelines operator Subscription and add the following under `spec`:

```
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  # […]
  name: openshift-pipeline-operator
  namespace: openshift-operators
  # […]
spec:
  # […]
  config:
    env:
      - name: METRICS_DOMAIN
        value: tekton.dev/operator
  name: openshift-pipelines-operator-rh
  # […]
```
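
The same change can be expressed as a JSON merge patch against the Subscription. This is a rough sketch under stated assumptions: the type and function names are made up for illustration, and the Subscription name is deliberately left as a placeholder in the usage note because it may differ per cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types mirror only the spec.config.env path shown in the YAML above.
type envVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type subscriptionConfig struct {
	Env []envVar `json:"env"`
}

type subscriptionSpec struct {
	Config subscriptionConfig `json:"config"`
}

type subscriptionPatch struct {
	Spec subscriptionSpec `json:"spec"`
}

// subscriptionEnvPatch (hypothetical helper) builds a merge patch body
// that sets spec.config.env to a single variable.
func subscriptionEnvPatch(key, value string) ([]byte, error) {
	p := subscriptionPatch{
		Spec: subscriptionSpec{
			Config: subscriptionConfig{Env: []envVar{{Name: key, Value: value}}},
		},
	}
	return json.Marshal(p)
}

func main() {
	body, err := subscriptionEnvPatch("METRICS_DOMAIN", "tekton.dev/operator")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The body could be applied with `oc patch subscription <subscription-name> -n openshift-operators --type merge -p '<body>'`. Note that a merge patch replaces the whole `env` list, so any variables already set under `spec.config.env` would need to be included in the body as well.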

Comment 8 Vincent Demeester 2021-11-25 09:26:57 UTC
I forgot to close this issue. This should have been fixed in 1.5.x.

