Created attachment 1844118
CVO log file

Description of problem:
When using a dummy Cincinnati graph that includes invalid conditional edges, CVO keeps restarting and eventually goes into CrashLoopBackOff.

The Cincinnati graph is available online: https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy-conditional-edge-invalid.json

Snipped CVO log:
W1130 07:10:17.721748 1 cincinnati.go:220] Conditional update to 4.10.0-0.nightly-2021-11-26-195620, risk "TypeNull", has empty pruned matchingRules; dropping this target to avoid rejections when pushing to the Kubernetes API server. Pruning results: Skipping unrecognized cluster condition type ""
I1130 07:10:17.721793 1 cvo.go:582] Finished syncing available updates "openshift-cluster-version/version" (136.217034ms)
E1130 07:10:17.721916 1 runtime.go:78] Observed a panic: runtime.boundsError{x:0, y:0, signed:true, code:0x0} (runtime error: index out of range [0] with length 0)
goroutine 177 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x196f5a0, 0xc000eee2d0})
	/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000af1680})
	/go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x196f5a0, 0xc000eee2d0})
	/usr/lib/golang/src/runtime/panic.go:1038 +0x215
github.com/openshift/cluster-version-operator/pkg/cincinnati.Client.GetUpdates({{0xa2, 0x74, 0x1e, 0xaf, 0x95, 0xa0, 0x44, 0x40, 0x97, 0x24, ...}, ...}, ...)
	/go/src/github.com/openshift/cluster-version-operator/pkg/cincinnati/cincinnati.go:218 +0x2f14
github.com/openshift/cluster-version-operator/pkg/cvo.calculateAvailableUpdatesStatus({0x1ce3a10, 0xc000ecf0c0}, {0xc001404120, 0x24}, 0xc00017a140, {0xc001270000, 0x69}, {0x1a653f5, 0x5}, {0xc0025004b3, ...}, ...)
	/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/availableupdates.go:226 +0x9c5
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).syncAvailableUpdates(0xc00060a240, {0x1ce3a10, 0xc000ecf0c0}, 0xc001ec6000)
	/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/availableupdates.go:53 +0x353
github.com/openshift/cluster-version-operator/pkg/cvo.(*Operator).availableUpdatesSync(0xc00060a240, {0x1ce3a10, 0xc000ecf0c0}, {0xc00040ab70, 0x21})
	/go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:592 +0x3dc
github.com/openshift/cluster-version-operator/pkg/cvo.processNextWorkItem({0x1ce3a10, 0xc000ecf0c0}, {0x1d15a88, 0xc0002f6940}, 0xc0016e7d68, 0x8)
<snipped>

Version-Release number of the following components:
4.10.0-0.nightly-2021-11-26-145635

How reproducible:
100%

Steps to Reproduce:
1. Prepare a dummy Cincinnati graph that contains an invalid conditional edge, such as https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy-conditional-edge-invalid.json
2. Patch the cluster to use the graph:
# oc patch clusterversion/version --patch '{"spec":{"upstream":"https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy-conditional-edge-invalid.json"}}' --type=merge

Actual results:
CVO crashes.
# oc get all -n openshift-cluster-version
NAME                                            READY   STATUS             RESTARTS       AGE
pod/cluster-version-operator-7b58db6899-sjsnh   0/1     CrashLoopBackOff   32 (90s ago)   28h

NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cluster-version-operator   ClusterIP   172.30.16.143   <none>        9099/TCP   28h

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cluster-version-operator   0/1     1            0           28h

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cluster-version-operator-7b58db6899   1         1         0       28h

Expected results:
CVO detects the invalid configuration and reports an error instead of crashing.
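The log shows the "dropping this target" warning followed directly by an index-out-of-range panic inside GetUpdates. The sketch below illustrates one way that combination can occur. It assumes, without confirmation from this report, that a target is removed from the slice while an inner loop over its risks keeps running. All types and function names here (risk, conditionalUpdate, pruneRules, dropUnsafe) are hypothetical simplifications, not the actual CVO code, and the recognized rule types "Always" and "PromQL" are likewise an assumption.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the CVO types; not the real API.
type risk struct {
	name  string
	rules []string // rule types; "" stands for an unrecognized type
}

type conditionalUpdate struct {
	target string
	risks  []risk
}

// pruneRules keeps only the rule types this sketch treats as recognized.
func pruneRules(rules []string) []string {
	var kept []string
	for _, r := range rules {
		if r == "Always" || r == "PromQL" {
			kept = append(kept, r)
		}
	}
	return kept
}

// dropUnsafe removes a target as soon as one of its risks has no valid
// rules, but then keeps looping over the risks of the element it just
// removed. The next access to updates[i] can fall outside the shrunken
// slice. The panic is recovered and returned as an error for illustration.
func dropUnsafe(updates []conditionalUpdate) (out []conditionalUpdate, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	for i := len(updates) - 1; i >= 0; i-- {
		for j, rk := range updates[i].risks {
			updates[i].risks[j].rules = pruneRules(rk.rules)
			if len(updates[i].risks[j].rules) == 0 {
				// Missing break: the inner loop continues after removal.
				updates = append(updates[:i], updates[i+1:]...)
			}
		}
	}
	return updates, nil
}

func main() {
	updates := []conditionalUpdate{{
		target: "example-target",
		risks: []risk{
			{name: "TypeNull", rules: []string{""}},
			{name: "Second", rules: []string{""}},
		},
	}}
	_, err := dropUnsafe(updates)
	fmt.Println("error:", err)
}
```

In this sketch a target with a single invalid risk is removed without trouble; the failure needs a second risk to be visited after the removal, which is why such a bug can go unnoticed with simpler graphs.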
Verifying with 4.10.0-0.nightly-2021-12-01-164437

1. Install a cluster with 4.10.0-0.nightly-2021-12-01-164437
# oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-12-01-164437   True        False         86m     Cluster version is 4.10.0-0.nightly-2021-12-01-164437

2. Patch to use the dummy Cincinnati graph
# oc adm upgrade --include-not-recommended
Cluster version is 4.10.0-0.nightly-2021-12-01-164437

Upstream: https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy-conditional-edge-invalid.json
Channel: stable-4.10
No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and may result in downtime or data loss.

No updates which are not recommended based on your cluster configuration are available.

So, CVO drops all the invalid conditional edges.

3. Check CVO
# oc get all -n openshift-cluster-version
NAME                                            READY   STATUS    RESTARTS   AGE
pod/cluster-version-operator-6f5d9777dc-j7vvj   1/1     Running   0          109m

NAME                               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/cluster-version-operator   ClusterIP   172.30.137.0   <none>        9099/TCP   109m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cluster-version-operator   1/1     1            1           110m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cluster-version-operator-6f5d9777dc   1         1         1       109m

CVO is running well. Moving it to verified state.
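The verified behaviour, where invalid conditional edges are dropped and the operator keeps running, can be sketched as a filter that builds a new slice instead of deleting from the one being iterated. This is an illustration only: the types and names (risk, conditionalUpdate, pruneRules, dropInvalid) are hypothetical, the recognized rule types "Always" and "PromQL" are an assumption, and nothing here is taken from the actual fix.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the CVO types; not the real API.
type risk struct {
	name  string
	rules []string // rule types; "" stands for an unrecognized type
}

type conditionalUpdate struct {
	target string
	risks  []risk
}

// pruneRules keeps only the rule types this sketch treats as recognized.
func pruneRules(rules []string) []string {
	var kept []string
	for _, r := range rules {
		if r == "Always" || r == "PromQL" {
			kept = append(kept, r)
		}
	}
	return kept
}

// dropInvalid returns a new slice holding only the targets whose every risk
// still has at least one recognized rule after pruning. The input slice is
// never shortened, so no index can go out of range.
func dropInvalid(updates []conditionalUpdate) []conditionalUpdate {
	kept := make([]conditionalUpdate, 0, len(updates))
	for _, u := range updates {
		pruned := make([]risk, 0, len(u.risks))
		valid := true
		for _, rk := range u.risks {
			rules := pruneRules(rk.rules)
			if len(rules) == 0 {
				valid = false
				break // one unusable risk is enough to drop the target
			}
			pruned = append(pruned, risk{name: rk.name, rules: rules})
		}
		if valid {
			kept = append(kept, conditionalUpdate{target: u.target, risks: pruned})
		}
	}
	return kept
}

func main() {
	updates := []conditionalUpdate{
		{target: "good", risks: []risk{{name: "A", rules: []string{"Always"}}}},
		{target: "bad", risks: []risk{
			{name: "TypeNull", rules: []string{""}},
			{name: "B", rules: []string{"PromQL"}},
		}},
	}
	for _, u := range dropInvalid(updates) {
		fmt.Println("kept target:", u.target)
	}
}
```

Building a fresh result slice costs one extra allocation but keeps the iteration and the removal logic independent, so an empty result (as in the "No updates available" output above) is just the normal case of nothing being kept.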
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056