
Bug 1783221

Summary: CVO got panic when downgrading to 4.2.10
Product: OpenShift Container Platform
Reporter: Gaoyun Pei <gpei>
Component: Cluster Version Operator
Assignee: W. Trevor King <wking>
Status: CLOSED ERRATA
QA Contact: Gaoyun Pei <gpei>
Severity: medium
Priority: medium
Version: 4.2.z
CC: aos-bugs, ccoleman, jokerman, padillon, wking
Target Release: 4.4.0
Hardware: Unspecified
OS: Unspecified
Clone Of:
: 1798049 (view as bug list) Environment:
Last Closed: 2020-05-13 21:55:11 UTC
Type: Bug
Bug Blocks: 1798049

Comment 2 Clayton Coleman 2019-12-16 15:07:54 UTC
This must be fixed for 4.3 GA; we cannot ship with downgrades not working.

Comment 4 W. Trevor King 2019-12-17 00:06:01 UTC
$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2019-12-12-155629 | grep version
  cluster-version-operator                      https://github.com/openshift/cluster-version-operator                      da28418b76e0a4c2f2946a914ac2c649dbaf1dc5

So the stack trace hits [1] and [2].  I suspect the "index out of range" comes from this line in [2]:

  existingCurr = &(*existing)[i]

because we don't re-enter the loop over existing [3] after we drop an entry [4], so i can end up pointing past the end of the shortened slice.  This might be a common pattern among our resourcemerge implementations.

[1]: https://github.com/openshift/cluster-version-operator/blob/e5e468961b5fd687f65844d511690d7ed0046447/pkg/payload/task_graph.go#L591
[2]: https://github.com/openshift/cluster-version-operator/blob/e5e468961b5fd687f65844d511690d7ed0046447/lib/resourcemerge/core.go#L69
[3]: https://github.com/openshift/cluster-version-operator/blob/e5e468961b5fd687f65844d511690d7ed0046447/lib/resourcemerge/core.go#L65
[4]: https://github.com/openshift/cluster-version-operator/blob/e5e468961b5fd687f65844d511690d7ed0046447/lib/resourcemerge/core.go#L76
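To illustrate the suspected failure mode, here is a minimal Go sketch.  It is not the code from core.go: the item type and the pruneUnsafe and pruneSafe functions are hypothetical names made up for this example, and the real fix may look different.  It only shows how ranging over a slice while shortening it can leave the loop index past the end, and one way (iterating backwards) to avoid that.

```go
package main

import "fmt"

// item is a hypothetical stand-in for a container-like entry keyed by name.
type item struct {
	Name  string
	Image string
}

// pruneUnsafe mimics the suspected pattern: it ranges over *existing while
// also shortening *existing.  The range loop keeps the original length, so
// i can exceed the shortened slice.  The panic is recovered and returned as
// an error so the sketch can be run safely.
func pruneUnsafe(existing *[]item, required []item) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	for i, e := range *existing {
		var curr *item
		for _, r := range required {
			if e.Name == r.Name {
				curr = &(*existing)[i] // may index past the shortened slice
				curr.Image = r.Image
				break
			}
		}
		if curr == nil {
			// Drop the entry without restarting the outer loop.
			*existing = append((*existing)[:i], (*existing)[i+1:]...)
		}
	}
	return nil
}

// pruneSafe walks the slice backwards by index, so removing entry i never
// affects the indices that are still to be visited.
func pruneSafe(existing *[]item, required []item) {
	for i := len(*existing) - 1; i >= 0; i-- {
		found := false
		for _, r := range required {
			if (*existing)[i].Name == r.Name {
				(*existing)[i].Image = r.Image
				found = true
				break
			}
		}
		if !found {
			*existing = append((*existing)[:i], (*existing)[i+1:]...)
		}
	}
}

func main() {
	required := []item{{Name: "b", Image: "new"}}

	unsafe := []item{{Name: "a"}, {Name: "b"}}
	fmt.Println("unsafe:", pruneUnsafe(&unsafe, required))

	safe := []item{{Name: "a"}, {Name: "b"}}
	pruneSafe(&safe, required)
	fmt.Println("safe:", safe)
}
```

A downgrade is a plausible trigger because the older release can require fewer entries than the running manifest has, which is when entries get dropped.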

Comment 7 Gaoyun Pei 2020-02-06 09:42:41 UTC
Verified this bug in 4.4.0-0.nightly-2020-02-05-220946

1. Install the latest 4.4 nightly cluster on AWS
# ./oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-02-05-220946   True        False         4h8m    Cluster version is 4.4.0-0.nightly-2020-02-05-220946

2. Downgrade the cluster to 4.3.0
# ./oc adm upgrade --to-image='quay.io/openshift-release-dev/ocp-release@sha256:3a516480dfd68e0f87f702b4d7bdd6f6a0acfdac5cd2e9767b838ceede34d70d' --allow-explicit-upgrade
Updating to release image quay.io/openshift-release-dev/ocp-release@sha256:3a516480dfd68e0f87f702b4d7bdd6f6a0acfdac5cd2e9767b838ceede34d70d

3. The downgrade finished, the CVO pod is running normally, and no panic occurred.
# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0     True        False         76m     Cluster version is 4.3.0

# oc get pod -n openshift-cluster-version
NAME                                        READY   STATUS    RESTARTS   AGE
cluster-version-operator-584fddff45-wgjps   1/1     Running   1          101m


4. Several operators remained at 4.4.0-0.nightly-2020-02-05-220946 even though the CVO reports the downgrade as finished.
# oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.3.0                               True        False         False      6h31m
cloud-credential                           4.3.0                               True        False         False      6h52m
cluster-autoscaler                         4.3.0                               True        False         False      6h40m
console                                    4.3.0                               True        False         False      112m
csi-snapshot-controller                    4.4.0-0.nightly-2020-02-05-220946   True        False         False      112m
dns                                        4.3.0                               True        False         False      6h44m
etcd                                       4.4.0-0.nightly-2020-02-05-220946   True        False         False      107m
image-registry                             4.3.0                               True        False         False      113m
ingress                                    4.3.0                               True        False         False      112m
insights                                   4.3.0                               True        False         False      6h46m
kube-apiserver                             4.3.0                               True        False         False      6h43m
kube-controller-manager                    4.3.0                               True        False         False      6h43m
kube-scheduler                             4.3.0                               True        False         False      6h43m
kube-storage-version-migrator              4.4.0-0.nightly-2020-02-05-220946   True        False         False      114m
machine-api                                4.3.0                               True        False         False      6h44m
machine-config                             4.3.0                               True        False         False      6h45m
marketplace                                4.3.0                               True        False         False      109m
monitoring                                 4.3.0                               True        False         False      106m
network                                    4.3.0                               True        False         False      6h46m
node-tuning                                4.3.0                               True        False         False      107m
openshift-apiserver                        4.3.0                               True        False         False      107m
openshift-controller-manager               4.3.0                               True        False         False      6h43m
openshift-samples                          4.3.0                               True        False         False      134m
operator-lifecycle-manager                 4.3.0                               True        False         False      6h45m
operator-lifecycle-manager-catalog         4.3.0                               True        False         False      6h44m
operator-lifecycle-manager-packageserver   4.3.0                               True        False         False      108m
service-ca                                 4.3.0                               True        False         False      6h46m
service-catalog-apiserver                  4.3.0                               True        False         False      6h46m
service-catalog-controller-manager         4.3.0                               True        False         False      6h46m
storage                                    4.3.0                               True        False         False      136m

# oc get clusterversion -o json|jq -r '.items[0].status.history[]|.startedTime + "|" + .completionTime + "|" + .state + "|" + .version'
2020-02-06T07:03:49Z|2020-02-06T07:47:17Z|Completed|4.3.0
2020-02-06T02:32:16Z|2020-02-06T02:53:34Z|Completed|4.4.0-0.nightly-2020-02-05-220946
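
The check in step 4 can also be scripted.  Below is a rough Go sketch that takes JSON in the shape of `oc get co -o json` and lists operators whose "operator" version differs from the target.  The laggingOperators function and the operatorList type are hypothetical names for this example, and the sample input is a hand-trimmed fragment based on the table above, not real command output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// operatorList models only the fields this sketch needs from a
// ClusterOperator list.
type operatorList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Status struct {
			Versions []struct {
				Name    string `json:"name"`
				Version string `json:"version"`
			} `json:"versions"`
		} `json:"status"`
	} `json:"items"`
}

// laggingOperators returns the names of operators whose "operator" version
// entry does not match target.
func laggingOperators(raw []byte, target string) ([]string, error) {
	var list operatorList
	if err := json.Unmarshal(raw, &list); err != nil {
		return nil, err
	}
	var lagging []string
	for _, co := range list.Items {
		for _, v := range co.Status.Versions {
			if v.Name == "operator" && v.Version != target {
				lagging = append(lagging, co.Metadata.Name)
			}
		}
	}
	return lagging, nil
}

func main() {
	// Hand-written sample, trimmed to two operators from the table above.
	sample := []byte(`{"items":[
	  {"metadata":{"name":"dns"},
	   "status":{"versions":[{"name":"operator","version":"4.3.0"}]}},
	  {"metadata":{"name":"etcd"},
	   "status":{"versions":[{"name":"operator","version":"4.4.0-0.nightly-2020-02-05-220946"}]}}
	]}`)

	lagging, err := laggingOperators(sample, "4.3.0")
	if err != nil {
		panic(err)
	}
	fmt.Println(lagging)
}
```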

Found an existing bug, https://bugzilla.redhat.com/show_bug.cgi?id=1794360, about the downgrade issue from 4.4 to 4.3. The issue in step 4 will be tracked in BZ#1794360; moving this bug to VERIFIED.

Comment 9 errata-xmlrpc 2020-05-13 21:55:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581