Bug 1928157 - 4.7 CNO claims to be done upgrading before it even starts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Dan Winship
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks: 1929371
Reported: 2021-02-12 14:04 UTC by Dan Winship
Modified: 2021-07-27 22:44 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: On upgrade from 4.6 to 4.7, the network operator immediately marked itself as fully upgraded, before it began rolling out the 4.7 versions of the openshift-sdn/ovn-kubernetes pods.
Consequence: If the network plugin upgrade failed, the cluster might mistakenly report that it had fully upgraded to 4.7 even though some nodes were still running a 4.6 network plugin. (The network operator would be Degraded in this case, but it would be reporting 4.7 and Degraded rather than 4.6 and Degraded.) Alternatively, even if the network upgrade succeeded, the upgrade as a whole might fail because a later step was disrupted by the network upgrade happening at the same time.
Fix: The network operator now waits for the network plugin to be upgraded to the 4.7 images before declaring itself upgraded.
Result: Version reporting should be correct. Upgrades should proceed in the proper sequence.
Clone Of:
Environment:
Last Closed: 2021-07-27 22:44:18 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 979 | 0 | None | closed | Bug 1928157: Don't set ClusterOperator Version until rollout is complete | 2021-02-17 16:53:58 UTC
Red Hat Product Errata | RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:44:35 UTC

Description Dan Winship 2021-02-12 14:04:14 UTC
https://github.com/openshift/cluster-network-operator/pull/863 (specifically https://github.com/openshift/cluster-network-operator/pull/863/commits/fc4e745f) broke CNO: as soon as it comes up, it now updates the Versions field in its ClusterOperator status, effectively claiming that it is done upgrading to the new version before it has even started upgrading its operands. I guess this may mean that CVO will start updating other things in parallel with the network that were supposed to have been updated *after* the network?
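
For illustration only, the intended behaviour (report the new version only once the operands have finished rolling out) might be sketched as below. This is not the CNO code, which is written in Go; `DaemonSetStatus`, its fields, and `version_to_report` are hypothetical names invented for this sketch, and real rollout checks may consider more conditions than these.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DaemonSetStatus:
    """Hypothetical, simplified view of one operand DaemonSet."""
    name: str
    image_version: str  # release version of the image in the pod template
    desired: int        # pods that should be running
    updated: int        # pods already running the current template
    available: int      # pods that are ready and available


def rollout_complete(ds: DaemonSetStatus, target_version: str) -> bool:
    """True only if every desired pod runs the target version and is available."""
    return (
        ds.image_version == target_version
        and ds.updated == ds.desired
        and ds.available == ds.desired
    )


def version_to_report(current_reported: str, target_version: str,
                      operands: List[DaemonSetStatus]) -> str:
    """Keep reporting the old version until all operands have rolled out."""
    if operands and all(rollout_complete(ds, target_version) for ds in operands):
        return target_version
    return current_reported
```

With this gating, an operator that has just restarted on the new release but whose DaemonSets are still mid-rollout keeps reporting the previous version, so anything ordered after the network waits for it.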

Comment 3 zhaozhanqi 2021-02-22 11:28:13 UTC
Verified this bug on 4.8.0-0.nightly-2021-02-21-102854

The CNO (network) version is still the old one (4.7.0-rc.3) while the upgrade is in progress:

NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-0.nightly-2021-02-21-102854   True        False         False      38m
baremetal                                  4.8.0-0.nightly-2021-02-21-102854   True        False         False      73m
cloud-credential                           4.8.0-0.nightly-2021-02-21-102854   True        False         False      75m
cluster-autoscaler                         4.8.0-0.nightly-2021-02-21-102854   True        False         False      72m
config-operator                            4.8.0-0.nightly-2021-02-21-102854   True        False         False      73m
console                                    4.8.0-0.nightly-2021-02-21-102854   True        False         False      8m43s
csi-snapshot-controller                    4.8.0-0.nightly-2021-02-21-102854   True        False         False      47m
dns                                        4.7.0-rc.3                          True        False         False      71m
etcd                                       4.8.0-0.nightly-2021-02-21-102854   True        False         False      71m
image-registry                             4.8.0-0.nightly-2021-02-21-102854   True        False         False      47m
ingress                                    4.8.0-0.nightly-2021-02-21-102854   True        False         False      62m
insights                                   4.8.0-0.nightly-2021-02-21-102854   True        False         False      66m
kube-apiserver                             4.8.0-0.nightly-2021-02-21-102854   True        False         False      70m
kube-controller-manager                    4.8.0-0.nightly-2021-02-21-102854   True        False         False      71m
kube-scheduler                             4.8.0-0.nightly-2021-02-21-102854   True        False         False      70m
kube-storage-version-migrator              4.8.0-0.nightly-2021-02-21-102854   True        False         False      47m
machine-api                                4.8.0-0.nightly-2021-02-21-102854   True        False         False      63m
machine-approver                           4.8.0-0.nightly-2021-02-21-102854   True        False         False      72m
machine-config                             4.7.0-rc.3                          True        False         False      71m
marketplace                                4.8.0-0.nightly-2021-02-21-102854   True        False         False      8m56s
monitoring                                 4.8.0-0.nightly-2021-02-21-102854   True        False         False      60m
network                                    4.7.0-rc.3                          True        True          False      73m
node-tuning                                4.8.0-0.nightly-2021-02-21-102854   True        False         False      9m13s
openshift-apiserver                        4.8.0-0.nightly-2021-02-21-102854   True        False         False      66m
openshift-controller-manager               4.8.0-0.nightly-2021-02-21-102854   True        False         False      65m
openshift-samples                          4.8.0-0.nightly-2021-02-21-102854   True        False         False      9m22s
operator-lifecycle-manager                 4.8.0-0.nightly-2021-02-21-102854   True        False         False      72m
operator-lifecycle-manager-catalog         4.8.0-0.nightly-2021-02-21-102854   True        False         False      72m
operator-lifecycle-manager-packageserver   4.8.0-0.nightly-2021-02-21-102854   True        False         False      9m9s
service-ca                                 4.8.0-0.nightly-2021-02-21-102854   True        False         False      73m
storage                                    4.8.0-0.nightly-2021-02-21-102854   True        False         False      46m
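
A quick way to pick out the operators that have not yet reported the new version from a listing like the one above is sketched here. It is a minimal helper that assumes whitespace-separated columns with NAME and VERSION first; `operators_not_at` and `SAMPLE` are names made up for this example, and `SAMPLE` is a shortened excerpt of the table.

```python
from typing import Dict

# Shortened excerpt of the listing above (header plus four rows).
SAMPLE = """\
NAME             VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.8.0-0.nightly-2021-02-21-102854   True        False         False      38m
dns              4.7.0-rc.3                          True        False         False      71m
network          4.7.0-rc.3                          True        True          False      73m
storage          4.8.0-0.nightly-2021-02-21-102854   True        False         False      46m
"""


def operators_not_at(version: str, table_text: str) -> Dict[str, str]:
    """Map operator name -> reported version for rows not yet at `version`."""
    lagging = {}
    for line in table_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        name, reported = fields[0], fields[1]
        if reported != version:
            lagging[name] = reported
    return lagging


if __name__ == "__main__":
    print(operators_not_at("4.8.0-0.nightly-2021-02-21-102854", SAMPLE))
```

Run against the full listing, this would also include machine-config, which is likewise still at 4.7.0-rc.3.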

Comment 6 errata-xmlrpc 2021-07-27 22:44:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

