Bug 1687973 - UPGRADE network operator reports unavailable during upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Winship
QA Contact: Meng Bo
URL:
Whiteboard: beta3blocker
Depends On:
Blocks:
Reported: 2019-03-12 19:31 UTC by Derek Carr
Modified: 2019-06-04 10:45 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:45:33 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 121 | 0 | None | None | None | 2019-03-13 20:19:34 UTC
Red Hat Product Errata | RHBA-2019:0758 | 0 | None | None | None | 2019-06-04 10:45:41 UTC

Description Derek Carr 2019-03-12 19:31:18 UTC
Description of problem:

Installed 4.0.0-0.alpha-2019-03-12-143341 
Upgraded to 4.0.0-0.alpha-2019-03-12-153711

During the cluster upgrade, the network operator reported unavailable. This appeared to happen when a master machine was rebooted. Opening this bug to determine whether available=false should occur during a machine reboot. The network did upgrade, but the available condition toggled.

Expected results:
Available should not go false during an upgrade.
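
The expectation above can be stated as a simple invariant over observations taken during an upgrade. The following is only a minimal sketch under assumptions: the `sample` type, the `firstUnavailable` helper, and the timeline entries are hypothetical and are not taken from the upgrade test suite or from any real job output.

```go
package main

import "fmt"

// sample is a hypothetical observation of an operator's Available
// condition at some point during an upgrade.
type sample struct {
	At        string // free-form label for when the sample was taken
	Available bool   // observed value of the Available condition
}

// firstUnavailable returns the first sample in which Available was false,
// and reports whether such a sample exists.
func firstUnavailable(timeline []sample) (sample, bool) {
	for _, s := range timeline {
		if !s.Available {
			return s, true
		}
	}
	return sample{}, false
}

func main() {
	// Illustrative data only; not real observations from this bug.
	timeline := []sample{
		{At: "upgrade started", Available: true},
		{At: "master rebooting", Available: false},
		{At: "upgrade finished", Available: true},
	}
	if s, bad := firstUnavailable(timeline); bad {
		fmt.Printf("violation: Available=false at %q\n", s.At)
	} else {
		fmt.Println("ok: Available stayed true")
	}
}
```

With the illustrative timeline, the check flags the sample taken while the master was rebooting, which is the behavior this report describes.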

Comment 1 Derek Carr 2019-03-13 03:07:24 UTC
This is the next reason we will fail upgrade tests, as we are trying to ensure that no operator goes unavailable during upgrades.

https://github.com/openshift/cluster-network-operator/blob/f4ef74c2d9179c7ccfecafb846f3fc800de01223/pkg/controller/statusmanager/status_manager.go#L239

This logic looks problematic. As I understand the flow, if the network operator is progressing to a new version it reports unavailable, even though the network is obviously available during the rollout across release versions.
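
To make the concern concrete, here is a minimal sketch contrasting the two ways of deriving conditions from a rollout. It is not the code in status_manager.go: the `rolloutStatus` and `conditions` types and both functions are hypothetical simplifications, and the rule "available if at least one pod is available" is an assumption chosen only for illustration.

```go
package main

import "fmt"

// rolloutStatus is a hypothetical, simplified view of a DaemonSet rollout.
type rolloutStatus struct {
	Desired   int // pods that should be running
	Available int // pods currently available, on any version
	Updated   int // pods already running the new version
}

// conditions is a hypothetical stand-in for an operator's reported state.
type conditions struct {
	Available   bool
	Progressing bool
}

// strictConditions ties availability to a finished rollout, so any
// in-progress update is reported as unavailable.
func strictConditions(s rolloutStatus) conditions {
	done := s.Updated == s.Desired && s.Available == s.Desired
	return conditions{Available: done, Progressing: !done}
}

// tolerantConditions reports the two conditions independently: the
// rollout may still be progressing while the network remains available.
func tolerantConditions(s rolloutStatus) conditions {
	progressing := s.Updated < s.Desired || s.Available < s.Desired
	return conditions{Available: s.Available > 0, Progressing: progressing}
}

func main() {
	// Mid-rollout: one node is rebooting and two pods are updated.
	mid := rolloutStatus{Desired: 6, Available: 5, Updated: 2}
	fmt.Printf("strict:   %+v\n", strictConditions(mid))
	fmt.Printf("tolerant: %+v\n", tolerantConditions(mid))
}
```

Under the strict rule the mid-rollout state reports Available=false, which is what trips the upgrade tests. Under the tolerant rule it reports Available=true together with Progressing=true.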

Comment 2 Derek Carr 2019-03-13 03:10:13 UTC
see sample upgrade job runs:
https://deck-ci.svc.ci.openshift.org/?job=release-openshift-origin-installer-e2e-aws-upgrade-4.0

https://deck-ci.svc.ci.openshift.org/log?job=release-openshift-origin-installer-e2e-aws-upgrade-4.0&id=97

"version changed Failing to True: ClusterOperatorNotAvailable: Cluster operator network is still updating"

Comment 3 Casey Callendrello 2019-03-13 12:13:03 UTC
To danw.

So, we should be setting "available=true progressing=true"? What is the expected state as the daemonset updates roll out?

Comment 4 zhaozhanqi 2019-03-19 10:40:21 UTC
Verified this bug when upgrading from 4.0.0-0.nightly-2019-03-15-063749 to 4.0.0-0.nightly-2019-03-18-200009.

AVAILABLE remained 'True' during the upgrade.

Comment 6 errata-xmlrpc 2019-06-04 10:45:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

