Bug 1926474 - Upgrade a cluster behind proxy from 4.6.15 to 4.7 fc5 fails
Summary: Upgrade a cluster behind proxy from 4.6.15 to 4.7 fc5 fails
Keywords:
Status: CLOSED DUPLICATE of bug 1920027
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Over the Air Updates
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-02-08 20:41 UTC by To Hung Sze
Modified: 2022-05-06 12:29 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-09 22:39:51 UTC
Target Upstream Version:
Embargoed:



Description To Hung Sze 2021-02-08 20:41:01 UTC
Description of problem:
Upgrading a cluster behind a proxy from 4.6.15 to 4.7 fc5 fails.
The upgrade status alternates between

$ ./oc adm upgrade
info: An upgrade is in progress. Working towards 4.7.0-fc.5: 70 of 668 done (10% complete)

and

$ ./oc adm upgrade
info: An upgrade is in progress. Unable to apply 4.7.0-fc.5: the cluster operator etcd is degraded


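with the following cluster operators reported as unhealthy (columns are NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE):
$ ./oc get co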
dns                                        4.7.0-fc.5   True        False         True       15h
etcd                                       4.7.0-fc.5   True        False         True       15h
machine-config                             4.6.15       False       True          True       78m
monitoring                                 4.7.0-fc.5   False       True          True       62m
storage                                    4.7.0-fc.5   True        True          False      71m

All other cluster operators are healthy at 4.7.0-fc.5.
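
For triage, a listing like the one above can be filtered mechanically. The following is a minimal, illustrative Python sketch. It assumes the input uses the column order of `oc get co --no-headers` (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE) and treats Available=True, Progressing=False, Degraded=False as the healthy combination. The function name `unhealthy_operators` and the `SAMPLE` constant are hypothetical; the sample rows are copied from the listing above.

```python
# Sketch: flag cluster operators whose conditions differ from the healthy
# combination Available=True, Progressing=False, Degraded=False.
# Assumption: rows follow the `oc get co --no-headers` column layout.

def unhealthy_operators(text):
    """Return (name, version, problems) tuples for operators that are not healthy."""
    healthy = {"AVAILABLE": "True", "PROGRESSING": "False", "DEGRADED": "False"}
    results = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip blank lines and an optional header row
        name, version = fields[0], fields[1]
        # Pair each condition name with the value reported in this row.
        conditions = dict(zip(healthy, fields[2:5]))
        problems = [f"{k}={v}" for k, v in conditions.items() if v != healthy[k]]
        if problems:
            results.append((name, version, problems))
    return results


# Rows taken from the report; in practice the text would come from the CLI.
SAMPLE = """\
dns                                        4.7.0-fc.5   True        False         True       15h
etcd                                       4.7.0-fc.5   True        False         True       15h
machine-config                             4.6.15       False       True          True       78m
monitoring                                 4.7.0-fc.5   False       True          True       62m
storage                                    4.7.0-fc.5   True        True          False      71m
"""

if __name__ == "__main__":
    for name, version, problems in unhealthy_operators(SAMPLE):
        print(f"{name} ({version}): {', '.join(problems)}")
```

The version column is kept in the result because it shows at a glance that machine-config is the only operator still at 4.6.15.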

Additional information:
$ ./oc describe co etcd
Status:
  Conditions:
    Last Transition Time:  2021-02-08T19:10:34Z
    Message:               NodeControllerDegraded: The master nodes not ready: node "tszegcp020721d-nlcqd-master-1.c.openshift-qe.internal" not ready since 2021-02-08 19:08:34 +0000 UTC because NodeStatusUnknown (Kubelet stopped posting node status.)
EtcdMembersDegraded: 2 of 3 members are available, tszegcp020721d-nlcqd-master-1.c.openshift-qe.internal is unhealthy
    Reason:                EtcdMembers_UnhealthyMembers::NodeController_MasterNodesReady
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-02-08T12:56:22Z
    Message:               NodeInstallerProgressing: 3 nodes are at revision 4
EtcdMembersProgressing: No unstarted etcd members found
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-02-08T05:08:08Z
    Message:               StaticPodsAvailable: 3 nodes are active; 3 nodes are at revision 4
EtcdMembersAvailable: 2 of 3 members are available, tszegcp020721d-nlcqd-master-1.c.openshift-qe.internal is unhealthy



$ ./oc describe co machine-config
Status:
  Conditions:
    Last Transition Time:  2021-02-08T19:02:30Z
    Message:               Working towards 4.7.0-fc.5
    Status:                True
    Type:                  Progressing
    Last Transition Time:  2021-02-08T19:16:22Z
    Message:               Unable to apply 4.7.0-fc.5: timed out waiting for the condition during waitForDaemonsetRollout: Daemonset machine-config-daemon is not ready. status: (desired: 6, updated: 6, ready: 4, unavailable: 2)
    Reason:                MachineConfigDaemonFailed
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-02-08T19:00:29Z
    Message:               Cluster not available for 4.7.0-fc.5
    Status:                False
    Type:                  Available
    Last Transition Time:  2021-02-08T05:06:00Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:
    Master:  0 (ready 0) out of 3 nodes are updating to latest configuration rendered-master-5a4e1a4f083144ded1b1298336732de0
    Worker:  0 (ready 0) out of 3 nodes are updating to latest configuration rendered-worker-71442d4811577e17f5fa285aedf0f20d


Always reproducible: reproduced twice on GCP.

Similar defects:
https://bugzilla.redhat.com/show_bug.cgi?id=1920027
https://bugzilla.redhat.com/show_bug.cgi?id=1924383
https://bugzilla.redhat.com/show_bug.cgi?id=1924947

Comment 1 To Hung Sze 2021-02-08 21:32:37 UTC
I have the must-gather. Please ping me if you need access.

Comment 2 Scott Dodson 2021-02-08 21:36:33 UTC
Yes, please make it accessible to Red Hatters via Google Drive.

Comment 4 To Hung Sze 2021-02-08 21:53:57 UTC
BTW, I also tried upgrading from 4.6.16 to 4.7 fc5, and it always fails too (both IPI and cluster behind proxy).

Comment 5 To Hung Sze 2021-02-09 14:35:44 UTC
This could be because fc5 isn't new enough to contain the fix from:
https://bugzilla.redhat.com/show_bug.cgi?id=1920027

Trying with a newer build. Will update here.

Comment 6 To Hung Sze 2021-02-09 22:39:51 UTC
I tried upgrading from 4.6.16 to 4.7.0-0.nightly-2021-02-09-024347 (with proxy) and it worked.

Closing as duplicate of 1920027.

*** This bug has been marked as a duplicate of bug 1920027 ***

