Bug 1733305 - Status for clusteroperator/kube-apiserver changed: Degraded during upgrade
Summary: Status for clusteroperator/kube-apiserver changed: Degraded during upgrade
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Sam Batschelet
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-25 16:18 UTC by Petr Muller
Modified: 2020-05-20 10:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-20 10:51:16 UTC
Target Upstream Version:
Embargoed:



Description Petr Muller 2019-07-25 16:18:06 UTC
Description of problem:

Seeing a lot of error messages in the release-openshift-origin-installer-e2e-aws-upgrade-4.1 job:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1/206

Jul 25 14:26:44.336 I ns/openshift-kube-apiserver-operator deployment/kube-apiserver-operator Status for clusteroperator/kube-apiserver changed: Degraded message changed from "" to "StaticPodsDegraded: nodes/ip-10-0-143-150.ec2.internal pods/kube-apiserver-ip-10-0-143-150.ec2.internal container=\"kube-apiserver-7\" is not ready\nStaticPodsDegraded: nodes/ip-10-0-143-150.ec2.internal pods/kube-apiserver-ip-10-0-143-150.ec2.internal container=\"kube-apiserver-7\" is terminated: \"Error\" - \"esetting endpoints for master service \\\"kubernetes\\\" to [10.0.158.95 10.0.163.120]

log.go:172] suppressing panic for copyResponse error in test; copy error: context canceled

Comment 1 Michal Fojtik 2019-07-26 09:01:11 UTC
That test failed because of:

Jul 25 14:27:12.603: INFO: cluster upgrade is failing: Cluster operator machine-config is still updating
Jul 25 14:34:02.602: INFO: cluster upgrade is failing: Could not update deployment "openshift-machine-config-operator/etcd-quorum-guard" (315 of 350)
Jul 25 14:34:12.605: INFO: cluster upgrade is failing: Could not update deployment "openshift-machine-config-operator/etcd-quorum-guard" (315 of 350)
Jul 25 14:34:22.601: INFO: cluster upgrade is failing: Could not update deployment "openshift-machine-config-operator/etcd-quorum-guard" (315 of 350)

The upgrade got stuck at this point and timed out. The Degraded transition is expected; it is not a bug. The fact that the upgrade got stuck updating etcd-quorum-guard is.

Sam, is there a known bug about this?
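
For triage, the repeated "cluster upgrade is failing" messages can be grouped by reason to see what the upgrade was blocked on when it timed out. The following is only a rough sketch: the regular expression and the function name `failure_reasons` are assumptions made for illustration, not part of any OpenShift tooling, and the sample text is the four log lines quoted in this comment.

```python
import re
from collections import Counter

# The four log lines quoted in this comment (sample input only).
LOG = """\
Jul 25 14:27:12.603: INFO: cluster upgrade is failing: Cluster operator machine-config is still updating
Jul 25 14:34:02.602: INFO: cluster upgrade is failing: Could not update deployment "openshift-machine-config-operator/etcd-quorum-guard" (315 of 350)
Jul 25 14:34:12.605: INFO: cluster upgrade is failing: Could not update deployment "openshift-machine-config-operator/etcd-quorum-guard" (315 of 350)
Jul 25 14:34:22.601: INFO: cluster upgrade is failing: Could not update deployment "openshift-machine-config-operator/etcd-quorum-guard" (315 of 350)
"""

# Assumed line shape: "<Mon DD HH:MM:SS.mmm>: INFO: cluster upgrade is failing: <reason>"
PATTERN = re.compile(
    r"^(?P<ts>\w{3} \d{1,2} [\d:.]+): INFO: cluster upgrade is failing: (?P<reason>.*)$"
)


def failure_reasons(log_text):
    """Count how often each 'cluster upgrade is failing' reason appears."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            counts[match.group("reason")] += 1
    return counts


if __name__ == "__main__":
    # Print reasons from most to least frequent.
    for reason, count in failure_reasons(LOG).most_common():
        print(f"{count}x {reason}")
```

On the lines quoted here, the etcd-quorum-guard reason dominates, which matches the reading that the upgrade stalled on that deployment rather than on the Degraded transition.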

Comment 4 Sam Batschelet 2019-08-21 02:14:41 UTC
This is actually not a duplicate of 1742744 [1]. I am going to reopen this for further review.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1742744#c4

Comment 8 Ben Parees 2020-04-29 20:58:54 UTC
If "resetting endpoints for master service" is the signal on this bug, it is showing up quite a bit in recent searches:

https://search-clayton-ci-search.apps.build01.ci.devcluster.openshift.com/?search=resetting+endpoints+for+master+service&maxAge=336h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job


2.4% of all recent job runs show it.

