Bug 1767156 - During upgrades openshift-apiserver reports degraded with MultipleAvailable as the reason
Summary: During upgrades openshift-apiserver reports degraded with MultipleAvailable a...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-apiserver
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.3.0
Assignee: Stefan Schimanski
QA Contact: Xingxing Xia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-10-30 19:42 UTC by Jessica Forrester
Modified: 2020-01-23 11:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: During upgrades, openshift-apiserver reported degraded with MultipleAvailable as the reason. Consequence: The reason for the degradation was not understandable to the user. Fix: A list of the underlying reasons is reported. Result: The reasons are understandable and no information is hidden.
Clone Of:
Environment:
Last Closed: 2020-01-23 11:10:15 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-openshift-apiserver-operator pull 248 | 0 | 'None' | closed | Bug 1767156: Fix "Multipleavailable" reason in the code | 2021-02-09 08:05:17 UTC
Red Hat Product Errata | RHBA-2020:0062 | 0 | None | None | None | 2020-01-23 11:10:30 UTC

Description Jessica Forrester 2019-10-30 19:42:13 UTC
We are seeing a number of upgrades where the openshift-apiserver goes degraded but reports MultipleAvailable as the reason. We suspect the underlying cause is the SDN, but this particular BZ report is to break out the MultipleAvailable reason, because it's actually making it harder to pinpoint the problem.

Opened at the request of David.

Comment 3 Xingxing Xia 2019-11-29 09:31:04 UTC
Sorry for the late verification of this bug; I was engaged in other ON_QA apiserver bugs and other testing.
I tried to verify it while the team was doing upgrade testing: I watched the operator status and didn't find a "reason" field with multiple reasons as listed in the PR.
Then I tried a `while true` loop to delete the OAS ds, pods and OAS apiservices in parallel, and only found ONE reason, "NoAPIServerPod", still not multiple ones:
"reason": "NoAPIServerPod"

Then I tried deleting the svc:
while true; do oc delete svc api -n openshift-apiserver; done
Now multiple reasons are reported:
oc get openshiftapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "Available")'
{
  "lastTransitionTime": "2019-11-29T08:51:31Z",
  "message": "apiservice/v1.apps.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.authorization.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.build.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.image.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.oauth.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.project.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.quota.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.route.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.security.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.template.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.user.openshift.io: not available: service/api in \"openshift-apiserver\" is not present",
  "reason": "APIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable",
  "status": "False",
  "type": "Available"
}
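
In the output above, the "reason" field appears to carry one newline-separated entry per unavailable apiservice, so the same value repeats many times. A minimal shell sketch for collapsing such a field into distinct reasons with counts follows. The function name summarize_reasons is hypothetical (it is not part of oc, jq, or the operator), and the sketch assumes the reasons arrive one per line on stdin, which is how `jq -r` prints a string containing "\n".

```shell
# summarize_reasons: hypothetical helper, not part of oc or the operator.
# Reads reasons from stdin, one per line, and prints each distinct reason
# once, in first-seen order, followed by how many times it occurred.
summarize_reasons() {
  awk 'NF {
         if (!($0 in count)) order[++n] = $0   # remember first-seen order
         count[$0]++
       }
       END {
         for (i = 1; i <= n; i++) printf "%s x%d\n", order[i], count[order[i]]
       }'
}

# Possible usage against a live cluster (requires oc and jq):
#   oc get openshiftapiserver cluster -o json \
#     | jq -r '.status.conditions[] | select(.type == "Available") | .reason' \
#     | summarize_reasons
```

Blank lines are skipped, so an empty reason produces no output rather than a spurious entry.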

On this basis, the issue can be verified. But when deleting the svc and pods in parallel:
while true; do oc delete pod -l apiserver -n openshift-apiserver; done # in terminal A
while true; do oc delete svc api -n openshift-apiserver; done # in terminal B
Only one reason, NoAPIServerPod, is reported, without the multiple APIServiceNotAvailable reasons shown above:
oc get openshiftapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "Available")'
{
  "lastTransitionTime": "2019-11-29T08:51:31Z",
  "message": "no openshift-apiserver daemon pods available on any node.",
  "reason": "NoAPIServerPod",
  "status": "False",
  "type": "Available"
}
If this is expected, please move the bug back, and I'll then move it to VERIFIED. Tested version: 4.3.0-0.nightly-2019-11-28-233859. Thanks
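
Since the two delete loops race with each other, a single `oc get` sample may not show every reason the operator reports during the test. A rough sketch of a sampler that records each distinct reason seen across several polls is given below. The names collect_reasons and get_available_reason are hypothetical, and the sample count and interval are arbitrary choices, not values from this bug.

```shell
# collect_reasons SAMPLES INTERVAL CMD...: hypothetical helper.
# Runs CMD SAMPLES times, sleeping INTERVAL seconds between runs, and
# prints every distinct non-empty output line the first time it is seen.
collect_reasons() {
  samples=$1
  interval=$2
  shift 2
  i=0
  while [ "$i" -lt "$samples" ]; do
    "$@"
    i=$((i + 1))
    if [ "$i" -lt "$samples" ]; then
      sleep "$interval"
    fi
  done | awk 'NF && !seen[$0]++'
}

# Possible usage against a live cluster (requires oc and jq):
#   get_available_reason() {
#     oc get openshiftapiserver cluster -o json \
#       | jq -r '.status.conditions[] | select(.type == "Available") | .reason'
#   }
#   collect_reasons 30 2 get_available_reason
```

If only NoAPIServerPod ever shows up over many samples, that would suggest the operator reports the missing pods in place of the per-apiservice reasons, rather than the sample simply having been taken at an unlucky moment.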

Comment 5 errata-xmlrpc 2020-01-23 11:10:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062

