Bug 1767156

Summary: During upgrades openshift-apiserver reports degraded with MultipleAvailable as the reason
Product: OpenShift Container Platform
Reporter: Jessica Forrester <jforrest>
Component: openshift-apiserver
Assignee: Stefan Schimanski <sttts>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 4.2.z
CC: aos-bugs, deads, mfojtik
Target Milestone: ---
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: During upgrades, openshift-apiserver reported degraded with MultipleAvailable as the reason. Consequence: The reason for the degradation was not understandable to the user. Fix: A list of the underlying reasons is reported. Result: The reasons are understandable and no information is hidden.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-01-23 11:10:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Jessica Forrester 2019-10-30 19:42:13 UTC
We are seeing a number of upgrades where the openshift-apiserver goes degraded but reports MultipleAvailable as the reason. We suspect the underlying cause is the SDN, but this particular BZ report is to break out the MultipleAvailable reason because it's actually making it harder to pinpoint the problem.

Opened at the request of David.

Comment 3 Xingxing Xia 2019-11-29 09:31:04 UTC
Sorry for the late verification of this bug; I was engaged with other ON_QA apiserver bugs and other testing.
Tried to verify it while the team did upgrade testing: watched the operator status, but didn't find a "reason" field with multiple reasons as listed in the PR.
Then tried a `while true` loop to delete the OAS ds, pods and OAS apiservices in parallel; found only ONE reason, "NoAPIServerPod", still not multiple ones:
"reason": "NoAPIServerPod"

Then tried deleting svc:
while true; do oc delete svc api -n openshift-apiserver; done
Now multiple reasons are reported:
oc get openshiftapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "Available")'
{
  "lastTransitionTime": "2019-11-29T08:51:31Z",
  "message": "apiservice/v1.apps.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.authorization.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.build.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.image.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.oauth.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.project.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.quota.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.route.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.security.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.template.openshift.io: not available: service/api in \"openshift-apiserver\" is not present\napiservice/v1.user.openshift.io: not available: service/api in \"openshift-apiserver\" is not present",
  "reason": "APIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable",
  "status": "False",
  "type": "Available"
}
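
The "reason" value above repeats the same entry eleven times, which is hard to read at a glance. As a minimal sketch (not part of the fix itself), the repeated entries could be collapsed into counts with standard tools. The helper name summarize_reasons is hypothetical, and the sample input is a shortened illustration, not real cluster output:

```shell
# Hypothetical helper: read newline-separated reasons on stdin and
# print each distinct reason with its count, most frequent first.
summarize_reasons() {
  sort | uniq -c | sort -rn
}

# Illustrative input only: three copies of the reason seen above.
printf 'APIServiceNotAvailable\nAPIServiceNotAvailable\nAPIServiceNotAvailable\n' | summarize_reasons
```

Against a live cluster, the same helper could presumably be fed from the query used above by appending `| .reason` to the jq filter, since `jq -r` prints the embedded newlines as real line breaks:
oc get openshiftapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "Available") | .reason' | summarize_reasons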

On that basis, the issue can be verified. But when deleting the svc and pods in parallel:
while true; do oc delete pod -l apiserver -n openshift-apiserver; done # in terminal A
while true; do oc delete svc api -n openshift-apiserver; done # in terminal B
Only one reason, NoAPIServerPod, is reported, without the multiple APIServiceNotAvailable reasons shown above:
oc get openshiftapiserver cluster -o json | jq -r '.status.conditions[] | select(.type == "Available")'
{
  "lastTransitionTime": "2019-11-29T08:51:31Z",
  "message": "no openshift-apiserver daemon pods available on any node.",
  "reason": "NoAPIServerPod",
  "status": "False",
  "type": "Available"
}
If this is expected, please move the bug back and I'll then move it to VERIFIED. Tested version: 4.3.0-0.nightly-2019-11-28-233859. Thanks
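
A single `oc get` is only a snapshot, so while the delete loops are running it may miss reasons that appear briefly. One way to check this, sketched here under that assumption, is to sample the condition repeatedly and keep every distinct reason seen. The helper name collect_reasons is hypothetical, and the sample call uses echo as a stand-in for the real query:

```shell
# Hypothetical helper: collect_reasons N CMD [ARGS...]
# Runs CMD N times and prints each distinct output line once,
# in the order it was first seen.
collect_reasons() {
  n=$1
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@"
    i=$((i + 1))
  done | awk '!seen[$0]++'
}

# Illustrative call only: echo stands in for the real cluster query.
collect_reasons 2 echo NoAPIServerPod
```

With a cluster available, the sampled command could be the query from this comment plus a pause between samples, for example:
collect_reasons 30 sh -c 'oc get openshiftapiserver cluster -o json | jq -r ".status.conditions[] | select(.type == \"Available\") | .reason"; sleep 1'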

Comment 5 errata-xmlrpc 2020-01-23 11:10:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062