Bug 1927321

Summary: openshift-apiserver Available is False with 3 pods not ready for a while during upgrade
Product: OpenShift Container Platform
Component: kube-apiserver
Version: 4.7
Target Release: 4.7.z
Status: CLOSED ERRATA
Severity: medium
Priority: low
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Assignee: Luis Sanchez <sanchezl>
QA Contact: Ke Wang <kewang>
CC: aos-bugs, fabian, mf.flip, mfojtik, sanchezl, wking, xxia
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-04-26 16:08:25 UTC
Bug Depends On: 1926867

Comment 1 Xingxing Xia 2021-03-05 02:51:45 UTC
Hi Luis Sanchez, 4.7 already has an earlier tracker: bug 1912820. How about closing this bug 1927321 as a DUP of that one and updating the bug ID in the PR title? Thanks!

Comment 2 Luis Sanchez 2021-04-07 04:19:02 UTC
*** Bug 1946856 has been marked as a duplicate of this bug. ***

Comment 5 Ke Wang 2021-04-19 11:03:33 UTC
Upgraded from OCP 4.6.25 GA to a 4.7 nightly:

$ oc get clusterversion -o json|jq ".items[0].status.history"
[
  {
    "completionTime": "2021-04-19T10:03:46Z",
    "image": "registry.ci.openshift.org/ocp/release:4.7.0-0.nightly-2021-04-17-022838",
    "startedTime": "2021-04-19T08:36:47Z",
    "state": "Completed",
    "verified": false,
    "version": "4.7.0-0.nightly-2021-04-17-022838"
  },
  {
    "completionTime": "2021-04-19T07:52:32Z",
    "image": "registry.ci.openshift.org/ocp/release@sha256:7f26b56dc31547a26ce1f67eeb59ecee92dc07f3622e203c51e39fd6d7bcc930",
    "startedTime": "2021-04-19T07:21:28Z",
    "state": "Completed",
    "verified": false,
    "version": "4.6.25"
  }
]
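As a side check on the history above, the wall-clock duration of each update can be derived from startedTime and completionTime. The following is a minimal sketch, not part of the original verification; the helper names parse_utc and update_durations are made up for illustration, and the JSON is copied from the output above with unused fields dropped.

```python
import json
from datetime import datetime, timezone

# History entries copied from the `oc get clusterversion` output above
# (only the fields needed for this calculation are kept).
HISTORY_JSON = """
[
  {"version": "4.7.0-0.nightly-2021-04-17-022838",
   "startedTime": "2021-04-19T08:36:47Z",
   "completionTime": "2021-04-19T10:03:46Z",
   "state": "Completed"},
  {"version": "4.6.25",
   "startedTime": "2021-04-19T07:21:28Z",
   "completionTime": "2021-04-19T07:52:32Z",
   "state": "Completed"}
]
"""


def parse_utc(stamp):
    """Parse a timestamp such as 2021-04-19T08:36:47Z as an aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def update_durations(history):
    """Return {version: timedelta} for every history entry in state Completed."""
    result = {}
    for entry in history:
        if entry.get("state") != "Completed":
            continue  # partial updates have no meaningful completionTime
        started = parse_utc(entry["startedTime"])
        completed = parse_utc(entry["completionTime"])
        result[entry["version"]] = completed - started
    return result


durations = update_durations(json.loads(HISTORY_JSON))
for version, delta in durations.items():
    print(f"{version}: {delta}")
```

With the values above, the upgrade to the 4.7 nightly took 1:26:59 and the initial 4.6.25 install took 0:31:04.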

During the upgrade, the script watch-apiserver-in-upgrade.sh was run to watch the `oc get project.project` command: ./watch-apiserver-in-upgrade.sh | tee watch.log. After the upgrade succeeded, I checked watch.log for failures.
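The contents of watch-apiserver-in-upgrade.sh are not included in this bug. The following Python sketch only illustrates the general idea of such a watcher: poll a probe command and, when it fails, print a timestamped marker followed by diagnostic output. The function names, the polling interval, and the exact diagnostic commands are assumptions inferred from the shape of the log shown in this comment, not the real script.

```python
import subprocess
import time
from datetime import datetime


def run_cmd(argv):
    """Run a command and return (exit_code, combined stdout/stderr)."""
    try:
        proc = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True)
    except FileNotFoundError as exc:
        return 127, str(exc)  # binary not installed
    return proc.returncode, proc.stdout


# Assumed diagnostics, guessed from the output captured in watch.log:
# operator conditions, apiserver pods, operator summary, node list.
DIAGNOSTICS = [
    ["oc", "describe", "clusteroperator", "openshift-apiserver"],
    ["oc", "get", "pods", "-n", "openshift-apiserver",
     "-o", "wide", "--show-labels", "--no-headers"],
    ["oc", "get", "clusteroperator", "openshift-apiserver", "--no-headers"],
    ["oc", "get", "nodes", "--no-headers"],
]


def watch(probe, diagnostics, iterations, interval=1.0,
          run=run_cmd, out=print, sleep=time.sleep):
    """Run `probe` `iterations` times; on failure, dump diagnostics.

    Returns the number of failed probes. `run`, `out`, and `sleep` are
    injectable so the loop can be exercised without a cluster.
    """
    failures = 0
    for _ in range(iterations):
        code, _output = run(probe)
        if code != 0:
            failures += 1
            # Local time with UTC offset, e.g. 2021-04-19T05:45:21-04:00
            stamp = datetime.now().astimezone().isoformat(timespec="seconds")
            out(f"{stamp} {' '.join(probe)} failed")
            for argv in diagnostics:
                out(run(argv)[1].rstrip())
        sleep(interval)
    return failures


# Example call against a live cluster (not executed here):
# watch(["oc", "get", "project.project"], DIAGNOSTICS, iterations=3600)
```

Piping the output through tee, as done above, keeps a copy in watch.log that can be searched for "failed" afterwards.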

$ grep -A35 "failed" watch.log # 4 matches in total
2021-04-19T05:45:21-04:00 oc get project.project failed
Status:
  Conditions:
    Last Transition Time:  2021-04-19T09:44:08Z
    Message:               APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
    Reason:                APIServerDeployment_UnavailablePod
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-04-19T09:07:27Z
    Message:               All is well
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-04-19T09:45:54Z
    Message:               All is well
    Reason:                AsExpected
    Status:                True
    Type:                  Available
apiserver-746dc6855-bhcpg   0/2   Pending   0     4m41s   <none>        <none>                                        <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-flt9n   2/2   Running   0     38m     10.130.0.51   ip-10-0-153-121.ap-south-1.compute.internal   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-jkkv8   2/2   Running   0     37m     10.128.0.58   ip-10-0-218-88.ap-south-1.compute.internal    <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
openshift-apiserver   4.7.0-0.nightly-2021-04-17-022838   True   False   True   13s
ip-10-0-153-121.ap-south-1.compute.internal   Ready                         master   135m   v1.19.0+a5a0987
ip-10-0-156-160.ap-south-1.compute.internal   Ready,SchedulingDisabled      worker   127m   v1.20.0+7d0a2b2
ip-10-0-168-29.ap-south-1.compute.internal    NotReady,SchedulingDisabled   master   135m   v1.19.0+a5a0987
ip-10-0-169-12.ap-south-1.compute.internal    Ready                         worker   127m   v1.19.0+a5a0987
ip-10-0-203-107.ap-south-1.compute.internal   Ready                         worker   127m   v1.19.0+a5a0987
ip-10-0-218-88.ap-south-1.compute.internal    Ready                         master   135m   v1.19.0+a5a0987

--
2021-04-19T05:52:25-04:00 oc get project.project failed
Status:
  Conditions:
    Last Transition Time:  2021-04-19T09:44:08Z
    Message:               APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
    Reason:                APIServerDeployment_UnavailablePod
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-04-19T09:07:27Z
    Message:               All is well
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-04-19T09:46:20Z
    Message:               All is well
    Reason:                AsExpected
    Status:                True
    Type:                  Available
apiserver-746dc6855-bhcpg   0/2   Pending   0     11m     <none>        <none>                                        <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-flt9n   2/2   Running   0     45m     10.130.0.51   ip-10-0-153-121.ap-south-1.compute.internal   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-vpdtq   2/2   Running   0     4m31s   10.129.0.16   ip-10-0-168-29.ap-south-1.compute.internal    <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
openshift-apiserver   4.7.0-0.nightly-2021-04-17-022838   True   False   True   6m44s
ip-10-0-153-121.ap-south-1.compute.internal   Ready                      master   142m   v1.19.0+a5a0987
ip-10-0-156-160.ap-south-1.compute.internal   Ready                      worker   134m   v1.20.0+7d0a2b2
ip-10-0-168-29.ap-south-1.compute.internal    Ready                      master   142m   v1.20.0+7d0a2b2
ip-10-0-169-12.ap-south-1.compute.internal    Ready                      worker   134m   v1.20.0+7d0a2b2
ip-10-0-203-107.ap-south-1.compute.internal   Ready,SchedulingDisabled   worker   134m   v1.19.0+a5a0987
ip-10-0-218-88.ap-south-1.compute.internal    Ready,SchedulingDisabled   master   142m   v1.19.0+a5a0987
...

2021-04-19T05:53:27-04:00 oc get project.project failed
Status:
  Conditions:
    Last Transition Time:  2021-04-19T09:44:08Z
    Message:               APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
    Reason:                APIServerDeployment_UnavailablePod
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-04-19T09:07:27Z
    Message:               All is well
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-04-19T09:53:09Z
    Message:               APIServicesAvailable: "apps.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "authorization.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "build.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "image.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "project.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "quota.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "route.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "security.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
APIServicesAvailable: "template.openshift.io.v1" is not ready: 503 (the server is currently unable to handle the request)
    Reason:                APIServices_Error
    Status:                False
    Type:                  Available
apiserver-746dc6855-bhcpg   0/2   Pending   0     12m     <none>        <none>                                        <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-flt9n   2/2   Running   0     46m     10.130.0.51   ip-10-0-153-121.ap-south-1.compute.internal   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-vpdtq   2/2   Running   0     4m59s   10.129.0.16   ip-10-0-168-29.ap-south-1.compute.internal    <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
openshift-apiserver   4.7.0-0.nightly-2021-04-17-022838   False   False   True   23s
ip-10-0-153-121.ap-south-1.compute.internal   Ready                      master   143m   v1.19.0+a5a0987
ip-10-0-156-160.ap-south-1.compute.internal   Ready                      worker   134m   v1.20.0+7d0a2b2
ip-10-0-168-29.ap-south-1.compute.internal    Ready                      master   143m   v1.20.0+7d0a2b2
ip-10-0-169-12.ap-south-1.compute.internal    Ready                      worker   134m   v1.20.0+7d0a2b2
ip-10-0-203-107.ap-south-1.compute.internal   Ready,SchedulingDisabled   worker   134m   v1.19.0+a5a0987
ip-10-0-218-88.ap-south-1.compute.internal    Ready,SchedulingDisabled   master   143m   v1.19.0+a5a0987
--
2021-04-19T06:00:08-04:00 oc get project.project failed
Status:
  Conditions:
    Last Transition Time:  2021-04-19T09:44:08Z
    Message:               APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
    Reason:                APIServerDeployment_UnavailablePod
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-04-19T09:07:27Z
    Message:               All is well
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-04-19T10:00:23Z
    Message:               All is well
    Reason:                AsExpected
    Status:                True
    Type:                  Available
apiserver-746dc6855-2fxrx   0/2   Pending   0     5m4s   <none>        <none>                                       <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-bhcpg   2/2   Running   0     19m    10.128.0.8    ip-10-0-218-88.ap-south-1.compute.internal   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
apiserver-746dc6855-vpdtq   2/2   Running   0     12m    10.129.0.16   ip-10-0-168-29.ap-south-1.compute.internal   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=746dc6855,revision=2
openshift-apiserver   4.7.0-0.nightly-2021-04-17-022838   True   False   True   18s
ip-10-0-153-121.ap-south-1.compute.internal   NotReady,SchedulingDisabled   master   150m   v1.19.0+a5a0987
ip-10-0-156-160.ap-south-1.compute.internal   Ready                         worker   141m   v1.20.0+7d0a2b2
ip-10-0-168-29.ap-south-1.compute.internal    Ready                         master   150m   v1.20.0+7d0a2b2
ip-10-0-169-12.ap-south-1.compute.internal    Ready                         worker   141m   v1.20.0+7d0a2b2
ip-10-0-203-107.ap-south-1.compute.internal   Ready                         worker   141m   v1.20.0+7d0a2b2
ip-10-0-218-88.ap-south-1.compute.internal    Ready                         master   150m   v1.20.0+7d0a2b2


I checked the details of the errors above in watch.log. They are unrelated to this bug: each one occurred while a master node hosting an apiserver pod was in SchedulingDisabled state during the upgrade. Once that node was ready again, the errors stopped.
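The correlation described above can be reproduced mechanically. The sketch below parses a trimmed excerpt of watch.log (the first and last of the four failures, master nodes only, labels column dropped) and reports, for each failure, the UTC time, the master nodes that were not plainly Ready, and the Pending apiserver pods. The regular expressions and the function name summarize_failures are assumptions made for this illustration.

```python
import re
from datetime import datetime, timezone

# Trimmed excerpt of the watch.log output shown in this comment.
WATCH_LOG = """\
2021-04-19T05:45:21-04:00 oc get project.project failed
apiserver-746dc6855-bhcpg   0/2   Pending   0     4m41s   <none>        <none>
apiserver-746dc6855-flt9n   2/2   Running   0     38m     10.130.0.51   ip-10-0-153-121.ap-south-1.compute.internal
apiserver-746dc6855-jkkv8   2/2   Running   0     37m     10.128.0.58   ip-10-0-218-88.ap-south-1.compute.internal
ip-10-0-153-121.ap-south-1.compute.internal   Ready                         master   135m   v1.19.0+a5a0987
ip-10-0-168-29.ap-south-1.compute.internal    NotReady,SchedulingDisabled   master   135m   v1.19.0+a5a0987
ip-10-0-218-88.ap-south-1.compute.internal    Ready                         master   135m   v1.19.0+a5a0987
2021-04-19T06:00:08-04:00 oc get project.project failed
apiserver-746dc6855-2fxrx   0/2   Pending   0     5m4s   <none>        <none>
apiserver-746dc6855-bhcpg   2/2   Running   0     19m    10.128.0.8    ip-10-0-218-88.ap-south-1.compute.internal
apiserver-746dc6855-vpdtq   2/2   Running   0     12m    10.129.0.16   ip-10-0-168-29.ap-south-1.compute.internal
ip-10-0-153-121.ap-south-1.compute.internal   NotReady,SchedulingDisabled   master   150m   v1.19.0+a5a0987
ip-10-0-168-29.ap-south-1.compute.internal    Ready                         master   150m   v1.20.0+7d0a2b2
ip-10-0-218-88.ap-south-1.compute.internal    Ready                         master   150m   v1.20.0+7d0a2b2
"""

FAIL_RE = re.compile(r"^(\S+) oc get project\.project failed$")
NODE_RE = re.compile(r"^(ip-\S+)\s+(\S+)\s+(master|worker)\s+\S+\s+v\S+$")
POD_RE = re.compile(r"^(apiserver-\S+)\s+\d+/\d+\s+(\S+)\s")


def summarize_failures(log_text):
    """Return one record per failure marker found in the log text."""
    records = []
    current = None
    for line in log_text.splitlines():
        fail = FAIL_RE.match(line)
        if fail:
            # Convert the local timestamp to UTC so it lines up with the
            # "Last Transition Time" values in the operator conditions.
            when = datetime.fromisoformat(fail.group(1)).astimezone(timezone.utc)
            current = {"time_utc": when, "masters_not_ready": [], "pending_pods": []}
            records.append(current)
            continue
        if current is None:
            continue
        node = NODE_RE.match(line)
        if node and node.group(3) == "master" and node.group(2) != "Ready":
            current["masters_not_ready"].append(node.group(1))
            continue
        pod = POD_RE.match(line)
        if pod and pod.group(2) == "Pending":
            current["pending_pods"].append(pod.group(1))
    return records


records = summarize_failures(WATCH_LOG)
for rec in records:
    print(rec["time_utc"].strftime("%H:%M:%SZ"),
          rec["masters_not_ready"], rec["pending_pods"])
```

In both excerpted failures exactly one master is cordoned and one apiserver pod is Pending, which is consistent with a pod waiting for its drained master to come back rather than with the regression tracked in this bug.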

The bug is fixed as expected, so I am moving it to VERIFIED.

Comment 7 errata-xmlrpc 2021-04-26 16:08:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1225