Bug 1926867 - openshift-apiserver Available is False with 3 pods not ready for a while during upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Luis Sanchez
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On: 1912820 1946856
Blocks: 1927321
 
Reported: 2021-02-09 15:19 UTC by Luis Sanchez
Modified: 2021-11-15 09:29 UTC
CC: 7 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1912820
Environment:
Last Closed: 2021-07-27 22:42:47 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-kube-apiserver-operator pull 1036 0 None open Bug 1926867: competing connectivitycheckcontrollers cause pod restarts during upgrades 2021-02-09 15:22:51 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 22:44:35 UTC

Comment 2 Ke Wang 2021-02-24 09:39:53 UTC
To verify, performed an upgrade from OCP 4.7 GA to 4.8:

$ oc get clusterversion -o json|jq ".items[0].status.history"
[
  {
    "completionTime": "2021-02-23T21:18:35Z",
    "image": "registry.ci.openshift.org/ocp/release:4.8.0-0.nightly-2021-02-22-211839",
    "startedTime": "2021-02-23T19:18:05+08:00",
    "state": "Completed",
    "verified": false,
    "version": "4.8.0-0.nightly-2021-02-22-211839"
  },
  {
    "completionTime": "2021-02-23T18:57:36Z",
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:d74b1cfa81f8c9cc23336aee72d8ae9c9905e62c4874b071317a078c316f8a70",
    "startedTime": "2021-02-23T18:30:46Z",
    "state": "Completed",
    "verified": false,
    "version": "4.7.0"
  }
]

During the upgrade, a script, watch-apiserver-in-upgrade.sh, was run to watch the `oc get project.project` command: `./watch-apiserver-in-upgrade.sh | tee watch.log`. After the upgrade succeeded, checked watch.log:
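The watch script itself is not attached to this bug, but a loop like it might be sketched as follows (a minimal sketch; the probe interval, timestamp format, and the extra context commands on failure are assumptions):

```shell
#!/usr/bin/env bash
# Sketch of a watch loop in the spirit of watch-apiserver-in-upgrade.sh
# (the real script is not attached to this bug; details here are assumptions).
# It probes the openshift-apiserver-backed project API every few seconds
# and logs a timestamped succeeded/failed line, matching the watch.log format.

probe() {
  # Returns 0 when the aggregated project API answers, non-zero otherwise.
  oc get project.project >/dev/null 2>&1
}

watch_apiserver() {
  while true; do
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    if probe; then
      echo "$ts oc get project.project succeeded"
    else
      echo "$ts oc get project.project failed"
      # On failure, capture cluster context for later triage,
      # as seen in the log excerpt in this comment.
      oc get clusteroperator openshift-apiserver
      oc get nodes
    fi
    sleep 5
  done
}
```

Run it as `watch_apiserver | tee watch.log` so failures can be grepped out afterwards, as done above.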

$ grep "failed" watch.log # 1 occurrence in total
2021-02-23T21:10:24+08:00 oc get project.project failed

Checked the details of the error above in watch.log; it has nothing to do with this bug. The cause is that the master node hosting the apiserver was in SchedulingDisabled; after that node became ready again, there were no further errors.
...
2021-02-23T21:10:15+08:00 oc get cm succeeded
version   4.7.0   True   True   112m   Working towards 4.8.0-0.nightly-2021-02-22-211839: 561 of 669 done (83% complete), waiting on machine-config
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get projects.project.openshift.io)
2021-02-23T21:10:24+08:00 oc get project.project failed
Status:
  Conditions:
    Last Transition Time:  2021-02-23T13:06:11Z
    Message:               APIServerDeploymentDegraded: 1 of 3 requested instances are unavailable for apiserver.openshift-apiserver ()
    Reason:                APIServerDeployment_UnavailablePod
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2021-02-23T13:10:08Z
    Message:               All is well
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2021-02-23T13:10:08Z
    Message:               All is well
    Reason:                AsExpected
    Status:                True
    Type:                  Available
apiserver-7dd9b5b8cc-87h8q   0/2   Pending   0     2m25s   <none>        <none>                      <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=7dd9b5b8cc,revision=1
apiserver-7dd9b5b8cc-nv8hj   2/2   Running   0     91m     10.130.0.72   kewang2373-kjmhk-master-2   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=7dd9b5b8cc,revision=1
apiserver-7dd9b5b8cc-z9xl4   2/2   Running   0     6m43s   10.129.0.11   kewang2373-kjmhk-master-0   <none>   <none>   apiserver=true,app=openshift-apiserver-a,openshift-apiserver-anti-affinity=true,pod-template-hash=7dd9b5b8cc,revision=1
openshift-apiserver   4.8.0-0.nightly-2021-02-22-211839   True   False   True   39s
kewang2373-kjmhk-master-0         Ready                      master   154m   v1.20.0+01ab7fd
kewang2373-kjmhk-master-1         Ready,SchedulingDisabled   master   155m   v1.20.0+ba45583
kewang2373-kjmhk-master-2         Ready                      master   154m   v1.20.0+ba45583
kewang2373-kjmhk-worker-0-67k77   Ready                      worker   146m   v1.20.0+01ab7fd
kewang2373-kjmhk-worker-0-d5tp5   Ready,SchedulingDisabled   worker   146m   v1.20.0+ba45583
kewang2373-kjmhk-worker-0-tzfsc   Ready                      worker   141m   v1.20.0+ba45583
2021-02-23T21:10:49+08:00 oc get cm succeeded
...
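When diagnosing a failure like the one above, the relevant operator conditions can be pulled out directly rather than read from the full status dump. A sketch (assumes `jq` is installed; the filter is an illustration, not part of the original verification steps):

```shell
# Sketch: extract only the Available and Degraded conditions from the
# openshift-apiserver clusteroperator, the two conditions examined above.
oc get clusteroperator openshift-apiserver -o json \
  | jq '.status.conditions[] | select(.type == "Available" or .type == "Degraded")'
```

This prints just the two condition objects (type, status, reason, message, lastTransitionTime), which is enough to tell an expected upgrade-time blip from a real availability problem.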

From the above results, the bug is fixed as expected, so moving the bug to VERIFIED.

Comment 5 errata-xmlrpc 2021-07-27 22:42:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

