Bug 1821493

Summary: /readyz should start reporting failure on shutdown initiation
Product: OpenShift Container Platform
Reporter: Abu Kashem <akashem>
Component: kube-apiserver
Assignee: Abu Kashem <akashem>
Status: CLOSED ERRATA
QA Contact: Ke Wang <kewang>
Severity: high
Priority: unspecified
Version: 4.4
CC: aos-bugs, kewang, mfojtik, sttts, vlaad, xxia
Target Release: 4.3.z
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Clone Of: 1811198
: 1821494
Last Closed: 2020-04-20 17:08:56 UTC
Bug Depends On: 1811169, 1811198
Bug Blocks: 1811200, 1821494, 1821495

Comment 5 Ke Wang 2020-04-10 09:50:03 UTC
Verified with OCP build 4.3.0-0.nightly-2020-04-10-021124. Details below.

- In one terminal, open a debug shell on a master node and poll /readyz in a loop:
$ master=$(oc get nodes | grep master | head -1 | cut -d " " -f1)
$ oc debug node/$master
sh-4.2# chroot /host
sh-4.4# while true; do curl -k --silent --show-error https://localhost:6443/readyz ; done

- In another terminal, enter the same debug pod and send SIGINT to the kube-apiserver process:
$ debug_pod=$(oc get pod -A | grep debug | awk '{print $2}')
$ oc rsh pod/$debug_pod
sh-4.2# chroot /host
sh-4.4# kill -INT `ps aux | grep " kube-apiserver " | grep -v grep | awk '{print $2}'`

- In the first terminal, check the output. Immediately after the kill above, it shows:
[+]ping ok
[+]log ok
[+]etcd ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-discovery-available ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/openshift.io-clientCA-reload ok
[+]poststarthook/openshift.io-requestheader-reload ok
[+]poststarthook/quota.openshift.io-clusterquotamapping ok
[+]poststarthook/openshift.io-kubernetes-informers-synched ok
[+]poststarthook/openshift.io-startkubeinformers ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/apiservice-wait-for-first-sync ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[-]shutdown failed: reason withheld
healthz check failed

The /readyz endpoint starts returning failure as soon as kube-apiserver shutdown is initiated.
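
To make the transition easier to spot than in a raw curl loop, the check could be wrapped in two small helpers. This is only a sketch: the function names failed_checks and poll_readyz are hypothetical and not part of OpenShift, and the default URL is simply the one used in the steps above. The first helper filters verbose /readyz output down to the names of the failing checks; the second prints a timestamp and the HTTP status code once per second.

```shell
#!/bin/sh
# Hypothetical helpers for watching /readyz during a shutdown test.
# The names are illustrative only; they do not ship with OpenShift.

# failed_checks: read verbose /readyz output on stdin and print the name of
# every check reported as failing (lines that start with "[-]").
failed_checks() {
    sed -n 's/^\[-\]\([^ ]*\).*/\1/p'
}

# poll_readyz: print a UTC timestamp and the HTTP status code of /readyz
# once per second until interrupted. The URL defaults to the one used in
# the verification steps (an assumption; pass another as the first argument).
poll_readyz() {
    url="${1:-https://localhost:6443/readyz}"
    while true; do
        code=$(curl -k --silent --output /dev/null --write-out '%{http_code}' "$url")
        echo "$(date -u +%H:%M:%S) $code"
        sleep 1
    done
}
```

For example, running `curl -k --silent https://localhost:6443/readyz | failed_checks` against output like that in comment 5 would print only the failing check name, and `poll_readyz` could replace the bare while loop in the first terminal.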

Comment 9 errata-xmlrpc 2020-04-20 17:08:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1482