Bug 1803750

Summary: Improve NodeControllerDegraded condition messages
Product: OpenShift Container Platform Reporter: Michal Fojtik <mfojtik>
Component: kube-apiserver Assignee: Michal Fojtik <mfojtik>
Status: CLOSED ERRATA QA Contact: Ke Wang <kewang>
Severity: high Docs Contact:
Priority: high    
Version: 4.2.0 CC: aos-bugs, mfojtik, xxia
Target Milestone: ---   
Target Release: 4.2.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1803748 Environment:
Last Closed: 2020-03-04 04:51:02 UTC Type: ---
Bug Depends On: 1803746, 1803748    
Bug Blocks:    

Comment 3 Xingxing Xia 2020-02-20 06:13:59 UTC
Ke Wang, the "Depends On" field lists the already verified bugs for quick reference. Thank you.

Comment 4 Ke Wang 2020-02-20 10:12:02 UTC
Verified with 4.2.0-0.nightly-2020-02-20-040308

Steps are as follows:
$ oc get no
...
ke-qb6r9-m-0.c.openshift-qe.internal         Ready      master   39m   v1.14.6+47933cbcc
...

# Shut down a master so that it shows NotReady
$ oc debug no/ke-qb6r9-m-0.c.openshift-qe.internal -- chroot /host shutdown -h now
Starting pod/ke-qb6r9-m-0copenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
^C

$ oc get no
...
ke-qb6r9-m-0.c.openshift-qe.internal         NotReady   master   39m   v1.14.6+47933cbcc
...

# Check that the NodeControllerDegraded condition carries the time, reason and message from the node YAML
$ oc get kubeapiserver cluster -o yaml
...
  conditions:
  - lastTransitionTime: "2020-02-20T09:57:21Z"
    message: 'The master nodes not ready: node "ke-qb6r9-m-0.c.openshift-qe.internal"
      not ready since 2020-02-20 09:53:58 +0000 UTC because NodeStatusUnknown (Kubelet
      stopped posting node status.)'
    reason: MasterNodesReady
    status: "True"
    type: NodeControllerDegraded
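
To read only this condition instead of scanning the full YAML, a jsonpath filter can be used. The helper below is a minimal sketch and was not part of the original verification: the name `condition_message` is hypothetical, and it assumes the conditions sit under `.status.conditions` of the resource.

```shell
# Sketch only: print the message of one status condition of a resource.
# condition_message is a hypothetical helper name, not an oc subcommand.
# Assumption: the conditions list is stored at .status.conditions.
#
# Usage: condition_message RESOURCE CONDITION_TYPE
condition_message() {
  # The jsonpath filter keeps only the list entry whose "type" equals $2.
  oc get "$1" -o jsonpath="{.status.conditions[?(@.type==\"$2\")].message}"
}

# Example calls (need a logged-in cluster, so they are left commented out):
#   condition_message kubeapiserver/cluster NodeControllerDegraded
#   condition_message co/kube-apiserver Degraded
```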

$ oc get co/kube-apiserver -o yaml
...
  conditions:
  - lastTransitionTime: "2020-02-20T09:25:43Z"
    message: 'NodeControllerDegraded: The master nodes not ready: node "ke-qb6r9-m-0.c.openshift-qe.internal"
      not ready since 2020-02-20 09:53:58 +0000 UTC because NodeStatusUnknown (Kubelet
      stopped posting node status.)'
    reason: AsExpected
    status: "False"
    type: Degraded


$ oc get no ip-10-0-143-9.ap-northeast-1.compute.internal -o yaml
...
conditions:
...
  - lastHeartbeatTime: "2020-02-20T09:52:37Z"
    lastTransitionTime: "2020-02-20T09:53:58Z"
    message: Kubelet stopped posting node status.
    reason: NodeStatusUnknown
    status: Unknown
    type: Ready
...
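
The check above amounts to confirming that the operator message is built from the node's Ready condition (transition time, reason, message). A small sketch of that comparison follows. The function name and argument order are hypothetical, and the message layout is taken only from the output pasted in this comment, not from the operator source.

```shell
# Sketch only: rebuild the message expected in the NodeControllerDegraded
# condition from the fields of a node's Ready condition.
# expected_degraded_message is a hypothetical helper name.
#
# Usage: expected_degraded_message NODE LAST_TRANSITION_TIME REASON MESSAGE
expected_degraded_message() {
  edm_node=$1
  edm_reason=$3
  edm_msg=$4
  # Assumption: the node timestamp is UTC and ends in "Z", as in the YAML above.
  # Rewrite 2020-02-20T09:53:58Z as 2020-02-20 09:53:58 +0000 UTC.
  edm_since=$(printf '%s' "$2" | sed -e 's/T/ /' -e 's/Z$/ +0000 UTC/')
  printf 'The master nodes not ready: node "%s" not ready since %s because %s (%s)\n' \
    "$edm_node" "$edm_since" "$edm_reason" "$edm_msg"
}

# Example with the values from this comment:
#   expected_degraded_message ke-qb6r9-m-0.c.openshift-qe.internal \
#     2020-02-20T09:53:58Z NodeStatusUnknown 'Kubelet stopped posting node status.'
```

The printed line can then be compared by eye, or with a string test, against the `message` field of the NodeControllerDegraded condition.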

Comment 6 errata-xmlrpc 2020-03-04 04:51:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0614