Bug 1541263 - NodeController should update OutOfDisk status to unknown when node service is stopped
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assigned To: Seth Jennings
QA Contact: DeShuai Ma
Depends On:
Blocks:
Reported: 2018-02-02 01:43 EST by weiwei jiang
Modified: 2018-03-28 10:26 EDT

See Also:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-28 10:25:43 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0489 None None None 2018-03-28 10:26 EDT

Description weiwei jiang 2018-02-02 01:43:21 EST
Description of problem:
After enabling TaintBasedEvictions and stopping the node service, the node's OutOfDisk condition stays False instead of changing to Unknown.

Version-Release number of selected component (if applicable):
# openshift version 
openshift v3.9.0-0.34.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8


How reproducible:
always

Steps to Reproduce:
    Given environment has at least 2 schedulable nodes
    Given master config is merged with the following hash:
    """
    kubernetesMasterConfig:
      controllerArguments:
        feature-gates:
        - AllAlpha=true,TaintBasedEvictions=true
    """
    Then the step should succeed
    And the master service is restarted on all master nodes
    Given I use the "<%= cb.nodes[0].name %>" node
    Given I run commands on the host:
      | systemctl stop atomic-openshift-node |
    Then the step should succeed
    Given I wait for the steps to pass:
    """
    When I run the :describe admin command with:
      | resource | no                      |
      | name     | <%= cb.nodes[0].name %> |
    Then the output should match:
      | Taints:\\s+node(.alpha)?.kubernetes.io\/unreachable:NoExecute |
    """
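
For readers who want to try the match in the last step outside the test framework, here is a minimal Python sketch. It assumes the step performs a regex search over the `describe` output and that the doubled backslash in the feature file stands for a single `\s`. The helper name and the sample lines are hypothetical, not taken from a real run.

```python
import re

# Pattern from the final step, with the feature-file "\\s" written as "\s".
# The unescaped dots match any character, exactly as in the original step.
TAINT_PATTERN = re.compile(
    r"Taints:\s+node(.alpha)?.kubernetes.io\/unreachable:NoExecute"
)


def has_unreachable_taint(describe_output: str) -> bool:
    """Return True if the describe output shows the unreachable NoExecute taint."""
    return TAINT_PATTERN.search(describe_output) is not None


# Hypothetical sample lines, shaped like the Taints field of a node description.
BETA_STYLE = "Taints:             node.kubernetes.io/unreachable:NoExecute"
ALPHA_STYLE = "Taints:             node.alpha.kubernetes.io/unreachable:NoExecute"
NO_TAINT = "Taints:             <none>"
```

The optional `(.alpha)?` group is what lets the same step pass whether the taint key carries the `alpha` segment or not.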


Actual results:
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason                     Message
  ----             ------    -----------------                 ------------------                ------                     -------
  OutOfDisk        False     Fri, 02 Feb 2018 06:17:03 +0000   Thu, 01 Feb 2018 06:01:45 +0000   KubeletHasSufficientDisk   kubelet has sufficient disk space available
  MemoryPressure   Unknown   Fri, 02 Feb 2018 06:17:03 +0000   Fri, 02 Feb 2018 06:17:45 +0000   NodeStatusUnknown          Kubelet stopped posting node status.
  DiskPressure     Unknown   Fri, 02 Feb 2018 06:17:03 +0000   Fri, 02 Feb 2018 06:17:45 +0000   NodeStatusUnknown          Kubelet stopped posting node status.
  Ready            Unknown   Fri, 02 Feb 2018 06:17:03 +0000   Fri, 02 Feb 2018 06:17:45 +0000   NodeStatusUnknown          Kubelet stopped posting node status.


Expected results:
OutOfDisk status should be updated to Unknown
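
The expectation can be checked mechanically. The following Python sketch scans a Conditions table like the one under "Actual results" and lists every condition whose status is not Unknown. The function name is hypothetical, and the parsing assumes the type and status are the first two whitespace-separated fields on each row, which holds for the table in this report.

```python
from typing import List

# Conditions table copied from the "Actual results" section of this report.
SAMPLE = """\
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason                     Message
  ----             ------    -----------------                 ------------------                ------                     -------
  OutOfDisk        False     Fri, 02 Feb 2018 06:17:03 +0000   Thu, 01 Feb 2018 06:01:45 +0000   KubeletHasSufficientDisk   kubelet has sufficient disk space available
  MemoryPressure   Unknown   Fri, 02 Feb 2018 06:17:03 +0000   Fri, 02 Feb 2018 06:17:45 +0000   NodeStatusUnknown          Kubelet stopped posting node status.
  DiskPressure     Unknown   Fri, 02 Feb 2018 06:17:03 +0000   Fri, 02 Feb 2018 06:17:45 +0000   NodeStatusUnknown          Kubelet stopped posting node status.
  Ready            Unknown   Fri, 02 Feb 2018 06:17:03 +0000   Fri, 02 Feb 2018 06:17:45 +0000   NodeStatusUnknown          Kubelet stopped posting node status.
"""


def non_unknown_conditions(conditions_block: str) -> List[str]:
    """Return the condition types whose status is anything other than Unknown."""
    offenders = []
    for line in conditions_block.splitlines():
        fields = line.split()
        # Skip blank lines and the "Conditions:" label (fewer than two fields).
        if len(fields) < 2:
            continue
        # Skip the column header and the dashed separator row.
        if fields[0] == "Type" or set(fields[0]) == {"-"}:
            continue
        cond_type, status = fields[0], fields[1]
        if status != "Unknown":
            offenders.append(cond_type)
    return offenders


print(non_unknown_conditions(SAMPLE))  # ['OutOfDisk']
```

An empty list would mean the node controller marked every condition Unknown, which is the expected result.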

Additional info:
Comment 1 Seth Jennings 2018-02-02 14:01:16 EST
Looks like it was originally correct:
https://github.com/kubernetes/kubernetes/pull/48983/files#diff-209cdbcfa0c2a92c871efbfdb362860bR960

This PR removed it:
https://github.com/kubernetes/kubernetes/pull/51294/files#diff-a44cd14a0381ca49bcbaaf967b3cc9b0L988

I think this just got caught up in a sweep to remove OutOfDisk from the controllers, and no one realized that we still need it until kube 1.10+, when OutOfDisk is planned to be removed entirely.
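
To illustrate the effect described above, here is a simplified Python model. It is not the Kubernetes node controller code: it assumes the controller walks a fixed list of condition types and sets each to Unknown once the kubelet stops posting status. All names are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical starting state of a healthy node.
HEALTHY = {
    "OutOfDisk": "False",
    "MemoryPressure": "False",
    "DiskPressure": "False",
    "Ready": "True",
}

# Assumed lists of condition types the controller overwrites; the first
# models the state after OutOfDisk was dropped, the second with it restored.
WITHOUT_OUT_OF_DISK = ["MemoryPressure", "DiskPressure", "Ready"]
WITH_OUT_OF_DISK = ["OutOfDisk", "MemoryPressure", "DiskPressure", "Ready"]


def mark_conditions_unknown(
    conditions: Dict[str, str], types_to_mark: Iterable[str]
) -> Dict[str, str]:
    """Return a copy of conditions with every listed type set to Unknown."""
    updated = dict(conditions)
    for cond_type in types_to_mark:
        updated[cond_type] = "Unknown"
    return updated


# A type missing from the list keeps its last reported status.
stale = mark_conditions_unknown(HEALTHY, WITHOUT_OUT_OF_DISK)
fixed = mark_conditions_unknown(HEALTHY, WITH_OUT_OF_DISK)
```

In this model `stale["OutOfDisk"]` remains `"False"`, matching the actual results in the description, while `fixed` has every condition set to `"Unknown"`.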
Comment 2 Seth Jennings 2018-02-02 14:22:56 EST
Kube PR:
https://github.com/kubernetes/kubernetes/pull/59279
Comment 3 Seth Jennings 2018-02-02 14:36:34 EST
Origin PR:
https://github.com/openshift/origin/pull/18417
Comment 5 weiwei jiang 2018-02-22 00:10:21 EST
Checked with 
# openshift version 
openshift v3.9.0-0.47.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

and cannot reproduce this now.
Comment 8 errata-xmlrpc 2018-03-28 10:25:43 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
