Description of problem:
Error updating node status when the resource reservation is set larger than the node capacity.

Version-Release number of selected component (if applicable):
openshift v3.6.84
kubernetes v1.6.1+5115d708d7

How reproducible:
Always

Steps to Reproduce:
1. Set reservation values greater than the node capacity, then restart the atomic-openshift-node service (the restart succeeds).
kubeletArguments:
  system-reserved:
  - "cpu=200,memory=1000G"
  kube-reserved:
  - "cpu=200,memory=1000G"
2. Check node status with `oc get node` and `oc describe node`.

Actual results:
2. Checked node status with `oc get node` and `oc describe node`: the node is NotReady and the Allocatable values did not change.

[root@host-8-175-81 ~]# oc get node host-8-175-189.host.centralci.eng.rdu2.redhat.com
NAME                                                STATUS     AGE       VERSION
host-8-175-189.host.centralci.eng.rdu2.redhat.com   NotReady   1h        v1.6.1+5115d708d7

Name:               host-8-175-189.host.centralci.eng.rdu2.redhat.com
Role:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=host-8-175-189.host.centralci.eng.rdu2.redhat.com
                    registry=enabled
                    role=node
                    router=enabled
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             <none>
CreationTimestamp:  Wed, 24 May 2017 21:48:42 -0400
Phase:
Conditions:
  Type            Status   LastHeartbeatTime                LastTransitionTime               Reason             Message
  ----            ------   -----------------                ------------------               ------             -------
  OutOfDisk       Unknown  Wed, 24 May 2017 23:13:19 -0400  Wed, 24 May 2017 23:14:01 -0400  NodeStatusUnknown  Kubelet stopped posting node status.
  MemoryPressure  Unknown  Wed, 24 May 2017 23:13:19 -0400  Wed, 24 May 2017 23:14:01 -0400  NodeStatusUnknown  Kubelet stopped posting node status.
  DiskPressure    Unknown  Wed, 24 May 2017 23:13:19 -0400  Wed, 24 May 2017 23:14:01 -0400  NodeStatusUnknown  Kubelet stopped posting node status.
  Ready           Unknown  Wed, 24 May 2017 23:13:19 -0400  Wed, 24 May 2017 23:14:01 -0400  NodeStatusUnknown  Kubelet stopped posting node status.
Addresses:          10.8.175.189,10.8.175.189,host-8-175-189.host.centralci.eng.rdu2.redhat.com
Capacity:
 cpu:     2
 memory:  3881920Ki
 pods:    250
Allocatable:
 cpu:     2
 memory:  3779520Ki
 pods:    250
System Info:
 Machine ID:                 1754a4957d8a442c8d2362df57fa5626
 System UUID:                E9A93021-F312-4F49-B47D-6488A09656B8
 Boot ID:                    479bd1fc-9ca4-4da7-92d4-4d143d437e0a
 Kernel Version:             3.10.0-514.10.2.el7.x86_64
 OS Image:                   Red Hat Enterprise Linux Server 7.3 (Maipo)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.12.6
 Kubelet Version:            v1.6.1+5115d708d7
 Kube-Proxy Version:         v1.6.1+5115d708d7
ExternalID:         host-8-175-189.host.centralci.eng.rdu2.redhat.com
Non-terminated Pods: (4 in total)
  Namespace     Name                           CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------     ----                           ------------  ----------  ---------------  -------------
  default       docker-registry-1-5rftx        100m (5%)     0 (0%)      256Mi (6%)       0 (0%)
  default       router-1-gc006                 100m (5%)     0 (0%)      256Mi (6%)       0 (0%)
  install-test  cakephp-mysql-example-1-q0crb  0 (0%)        0 (0%)      512Mi (13%)      512Mi (13%)
  install-test  mysql-1-zb4jr                  0 (0%)        0 (0%)      512Mi (13%)      512Mi (13%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  200m (10%)    0 (0%)      1536Mi (41%)     1Gi (27%)
Events:
  FirstSeen  LastSeen  Count  From                                                        SubObjectPath  Type     Reason                   Message
  ---------  --------  -----  ----                                                        -------------  -------- ------                   -------
  1h         1h        1      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Normal   Starting                 Starting kubelet.
  1h         1h        1      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Warning  ImageGCFailed            unable to find data for container /
  1h         1h        2      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Normal   NodeHasSufficientDisk    Node host-8-175-189.host.centralci.eng.rdu2.redhat.com status is now: NodeHasSufficientDisk
  1h         1h        2      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Normal   NodeHasSufficientMemory  Node host-8-175-189.host.centralci.eng.rdu2.redhat.com status is now: NodeHasSufficientMemory
  1h         1h        2      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Normal   NodeHasNoDiskPressure    Node host-8-175-189.host.centralci.eng.rdu2.redhat.com status is now: NodeHasNoDiskPressure
  1h         1h        1      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Normal   NodeReady                Node host-8-175-189.host.centralci.eng.rdu2.redhat.com status is now: NodeReady
  58m        58m       1      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Normal   Starting                 Starting kubelet.
  58m        58m       1      kubelet, host-8-175-189.host.centralci.eng.rdu2.redhat.com                 Warning  ImageGCFailed            unable to find data for container /

Expected results:
2. When checking node status with `oc get node` and `oc describe node`, the node should be Ready, and both cpu and memory under Allocatable should be "0". For example:
Capacity:
 alpha.kubernetes.io/nvidia-gpu:  0
 cpu:                             2
 memory:                          3881920Ki
 pods:                            250
Allocatable:
 alpha.kubernetes.io/nvidia-gpu:  0
 cpu:                             0
 memory:                          0
 pods:                            250

Additional info:
1. A comparison test on oc v3.5.5.15 gives the expected results with the same reproduction steps.
2. Below is the detailed error from the logs on OCP 3.6:
May 25 01:39:19 host-8-175-189 journal: E0525 01:39:19.594371 22779 kubelet_node_status.go:357] Error updating node status, will retry: failed to patch status "{\"status\":{\"allocatable\":{\"cpu\":\"-398\",\"memory\":\"-1949345480Ki\"},\"conditions\":[{\"lastHeartbeatTime\":\"2017-05-25T05:39:19Z\",\"lastTransitionTime\":\"2017-05-25T05:39:19Z\",\"message\":\"kubelet has no disk pressure\",\"reason\":\"KubeletHasNoDiskPressure\",\"status\":\"False\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2017-05-25T05:39:19Z\",\"lastTransitionTime\":\"2017-05-25T05:39:19Z\",\"message\":\"kubelet has sufficient memory available\",\"reason\":\"KubeletHasSufficientMemory\",\"status\":\"False\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2017-05-25T05:39:19Z\",\"lastTransitionTime\":\"2017-05-25T05:39:19Z\",\"message\":\"kubelet has sufficient disk space available\",\"reason\":\"KubeletHasSufficientDisk\",\"status\":\"False\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2017-05-25T05:39:19Z\",\"lastTransitionTime\":\"2017-05-25T05:39:19Z\",\"message\":\"kubelet is posting ready status\",\"reason\":\"KubeletReady\",\"status\":\"True\",\"type\":\"Ready\"}]}}" for node "host-8-175-189.host.centralci.eng.rdu2.redhat.com": Node "host-8-175-189.host.centralci.eng.rdu2.redhat.com" is invalid: [status.allocatable.cpu: Invalid value: "-398": must be greater than or equal to 0, status.allocatable.memory: Invalid value: "-1949345480Ki": must be greater than or equal to 0]
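
The negative values in the rejected patch can be reproduced by hand from the numbers in this report. The sketch below is only an illustration of that arithmetic, not the kubelet code and not necessarily what the linked PRs do. The function name allocatable and the 102400Ki term are assumptions: 102400Ki is the gap between Capacity (3881920Ki) and Allocatable (3779520Ki) in the describe output above, presumably a 100Mi hard eviction threshold. Flooring at zero mirrors the expected result stated in this report.

```python
KI = 1024  # bytes per Ki


def allocatable(capacity, *reservations):
    """Return (raw, floored): capacity minus all reservations, and the same
    value floored at zero (hypothetical helper, not a kubelet function)."""
    raw = capacity - sum(reservations)
    return raw, max(raw, 0)


# CPU in cores: capacity 2, system-reserved 200, kube-reserved 200.
cpu_raw, cpu_alloc = allocatable(2, 200, 200)

# Memory in Ki: "1000G" is a decimal quantity, 1000 * 10**9 bytes.
reserved_ki = 1000 * 10**9 // KI  # 976562500 Ki per reservation
# 102400 Ki is assumed to be the eviction threshold (Capacity - Allocatable).
mem_raw, mem_alloc = allocatable(3881920, reserved_ki, reserved_ki, 102400)

print(cpu_raw, cpu_alloc)  # -398 0
print(mem_raw, mem_alloc)  # -1949345480 0
```

The raw values match the "-398" and "-1949345480Ki" that the API server rejects in the log above; the floored values match the "0" given under Expected results.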
Upstream PR: https://github.com/kubernetes/kubernetes/pull/46516
Origin PR: https://github.com/openshift/origin/pull/14379
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716