Description of problem: Node reservation argument should not be negative or invalid value. Version-Release number of selected component (if applicable): openshift v3.2.0.6 kubernetes v1.2.0-36-g4a3f9c5 etcd 2.2.5 How reproducible: Always Steps to Reproduce: 1.Set reservation with negative value. kubeletArguments: system-reserved: - "cpu=-100,memory=-2G" kube-reserved: - "cpu=-100m,memory=-3G" 2.Restart node and check node status allocatable: cpu: 102100m memory: "8975499776" pods: "110" capacity: cpu: "2" memory: 3882324Ki pods: "110" 3.Set reservation with invalid value kubeletArguments: system-reserved: - "cpu=20wx,memory=100Gm" kube-reserved: - "cpu=20wx,memory=100Gm" 4.Restart node and check node status allocatable: cpu: "0" memory: "0" pods: "110" capacity: cpu: "2" memory: 3882324Ki pods: "110" 5.Set reservation > node capacity kubeletArguments: system-reserved: - "cpu=20,memory=100G" kube-reserved: - "cpu=20,memory=100G" 6.Restart node and check node status allocatable: cpu: "0" memory: "0" pods: "110" capacity: cpu: "2" memory: 3882324Ki pods: "110" Actual results: 2.allocatable > capacity 4.allocatable = 0 6.allocatable = 0 Expected results: 2.Should restart node fail and tip the value should not be negative allocatable should be never larger than capacity. 4.Should fail restart node, tip invalid value 6 should fail restart node(the result for step6 may reasonable,not confirm) Additional info:
Based on the regexp for ResourceQuantity, negative values are allowed. I believe there should be a check the resource is non-negative. Let me check the code again.
It is a bug. The quantity is parsed but not checked for negativity. Once set, negative value is subtracted from capacity resulting in addition. Thus, allocatable can grow over node's actual capacity. [1] https://github.com/kubernetes/kubernetes/commit/e2ffd007f7794f06b9a2f39d62280bd974d747c6
For the values with invalid units (wx, Gm): if a unit is not recognized, quantity unit parser automatically returns error. So kubelet should fail with "unable to parse quantity's suffix". The last usecase is correct. Allocated resource for both cpu and memory computed by the formula is negative. Thus set to 0.
For the negative values [1] [1] https://github.com/kubernetes/kubernetes/pull/23379
Merged upstream
Just create unit test for testing invalid resource quantity unit and the parse reports "unable to parse quantity's suffix". go test -v k8s.io/kubernetes/cmd/kubelet/app === RUN TestValueOfAllocatableResources --- PASS: TestValueOfAllocatableResources (0.00s) server_test.go:39: err: "resource quantity for \"memory\" cannot be negative: -150G" server_test.go:39: err: "unable to parse quantity's suffix" PASS ok k8s.io/kubernetes/cmd/kubelet/app 0.020s
Upstream PR here [1] https://github.com/kubernetes/kubernetes/pull/23627
> 3.Set reservation with invalid value > kubeletArguments: > system-reserved: > - "cpu=20wx,memory=100Gm" > kube-reserved: > - "cpu=20wx,memory=100Gm" go test -v k8s.io/kubernetes/cmd/kubelet/app === RUN TestValueOfAllocatableResources --- PASS: TestValueOfAllocatableResources (0.00s) server_test.go:47: Returned err: "resource quantity for \"memory\" cannot be negative: -150G" server_test.go:47: Returned err: "quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'" PASS ok k8s.io/kubernetes/cmd/kubelet/app 0.329s
Upstream PR has merged. Changing to POST
Merged in the origin master [1]. What is the next step, close it as the current release? [1] https://github.com/openshift/origin/blob/master/Godeps/_workspace/src/k8s.io/kubernetes/cmd/kubelet/app/server.go#L919
(In reply to Jan Chaloupka from comment #11) > Merged in the origin master [1]. What is the next step, close it as the > current release? > > [1] > https://github.com/openshift/origin/blob/master/Godeps/_workspace/src/k8s.io/ > kubernetes/cmd/kubelet/app/server.go#L919 you can set ON_QA, then QE will verify this bug.
This bug is against OSE, not Origin. Set it to MODIFIED and Productization will set it to ON_QA once there's a new puddle with the fix to be tested.
This has been merged into ose and is in OSE v3.3.0.28 or newer.
Verified. openshift v3.3.0.28 kubernetes v1.3.0+507d3a7 etcd 2.3.0+git results: 2. restart node fail 9月 01 23:54:19 qe-wmeng-master-1 atomic-openshift-node[52214]: F0901 23:54:19.260774 52214 start_node.go:126] resource quantity for "cpu" cannot be negative: -100m 4. restart node fail 9月 01 23:48:10 qe-wmeng-master-1 atomic-openshift-node[51413]: F0901 23:48:10.320714 51413 start_node.go:126] quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$' 6. restart node OK Capacity: alpha.kubernetes.io/nvidia-gpu: 0 cpu: 1 memory: 3624520Ki pods: 110 Allocatable: alpha.kubernetes.io/nvidia-gpu: 0 cpu: 0 memory: 0 pods: 110
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1933