Bug 1891460 - set invalid value for evictionHard and evictionSoft parameters in kubeletconfig should prompt error
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Harshal Patil
QA Contact: MinLi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-10-26 10:55 UTC by MinLi
Modified: 2021-01-28 03:48 UTC (History)
4 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-28 03:48:39 UTC
Target Upstream Version:




Links
- GitHub openshift/machine-config-operator pull 2182 (closed): Bug 1891460: kubelet: add eviction hard validation (last updated 2021-02-03 08:23:22 UTC)
- GitHub openshift/machine-config-operator pull 2314 (closed): Bug 1891460: KubeletConfig: Allow only positive values for KubeReserved, SystemReserved, EvictionHard and EvictionSoft (last updated 2021-02-03 08:23:22 UTC)

Description MinLi 2020-10-26 10:55:58 UTC
Description of problem:
Setting an invalid value for the evictionHard or evictionSoft parameters in a KubeletConfig should produce an error; otherwise the kubelet gets stuck in a restart loop.


Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-10-24-155529

How reproducible:
always

Steps to Reproduce:
1.$ oc label mcp worker custom-kubelet=hard-eviction

2.$ oc create -f kubelet-hardeviction.yaml
kubelet-hardeviction.yaml:
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet-hard-eviction
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: hard-eviction
  kubeletConfig:
    evictionHard:
      memory.available: "0Mi"

3.$ oc get kubeletconfig custom-kubelet-hard-eviction -o yaml 
...
spec:
  kubeletConfig:
    evictionHard:
      memory.available: 0Mi
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: hard-eviction
status:
  conditions:
  - lastTransitionTime: "2020-10-26T07:59:45Z"
    message: Success
    status: "True"
    type: Success


Actual results:
3. The KubeletConfig status reports Success.


Expected results:
3. The KubeletConfig status should report failure: the evictionHard threshold memory.available must be positive.
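For illustration, here is a hedged sketch in Go of the kind of check that could reject a non-positive threshold before it ever reaches the kubelet. The regex and function names are hypothetical, and the regex is a simplified stand-in for the real Kubernetes quantity parser, not the MCO's actual code:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// quantityRe loosely matches a Kubernetes-style quantity ("500Mi", "1Gi")
// or a percentage ("5%"); a simplified stand-in for resource.ParseQuantity.
var quantityRe = regexp.MustCompile(`^(-?[0-9]+(?:\.[0-9]+)?)(%|[KMGTPE]i?)?$`)

// validateEvictionThreshold rejects values that do not parse or are not
// strictly positive, mirroring the kubelet's own startup check.
func validateEvictionThreshold(signal, value string) error {
	m := quantityRe.FindStringSubmatch(strings.TrimSpace(value))
	if m == nil {
		return fmt.Errorf("invalid value specified for %s in evictionHard: %q", signal, value)
	}
	n, err := strconv.ParseFloat(m[1], 64)
	if err != nil || n <= 0 {
		return fmt.Errorf("eviction threshold %s must be positive: %s", signal, value)
	}
	return nil
}

func main() {
	fmt.Println(validateEvictionThreshold("memory.available", "0Mi"))   // rejected: not positive
	fmt.Println(validateEvictionThreshold("memory.available", "500Mi")) // accepted
}
```

Running such a check at KubeletConfig admission time would surface the failure in the object's status conditions instead of degrading the node.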

Additional info:
$ oc get mcp 
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-0e9308d17dbb451f9b60d4c1298aa9f7   True      False      False      3              3                   3                     0                      5h46m
worker   rendered-worker-28c915fa97c5c0fc9e8ae457ee36377e   False     True       False      3              0                   0                     0                      5h46m

$ oc get node 
NAME                                         STATUS                        ROLES    AGE   VERSION
ip-10-0-136-237.us-east-2.compute.internal   Ready                         worker   22h   v1.19.0+d59ce34
ip-10-0-152-247.us-east-2.compute.internal   Ready                         master   22h   v1.19.0+d59ce34
ip-10-0-173-0.us-east-2.compute.internal     Ready                         master   22h   v1.19.0+d59ce34
ip-10-0-186-217.us-east-2.compute.internal   NotReady,SchedulingDisabled   worker   22h   v1.19.0+d59ce34
ip-10-0-194-180.us-east-2.compute.internal   Ready                         master   22h   v1.19.0+d59ce34
ip-10-0-202-245.us-east-2.compute.internal   Ready                         worker   22h   v1.19.0+d59ce34

[core@ip-10-0-186-217 ~]$ systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-mco-default-env.conf
   Active: activating (auto-restart) (Result: exit-code) since Tue 2020-09-22 09:43:01 UTC; 5s ago
  Process: 8709 ExecStart=/usr/bin/hyperkube kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/va>
  Process: 8707 ExecStartPre=/bin/rm -f /var/lib/kubelet/cpu_manager_state (code=exited, status=0/SUCCESS)
  Process: 8705 ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests (code=exited, status=0/SUCCESS)
 Main PID: 8709 (code=exited, status=255)
      CPU: 419ms


The node stayed NotReady indefinitely, the kubelet was stuck in a restart loop, and the log showed:
Oct 26 08:02:03 ip-10-0-186-217 hyperkube[1520]: F1020 08:02:03.460352    1520 server.go:265] failed to run Kubelet: eviction threshold memory.available must be positive: 0
Oct 26 08:02:03 ip-10-0-186-217 systemd[1]: Failed to start Kubernetes Kubelet.

Comment 1 MinLi 2020-10-26 11:01:34 UTC
I also tried other invalid values, including negative values, non-numeric strings, and values greater than the memory capacity; all led to the same problem.

Comment 3 MinLi 2020-11-11 02:43:00 UTC
Reproduced on version 4.7.0-0.nightly-2020-11-09-235738.

Comment 7 MinLi 2021-01-06 12:16:38 UTC
Hi, Harshal Patil

By non-numeric values I mean strings such as "&#jk789" and "89hu*7.8"; I think these are also invalid values.

Comment 10 MinLi 2021-01-12 07:38:43 UTC
Not fixed in version 4.7.0-0.nightly-2021-01-10-070949. I found a few errors, as below:

1) The validation reports "imagefs.inodesFree" as an unknown resource in evictionHard, even though it is a valid eviction signal:
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: 20Gi
      imagefs.inodesFree: 5%
      memory.available: 0Mi
      nodefs.available: 5%
      nodefs.inodesFree: 4%
    evictionPressureTransitionPeriod: 5s
    imageGCHighThresholdPercent: 80
    imageGCLowThresholdPercent: 75
    imageMinimumGCAge: 5m
    maxPods: 240
    podsPerCore: 80
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
status:
  conditions:
  - lastTransitionTime: "2021-01-12T07:05:34Z"
    message: 'Error: KubeletConfiguration: unknown resource imagefs.inodesFree defined
      in evictionHard'
    status: "False"
    type: Failure

2) It should report "imagefs.available" as invalid, not "nodefs.available":
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: '*asd9Gi'
      memory.available: 300Mi
      nodefs.available: 5%
      nodefs.inodesFree: 4%
    evictionPressureTransitionPeriod: 5s
    imageGCHighThresholdPercent: 80
    imageGCLowThresholdPercent: 75
    imageMinimumGCAge: 5m
    maxPods: 240
    podsPerCore: 80
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
status:
  conditions:
  - lastTransitionTime: "2021-01-12T07:18:44Z"
    message: 'Error: KubeletConfiguration: invalid value specified for nodefs.available
      reservation in evictionHard, 5%'
    status: "False"
    type: Failure

3) It should report "nodefs.inodesFree: 0%" as invalid, not "nodefs.available: 5%":
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: 9Gi
      memory.available: 300Mi
      nodefs.available: 5%
      nodefs.inodesFree: 0%
    evictionPressureTransitionPeriod: 5s
    imageGCHighThresholdPercent: 80
    imageGCLowThresholdPercent: 75
    imageMinimumGCAge: 5m
    maxPods: 240
    podsPerCore: 80
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
status:
  conditions:
  - lastTransitionTime: "2021-01-12T07:21:54Z"
    message: 'Error: KubeletConfiguration: invalid value specified for nodefs.available
      reservation in evictionHard, 5%'
    status: "False"
    type: Failure


evictionSoft has similar issues.

