Bug 1891460
| Summary: | set invalid value for evictionHard and evictionSoft parameters in kubeletconfig should prompt error | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | MinLi <minmli> |
| Component: | Node | Assignee: | Harshal Patil <harpatil> |
| Node sub component: | Kubelet | QA Contact: | MinLi <minmli> |
| Status: | CLOSED WONTFIX | Docs Contact: | |
| Severity: | medium | ||
| Priority: | unspecified | CC: | aos-bugs, harpatil, jokerman, tsweeney |
| Version: | 4.7 | Keywords: | UpcomingSprint |
| Target Milestone: | --- | ||
| Target Release: | 4.8.0 | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-01-28 03:48:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
I also tried other invalid values, including negative values, non-numeric values, and values greater than the memory capacity. All of them lead to the same problem.

Reproduced on version: 4.7.0-0.nightly-2020-11-09-235738

Hi Harshal Patil, by a non-numeric value I mean a string such as "&#jk789" or "89hu*7.8". I think these are also invalid values.

Not fixed in version 4.7.0-0.nightly-2021-01-10-070949. I found a few errors, listed below:
1) The kubelet treats "imagefs.inodesFree" as an unknown resource defined in evictionHard:
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: 20Gi
      imagefs.inodesFree: 5%
      memory.available: 0Mi
      nodefs.available: 5%
      nodefs.inodesFree: 4%
    evictionPressureTransitionPeriod: 5s
    imageGCHighThresholdPercent: 80
    imageGCLowThresholdPercent: 75
    imageMinimumGCAge: 5m
    maxPods: 240
    podsPerCore: 80
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
status:
  conditions:
  - lastTransitionTime: "2021-01-12T07:05:34Z"
    message: 'Error: KubeletConfiguration: unknown resource imagefs.inodesFree defined
      in evictionHard'
    status: "False"
    type: Failure
2) The error should report "imagefs.available" as invalid, not "nodefs.available":
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: '*asd9Gi'
      memory.available: 300Mi
      nodefs.available: 5%
      nodefs.inodesFree: 4%
    evictionPressureTransitionPeriod: 5s
    imageGCHighThresholdPercent: 80
    imageGCLowThresholdPercent: 75
    imageMinimumGCAge: 5m
    maxPods: 240
    podsPerCore: 80
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
status:
  conditions:
  - lastTransitionTime: "2021-01-12T07:18:44Z"
    message: 'Error: KubeletConfiguration: invalid value specified for nodefs.available
      reservation in evictionHard, 5%'
    status: "False"
    type: Failure
3) The error should report "nodefs.inodesFree: 0%" as invalid, not "nodefs.available: 5%":
spec:
  kubeletConfig:
    evictionHard:
      imagefs.available: 9Gi
      memory.available: 300Mi
      nodefs.available: 5%
      nodefs.inodesFree: 0%
    evictionPressureTransitionPeriod: 5s
    imageGCHighThresholdPercent: 80
    imageGCLowThresholdPercent: 75
    imageMinimumGCAge: 5m
    maxPods: 240
    podsPerCore: 80
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
status:
  conditions:
  - lastTransitionTime: "2021-01-12T07:21:54Z"
    message: 'Error: KubeletConfiguration: invalid value specified for nodefs.available
      reservation in evictionHard, 5%'
    status: "False"
    type: Failure
evictionSoft has a similar issue.
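
The three cases above share one expectation: each threshold is checked on its own, and the error names the key that is actually wrong. A minimal sketch of that behaviour follows. It is not the controller's or the kubelet's implementation. The function names are hypothetical, the list of known signals is an assumption based on the upstream kubelet eviction documentation, the quantity grammar is a simplified subset of Kubernetes resource quantities, and rejecting 0% follows the reporter's expectation in case 3 rather than confirmed kubelet behaviour.

```python
import re

# Assumed set of eviction signals (from upstream kubelet documentation).
KNOWN_SIGNALS = {
    "memory.available",
    "nodefs.available",
    "nodefs.inodesFree",
    "imagefs.available",
    "imagefs.inodesFree",
    "pid.available",
}

# Simplified quantity grammar: a non-negative decimal plus an optional suffix.
QUANTITY_RE = re.compile(r"^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|Pi|Ei|k|M|G|T|P|E)?$")
PERCENT_RE = re.compile(r"^(-?\d+(?:\.\d+)?)%$")


def validate_threshold(signal, value):
    """Return an error string for one threshold, or None if it looks valid."""
    if signal not in KNOWN_SIGNALS:
        return f"unknown resource {signal}"
    text = str(value).strip()
    pct = PERCENT_RE.match(text)
    if pct:
        number = float(pct.group(1))
        # Assumption: 0% is rejected, as the reporter expects in case 3.
        if not 0 < number <= 100:
            return f"{signal}: percentage must be in (0%, 100%], got {text}"
        return None
    qty = QUANTITY_RE.match(text)
    if not qty:
        return f"{signal}: cannot parse {text!r} as a quantity or percentage"
    if float(qty.group(1)) <= 0:
        return f"{signal}: must be positive, got {text}"
    return None


def validate_eviction(thresholds):
    """Check every entry and report each problem under its own key."""
    errors = []
    for signal, value in thresholds.items():
        problem = validate_threshold(signal, value)
        if problem:
            errors.append(problem)
    return errors
```

Applied to case 2, `validate_eviction` would return a single error that starts with `imagefs.available`, leaving the valid `nodefs.available: 5%` entry unreported.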
Description of problem:
Setting an invalid value for the evictionHard or evictionSoft parameters in a kubeletconfig should produce an error. Otherwise the kubelet gets stuck in a restart loop.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-10-24-155529

How reproducible:
Always

Steps to Reproduce:
1. $ oc label mcp worker custom-kubelet=hard-eviction
2. $ oc create -f kubelet-hardeviction.yaml

kubelet-hardeviction.yaml:
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet-hard-eviction
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: hard-eviction
  kubeletConfig:
    evictionHard:
      memory.available: "0Mi"

3. $ oc get kubeletconfig custom-kubelet-hard-eviction -o yaml
...
spec:
  kubeletConfig:
    evictionHard:
      memory.available: 0Mi
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: hard-eviction
status:
  conditions:
  - lastTransitionTime: "2020-10-26T07:59:45Z"
    message: Success
    status: "True"
    type: Success

Actual results:
3. The kubeletconfig's description shows that the setting succeeded.

Expected results:
3. The kubeletconfig's description shows that the setting failed, because the evictionHard threshold memory.available must be positive.
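
Step 3 relies on reading the status conditions of the KubeletConfig. As a rough sketch of how that check could be scripted, the snippet below takes the object as an already-parsed dictionary (for example, loaded from the `-o yaml` or `-o json` output) and reports whether the most recent condition is a success. The function names are hypothetical, and the field layout is assumed to match the output shown in this report.

```python
def latest_condition(kubeletconfig):
    """Return the most recent status condition, or None if there is none."""
    conditions = kubeletconfig.get("status", {}).get("conditions", [])
    if not conditions:
        return None
    # Timestamps in one fixed UTC format ("2020-10-26T07:59:45Z") sort
    # correctly as plain strings.
    return max(conditions, key=lambda c: c["lastTransitionTime"])


def was_accepted(kubeletconfig):
    """True only if the latest condition is type Success with status "True"."""
    cond = latest_condition(kubeletconfig)
    return (
        cond is not None
        and cond.get("type") == "Success"
        and cond.get("status") == "True"
    )
```

For the object in step 3, `was_accepted` returns True even though `memory.available: 0Mi` later prevents the kubelet from starting, which is the gap this bug describes.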
Additional info:
$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-0e9308d17dbb451f9b60d4c1298aa9f7   True      False      False      3              3                   3                     0                      5h46m
worker   rendered-worker-28c915fa97c5c0fc9e8ae457ee36377e   False     True       False      3              0                   0                     0                      5h46m

$ oc get node
NAME                                         STATUS                        ROLES    AGE   VERSION
ip-10-0-136-237.us-east-2.compute.internal   Ready                         worker   22h   v1.19.0+d59ce34
ip-10-0-152-247.us-east-2.compute.internal   Ready                         master   22h   v1.19.0+d59ce34
ip-10-0-173-0.us-east-2.compute.internal     Ready                         master   22h   v1.19.0+d59ce34
ip-10-0-186-217.us-east-2.compute.internal   NotReady,SchedulingDisabled   worker   22h   v1.19.0+d59ce34
ip-10-0-194-180.us-east-2.compute.internal   Ready                         master   22h   v1.19.0+d59ce34
ip-10-0-202-245.us-east-2.compute.internal   Ready                         worker   22h   v1.19.0+d59ce34

[core@ip-10-0-186-217 ~]$ systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-mco-default-env.conf
   Active: activating (auto-restart) (Result: exit-code) since Tue 2020-09-22 09:43:01 UTC; 5s ago
  Process: 8709 ExecStart=/usr/bin/hyperkube kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/va>
  Process: 8707 ExecStartPre=/bin/rm -f /var/lib/kubelet/cpu_manager_state (code=exited, status=0/SUCCESS)
  Process: 8705 ExecStartPre=/bin/mkdir --parents /etc/kubernetes/manifests (code=exited, status=0/SUCCESS)
 Main PID: 8709 (code=exited, status=255)
      CPU: 419ms

The node stayed in NotReady status indefinitely and the kubelet was stuck in a restart loop. The log showed:
Oct 26 08:02:03 ip-10-0-186-217 hyperkube[1520]: F1020 08:02:03.460352 1520 server.go:265] failed to run Kubelet: eviction threshold memory.available must be positive: 0
Oct 26 08:02:03 ip-10-0-186-217 systemd[1]: Failed to start Kubernetes Kubelet.
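
When a node is stuck like this, the reason for the restart loop is in the fatal log line. The sketch below pulls that reason out of journal text. It assumes the message format shown in the log above ("failed to run Kubelet: ...") and the function name is hypothetical. Other kubelet versions may word the message differently.

```python
import re

# Matches the fatal startup message as it appears in the log above.
FATAL_RE = re.compile(r"failed to run Kubelet: (?P<reason>.+)$")


def kubelet_failure_reasons(log_text):
    """Return the distinct startup failure reasons found in the log text."""
    reasons = []
    for line in log_text.splitlines():
        match = FATAL_RE.search(line)
        if match:
            reason = match.group("reason").strip()
            # A restart loop repeats the same line, so report it only once.
            if reason not in reasons:
                reasons.append(reason)
    return reasons
```

Fed the two log lines above, the function returns one entry, "eviction threshold memory.available must be positive: 0", which points directly at the offending evictionHard key.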