Bug 1904133
Summary: | KubeletConfig flooded with failure conditions | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Artyom <alukiano>
Component: | Node | Assignee: | Qi Wang <qiwan>
Node sub component: | Kubelet | QA Contact: | MinLi <minmli>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | medium | |
Priority: | unspecified | CC: | aos-bugs, fromani, jerzhang, kgarriso, manyayad, mapfelba, msivak, qiwan, tsweeney
Version: | 4.7 | Keywords: | Reopened, UpcomingSprint
Target Milestone: | --- | |
Target Release: | 4.7.0 | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | Cause: The condition message was not checked before a new failure condition was appended, so a failure condition with the same message appeared every minute. <br> Fix: Do not add a new failure condition if the failure message stays the same. <br> Result: The same error condition appears only once, with its timestamp updated. | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2021-02-24 15:37:28 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description (Artyom, 2020-12-03 16:06:15 UTC)
This is controlled by the kubeletconfigcontroller, which today adds the error to the queue directly, I believe. Other sub-controllers should be updating the error state instead. Passing off to the node team to take a look.

---

There is a similar bug in 4.6 which has been fixed: https://bugzilla.redhat.com/show_bug.cgi?id=1849538

The results:

```
Status:
  Conditions:
    Last Transition Time:  2021-01-13T19:48:16Z
    Message:               Error: could not find any MachineConfigPool set for KubeletConfig
    Status:                False
    Type:                  Failure
Events:                    <none>
```

```
Client Version: v4.2.0-alpha.0-930-geeb9d6d
Server Version: 4.7.0-0.ci-2021-01-13-131322
Kubernetes Version: v1.20.0-983+31b56ef6b1cf67-dirty
```

The fix https://github.com/openshift/machine-config-operator/pull/1859 for 4.6 should have fixed this bug. Close this one since it's a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1849538.

*** This bug has been marked as a duplicate of bug 1849538 ***

---

I can still see the bug on

```
$ oc version
Client Version: 4.6.0-0.nightly-2020-07-25-091217
Server Version: 4.7.0-fc.3
Kubernetes Version: v1.20.0+d9c52cc
```

```
Status:
  Conditions:
    Last Transition Time:  2021-01-25T10:32:33Z
    Message:               Error: could not find any MachineConfigPool set for KubeletConfig
    Status:                False
    Type:                  Failure
    Last Transition Time:  2021-01-25T10:33:33Z
    Message:               Error: could not find any MachineConfigPool set for KubeletConfig
    Status:                False
    Type:                  Failure
    Last Transition Time:  2021-01-25T10:34:33Z
    Message:               Error: could not find any MachineConfigPool set for KubeletConfig
    Status:                False
    Type:                  Failure
    Last Transition Time:  2021-01-25T10:35:33Z
    Message:               Error: could not find any MachineConfigPool set for KubeletConfig
    Status:                False
    Type:                  Failure
Events:                    <none>
```

---

I will check whether the fix is included in that MCO version.

---

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-max-pods
spec:
  logLevel: 5
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods
  kubeletConfig:
    maxPods: 100
```

Apply the above configuration:

```
$ oc create -f kubeconfig.yaml
kubeletconfig.machineconfiguration.openshift.io/set-max-pods created

$ oc describe kubeletconfig.machineconfiguration.openshift.io/set-max-pods
Status:
  Conditions:
    Last Transition Time:  2021-01-25T17:05:04Z
    Message:               Error: could not find any MachineConfigPool set for KubeletConfig
    Status:                False
    Type:                  Failure
Events:                    <none>

$ oc version
Client Version: v4.2.0-alpha.0-930-geeb9d6d
Server Version: 4.7.0-fc.3
Kubernetes Version: v1.20.0+d9c52cc
```

@alukiano I failed to reproduce this bug. Can I have more details about the steps to reproduce?

---

How long did you wait? A new condition appears every minute. I applied your config and got the same problem:

```
Conditions:
  Last Transition Time:  2021-01-26T15:41:09Z
  Message:               Error: could not find any MachineConfigPool set for KubeletConfig
  Status:                False
  Type:                  Failure
  Last Transition Time:  2021-01-26T15:42:09Z
  Message:               Error: could not find any MachineConfigPool set for KubeletConfig
  Status:                False
  Type:                  Failure
  ....
```

---

(In reply to Artyom from comment #9)
> How long did you wait? A new condition appears every minute. I applied your
> config and got the same problem

Thanks, I saw the conditions after several minutes.
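For context, the "could not find any MachineConfigPool set for KubeletConfig" message in the reproducer above is expected on its own: the KubeletConfig selects pools through `machineConfigPoolSelector`, and nothing in the steps labels a pool with `custom-kubelet: small-pods`. Assuming the stock `worker` pool, something like the following would make the selector match; the bug here is only that the unmatched-selector condition kept being re-appended:

```
$ oc label machineconfigpool worker custom-kubelet=small-pods
```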
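The Doc Text above describes the fix as skipping the append when the failure message is unchanged. A minimal sketch of that dedup logic, with simplified, hypothetical types rather than the actual openshift/machine-config-operator code:

```go
// Sketch of the "don't re-append an identical failure condition" fix.
// Condition is a stand-in for the controller's real condition type; it
// mirrors the fields shown in the status output in this bug.
package main

import (
	"fmt"
	"time"
)

type Condition struct {
	LastTransitionTime time.Time
	Message            string
	Status             string
	Type               string
}

// appendOrRefresh appends cond unless the most recent entry already has
// the same type and message, in which case only its timestamp is updated.
func appendOrRefresh(conditions []Condition, cond Condition) []Condition {
	if n := len(conditions); n > 0 {
		last := &conditions[n-1]
		if last.Type == cond.Type && last.Message == cond.Message {
			last.LastTransitionTime = cond.LastTransitionTime
			return conditions
		}
	}
	return append(conditions, cond)
}

func main() {
	var conds []Condition
	failure := Condition{
		Message: "Error: could not find any MachineConfigPool set for KubeletConfig",
		Status:  "False",
		Type:    "Failure",
	}
	// Simulate the one-minute resync reporting the same failure three
	// times: a single condition entry remains, timestamp refreshed.
	for i := 0; i < 3; i++ {
		failure.LastTransitionTime = time.Now()
		conds = appendOrRefresh(conds, failure)
	}
	fmt.Println(len(conds)) // 1
}
```

This matches the verified behavior below: the timestamp keeps advancing every minute, but the condition list no longer grows.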
Verified on version 4.7.0-0.nightly-2021-01-31-031653: the failure condition's timestamp updates every minute, and only one condition entry shows in the kubeletconfig description.

```yaml
status:
  conditions:
  - lastTransitionTime: "2021-02-01T09:52:00Z"
    message: 'Error: could not find any MachineConfigPool set for KubeletConfig'
    status: "False"
    type: Failure
...
status:
  conditions:
  - lastTransitionTime: "2021-02-01T09:53:10Z"
    message: 'Error: could not find any MachineConfigPool set for KubeletConfig'
    status: "False"
    type: Failure
```

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633