Bug 2088726
| Summary: | oc is not reporting pods with status NodeLost for DaemonSet pods when a node is marked NotReady | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Christian Passarelli <cpassare> |
| Component: | kube-controller-manager | Assignee: | Filip Krepinsky <fkrepins> |
| Status: | CLOSED WONTFIX | QA Contact: | zhou ying <yinzhou> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.10 | CC: | fkrepins, maszulik, mfojtik |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-09-22 18:42:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
|
Description
Christian Passarelli
2022-05-20 10:33:04 UTC
Regarding how long it takes for those pods to report something other than Running: the problem is that, by default, pods won't be evicted from an unreachable node for at least 5 minutes (see the description below the table in https://kubernetes.io/docs/concepts/architecture/nodes/#condition).

*** Bug 2088727 has been marked as a duplicate of this bug. ***

From my tests and the customer's tests, these pods never change status from Running. This is expected, because the pod's phase remains Running. But I noticed that in 3.11 oc reports a NodeLost status. Filip, can you check what we report and how accurate that is?

This works fine for normal pods, as their Ready condition is updated after the node becomes unreachable and they are evicted after pod-eviction-timeout (as mentioned in the documentation above). This results in oc/kubectl changing their STATUS to Terminating.
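For reference, the roughly 5-minute window comes from the tolerations that the DefaultTolerationSeconds admission plugin adds to pods that don't specify their own. A sketch of the effective pod spec, using the upstream default of 300 seconds:

```yaml
# Added automatically to pods without explicit not-ready/unreachable tolerations;
# once tolerationSeconds expires, taint-based eviction removes the pod.
tolerations:
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 300
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 300
```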
DaemonSet pods are special: they are not evicted and can keep "running" on the node even after it becomes unreachable. This is achieved by the following tolerations and eviction logic:
```yaml
tolerations:
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
```
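To confirm this on a live cluster, the tolerations of any DaemonSet pod can be inspected directly (pod and namespace names below are placeholders):

```shell
# Show the tolerations of a DaemonSet pod; note there is no tolerationSeconds,
# i.e. the toleration is infinite and taint-based eviction never fires.
kubectl get pod <daemonset-pod> -n <namespace> \
  -o jsonpath='{.spec.tolerations}'
```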
As also documented in the code:
```go
// DaemonSet pods shouldn't be deleted by NodeController in case of node problems.
// Add infinite toleration for taint unreachable:NoExecute here
// to survive taint-based eviction enforced by NodeController
// when node turns unreachable.
```
Workaround as suggested by k8s docs:
In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from Kubernetes causes all the Pod objects running on the node to be deleted from the API server and frees up their names.
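The manual workaround can be sketched as follows (destructive; only for nodes known to be permanently gone, with a placeholder node name):

```shell
# Confirm the node really is unreachable, then remove its API object;
# this deletes the pods bound to it from the API server and frees their names.
kubectl get node <lost-node>
kubectl delete node <lost-node>
```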
kubectl also reports Running status for these DaemonSet pods, which is just the pod's status.phase, so I am inclined to keep the same behaviour in oc as well.
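For context, the STATUS column is not a dedicated API field: the printer falls back to status.phase unless status.reason is set (in 3.11 the node controller set a NodeLost reason; that behaviour was later removed). A minimal sketch of that precedence, with illustrative type and function names rather than the actual kubectl code:

```go
package main

import "fmt"

// podStatus holds the two Pod API fields relevant to the printed STATUS column.
type podStatus struct {
	Phase  string // e.g. "Running"
	Reason string // e.g. "NodeLost" in older releases
}

// podDisplayStatus mimics, in simplified form, how kubectl/oc derive the
// STATUS column: status.reason, when set, takes precedence over status.phase.
func podDisplayStatus(s podStatus) string {
	if s.Reason != "" {
		return s.Reason
	}
	return s.Phase
}

func main() {
	fmt.Println(podDisplayStatus(podStatus{Phase: "Running"}))                     // Running
	fmt.Println(podDisplayStatus(podStatus{Phase: "Running", Reason: "NodeLost"})) // NodeLost
}
```

Without the reason being set anywhere, the column can only ever show Running for these pods, which matches what the customer observed.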
I am not sure whether this (kubectl reporting NodeLost) is worth pursuing upstream; especially with server-side printing it might be difficult to get in, and we might be pushed to make changes in the API instead.
This new feature would only be a secondary indicator, as a node going down (NotReady) should be caught by alerting in the first place. Another option is to observe the node and/or the pod's Ready condition by other means.
IMO it is not a good idea to start customizing the pod columns in kubectl.
@maszulik thoughts on this? ^
I agree with Filip's statement above with respect to keeping kubectl and oc consistent in that matter. To better understand the problem at hand, it would help to describe what kind of DaemonSet this is and what the customer's expectations are. > I would suggest to have a status like "disconnected" or "unknown", as on a node with a malfunction or a bad firewall rule the pod can still work ...
nitpick: with a faulty network we cannot be sure the pod is working correctly even if it is running
I am still inclined not to pursue this and to close it as WONTFIX, since it has a minor impact compared to the node being down.
(In reply to Filip Krepinsky from comment #9)
> > I would suggest to have a status like "disconnected" or "unknown", as on a node with a malfunction or a bad firewall rule the pod can still work ...
>
> nitpick: with a faulty network we cannot be sure the pod is working correctly even if it is running
>
> I am still inclined not to pursue this and to close it as WONTFIX, since it has a minor impact compared to the node being down.

Agreed; explain the workarounds and the impact and feel free to close. The main problem of the node being down/unreachable should be caught by alerting and resolved manually, as suggested in https://bugzilla.redhat.com/show_bug.cgi?id=2088726#c5. The wrong/unknown status of DaemonSet pods is only secondary to this and has a minor impact.