Bug 1790989 - Cluster managed daemonsets and deployments reporting not all pods are ready when all pods appear to be running
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-01-14 16:18 UTC by Luke Stanton
Modified: 2024-06-13 22:22 UTC
CC: 26 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-08 17:20:03 UTC
Target Upstream Version:
Embargoed:
acomabon: needinfo?




Links:
Red Hat Knowledge Base (Solution) 4726211 - last updated 2020-01-22 22:03:25 UTC

Description Luke Stanton 2020-01-14 16:18:07 UTC
Description of problem:

When checking the status of cluster-managed daemonsets and deployments, some report that not all pods are available, even though all pods appear to be running without issue. Some of these "out of sync" daemonsets/deployments appear to cause their associated operators to go into a degraded state.

This issue came up without any changes or known activity in the cluster.
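One way to surface the mismatch from the CLI (a rough sketch; it assumes cluster access with oc and, for the filtered listing, jq on the PATH):

```
# Compare the DESIRED / READY / AVAILABLE columns across all daemonsets and deployments
oc get daemonsets --all-namespaces
oc get deployments --all-namespaces

# Print only the daemonsets whose ready count lags the desired count (requires jq)
oc get daemonsets --all-namespaces -o json \
  | jq -r '.items[]
      | select(.status.numberReady != .status.desiredNumberScheduled)
      | "\(.metadata.namespace)/\(.metadata.name): ready \(.status.numberReady)/\(.status.desiredNumberScheduled)"'
```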


How reproducible:

Uncertain


Actual results: 

Some cluster operators report as degraded due to the out-of-sync deployments.
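The degraded operators and the conditions behind them can be checked with the following (the operator name below is only an example):

```
# List cluster operators and their Available / Progressing / Degraded columns
oc get clusteroperators

# Show the conditions and messages behind a degraded operator (name is an example)
oc describe clusteroperator network
```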


Expected results:

Daemonset and deployment status would accurately reflect that all pods are running.

Comment 9 Ryan Phillips 2020-02-04 19:35:28 UTC
This looks like upstream issue [1], which is still active.


1. https://github.com/kubernetes/kubernetes/issues/53023

Comment 10 Ryan Phillips 2020-02-04 19:36:46 UTC
Rolling the daemonset seems to mitigate the issue for now.

```
oc rollout restart ds/<daemonset-name>
```
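If several daemonsets are out of sync, the same mitigation can be scripted; a rough sketch, assuming cluster-admin access, oc 4.x (for rollout restart), and jq installed:

```
#!/bin/bash
# Restart the rollout of every daemonset whose ready count does not match its desired count.
oc get daemonsets --all-namespaces -o json \
  | jq -r '.items[]
      | select(.status.numberReady != .status.desiredNumberScheduled)
      | "\(.metadata.namespace) \(.metadata.name)"' \
  | while read -r ns name; do
      echo "Restarting daemonset ${ns}/${name}"
      oc -n "${ns}" rollout restart "daemonset/${name}"
    done
```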

Comment 13 W. Trevor King 2020-02-24 21:33:08 UTC
Bug 1804717 might help with this.  Or it will at least maximize the benefit of a fix to Kube's Deployment controller.

Comment 28 Dan Winship 2020-05-14 21:42:24 UTC
Bug 1804717 works around the problem for a single DaemonSet, but the problem still exists for every other DaemonSet. If we are not going to fix it in the kubelet, then we need to get rid of every DaemonSet in OCP...

Comment 39 Ryan Phillips 2020-07-08 17:20:03 UTC
There are patches in later releases that fix this issue reported against 4.2. If this issue is found again in a later release, please open a new bug.

