Bug 1790989 - Cluster managed daemonsets and deployments reporting not all pods are ready when all pods appear to be running
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.6.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-01-14 16:18 UTC by Luke Stanton
Modified: 2024-06-13 22:22 UTC
CC: 26 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-08 17:20:03 UTC
Target Upstream Version:
Embargoed:
acomabon: needinfo?




Links:
Red Hat Knowledge Base (Solution) 4726211 - last updated 2020-01-22 22:03:25 UTC

Description Luke Stanton 2020-01-14 16:18:07 UTC
Description of problem:

When checking the status of cluster-managed daemonsets and deployments, some report that not all pods are available, even though all pods appear to be running without issue. Some of these "out of sync" daemonsets/deployments appear to cause their associated operators to go into a degraded state.

This issue came up without any changes or known activity in the cluster.
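One way to surface the mismatch from the CLI (a rough sketch; it assumes cluster access with oc and, for the filtered listing, jq on the PATH):

```
# Compare the DESIRED / READY / AVAILABLE columns across all daemonsets and deployments
oc get daemonsets --all-namespaces
oc get deployments --all-namespaces

# Print only the daemonsets whose ready count lags the desired count (requires jq)
oc get daemonsets --all-namespaces -o json \
  | jq -r '.items[]
      | select(.status.numberReady != .status.desiredNumberScheduled)
      | "\(.metadata.namespace)/\(.metadata.name): ready \(.status.numberReady)/\(.status.desiredNumberScheduled)"'
```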


How reproducible:

Uncertain


Actual results: 

Some cluster operators report as degraded due to the out-of-sync deployments.
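The degraded operators and the conditions behind them can be checked with the following (the operator name below is only an example):

```
# List cluster operators and their Available / Progressing / Degraded columns
oc get clusteroperators

# Show the conditions and messages behind a degraded operator (name is an example)
oc describe clusteroperator network
```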


Expected results:

Daemonset and deployment status would accurately reflect that all pods are running.

Comment 9 Ryan Phillips 2020-02-04 19:35:28 UTC
This looks like upstream issue [1], which is still active.


1. https://github.com/kubernetes/kubernetes/issues/53023

Comment 10 Ryan Phillips 2020-02-04 19:36:46 UTC
Rolling the daemonset seems to mitigate the issue for now.

```
oc rollout restart ds/<daemonset-name>
```
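If several daemonsets are out of sync, the same mitigation can be scripted; a rough sketch, assuming cluster-admin access, oc 4.x (for rollout restart), and jq installed:

```
#!/bin/bash
# Restart the rollout of every daemonset whose ready count does not match its desired count.
oc get daemonsets --all-namespaces -o json \
  | jq -r '.items[]
      | select(.status.numberReady != .status.desiredNumberScheduled)
      | "\(.metadata.namespace) \(.metadata.name)"' \
  | while read -r ns name; do
      echo "Restarting daemonset ${ns}/${name}"
      oc -n "${ns}" rollout restart "daemonset/${name}"
    done
```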

Comment 13 W. Trevor King 2020-02-24 21:33:08 UTC
Bug 1804717 might help with this.  Or it will at least maximize the benefit of a fix to Kube's Deployment controller.

Comment 28 Dan Winship 2020-05-14 21:42:24 UTC
Bug 1804717 works around the problem for a single DaemonSet, but the problem still exists for every other DaemonSet. If we are not going to fix it in the kubelet, then we need to get rid of every DaemonSet in OCP...

Comment 39 Ryan Phillips 2020-07-08 17:20:03 UTC
There are patches in later releases that fix this issue reported against 4.2. If this issue is found again in a later release, please open a new bug.

