1926731 – machine-config-daemon pod restarted takes (number of nodes)*10min during upgrading from 4.7-> 4.7



Keywords:
Status:	CLOSED DUPLICATE of bug 1927041
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Machine Config Operator
Sub Component:
Version:	4.7
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Antonio Murdaca
QA Contact:	Michael Nguyen
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-02-09 11:05 UTC by jima
Modified:	2021-02-10 20:15 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-02-10 20:15:49 UTC
Target Upstream Version:
Embargoed:


Description jima 2021-02-09 11:05:04 UTC

Description of problem:
We fresh-installed a cluster with 4.7.0-0.nightly-2021-02-05-221250 and then upgraded it to 4.7.0-0.nightly-2021-02-06-084550. Updating the machine-config operator took about 1 hour. On deeper investigation we found that the machine-config-daemon pods are restarted before the mc is updated, and that the daemon pods restart one after another, each taking about 10min. That is the full terminationGracePeriodSeconds (=600) defined in the daemonset.

# for i in $(oc get po -o name|grep daemon); do oc logs $i -c machine-config-daemon|head -n1; done
I0208 04:02:11.334575  289968 start.go:108] Version: v4.7.0-202102060108.p0-dirty (0023e696058bbdf6e14504117bfc31f208125c47)
I0208 03:41:47.107426  125554 start.go:108] Version: v4.7.0-202102060108.p0-dirty (0023e696058bbdf6e14504117bfc31f208125c47)
I0208 03:52:01.885962  144491 start.go:108] Version: v4.7.0-202102060108.p0-dirty (0023e696058bbdf6e14504117bfc31f208125c47)
I0208 03:31:35.627618  112590 start.go:108] Version: v4.7.0-202102060108.p0-dirty (0023e696058bbdf6e14504117bfc31f208125c47)
I0208 04:12:21.641158  148829 start.go:108] Version: v4.7.0-202102060108.p0-dirty (0023e696058bbdf6e14504117bfc31f208125c47)

Since we have 3 masters + 2 workers, the total time spent restarting the mcd pods is 5*10min.
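The roughly 10min spacing can be checked directly from the start timestamps in the log lines above. The following is only a minimal sketch: `mcd_start_gaps` is a hypothetical helper name (not part of `oc` or the MCO), and it assumes all lines come from the same day, so it only looks at the HH:MM:SS field of the klog prefix.

```shell
#!/bin/sh
# Hypothetical helper: reads klog-style lines on stdin and prints the gap,
# in whole seconds, between consecutive start times.
# Assumption: every line is from the same day, so only field 2
# (HH:MM:SS.micros) is used and the MMDD part is ignored.
mcd_start_gaps() {
  # Convert HH:MM:SS.micros to seconds since midnight.
  awk '{ split($2, t, ":"); printf "%.3f\n", t[1] * 3600 + t[2] * 60 + t[3] }' |
    # Order the start times chronologically.
    LC_ALL=C sort -n |
    # Print the difference between each start time and the previous one.
    awk 'NR > 1 { printf "%.0f\n", $1 - prev } { prev = $1 }'
}

# Prefixes of the five log lines from this report (message text omitted).
mcd_start_gaps <<'EOF'
I0208 04:02:11.334575  289968 start.go:108]
I0208 03:41:47.107426  125554 start.go:108]
I0208 03:52:01.885962  144491 start.go:108]
I0208 03:31:35.627618  112590 start.go:108]
I0208 04:12:21.641158  148829 start.go:108]
EOF
```

For the timestamps in this report the gaps come out at 611, 615, 609 and 610 seconds, i.e. each pod starts about 600 seconds (the grace period) plus a few seconds after the previous one.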

Although the upgrade eventually succeeds, it takes more than 100min.

When we then continued to upgrade this cluster, the issue was no longer reproduced: the mcd pods restarted quickly, in less than 2min.

We also tried updating the mcd daemonset on a freshly installed cluster with a 4.7 nightly build, so that the mcd pods were restarted, and then upgraded that cluster to another 4.7 nightly build. We did not hit the issue.


Version-Release number of selected component (if applicable):

How reproducible:
Always, when a cluster is fresh-installed with a 4.7 nightly build and then upgraded to another 4.7 nightly build

Steps to Reproduce:
1. Fresh install a UPI cluster with 4.7.0-0.nightly-2021-02-05-221250
2. Upgrade to 4.7.0-0.nightly-2021-02-06-084550
3. Hit the issue: each mcd pod takes 10min to restart during the upgrade

Actual results:
Each mcd pod takes 10min to restart during the upgrade

Expected results:
mcd pods should restart quickly

Additional info:
The issue is only reproduced on a freshly installed 4.7 cluster that is then upgraded to another 4.7 nightly build.

Comment 2 Kirsten Garrison 2021-02-10 20:15:49 UTC


*** This bug has been marked as a duplicate of bug 1927041 ***
